Difference between revisions of "Retrieving Content From the Web"

(Commentary about being obsolete)
(WebinarArchive to subdomain)
 
Line 9: Line 9:
 
The example makes use of an external program, written in C++/CLR using Microsoft Visual Studio 2005.  The example thus demonstrates the integration of an external program with Analytica.  The source code and steps for creating this program, [[media:ReadURL.exe | ReadURL.exe]], are given below.  If you don't have Microsoft Visual Studio 2005, or don't wish to compile [[media:ReadURL.exe | ReadURL.exe]] yourself, you can download it from this link: [[media:ReadURL.exe | ReadURL.exe]]
 
The example makes use of an external program, written in C++/CLR using Microsoft Visual Studio 2005.  The example thus demonstrates the integration of an external program with Analytica.  The source code and steps for creating this program, [[media:ReadURL.exe | ReadURL.exe]], are given below.  If you don't have Microsoft Visual Studio 2005, or don't wish to compile [[media:ReadURL.exe | ReadURL.exe]] yourself, you can download it from this link: [[media:ReadURL.exe | ReadURL.exe]]
  
A webinar based on the content of this article can be viewed at: [http://analyticaonline.com/WebinarArchive/2007-10-18-Calling-External-Applications.wmv Calling-External-Applications.wmv] (requires Windows Media Player)
+
A webinar based on the content of this article can be viewed at: [http://WebinarArchive.analytica.com/2007-10-18-Calling-External-Applications.wmv Calling-External-Applications.wmv] (requires Windows Media Player)
  
 
== Minimal Requirements ==
 
== Minimal Requirements ==

Latest revision as of 21:54, 6 November 2019


This article was created in 2007, and is now obsolete. The entire functionality developed in this article can today be carried out just by calling built-in functions ReadURL and ParseCSV. Nevertheless, if you are developig your own custom data or application integration, the development carried out in the article may still be relevant for you.

This page details a simple example of integrating external data into Analytica. In this example, historical stock data for a given stock is read from a web site and imported into an Analytica model.

The example makes use of an external program, written in C++/CLR using Microsoft Visual Studio 2005. The example thus demonstrates the integration of an external program with Analytica. The source code and steps for creating this program, ReadURL.exe, are given below. If you don't have Microsoft Visual Studio 2005, or don't wish to compile ReadURL.exe yourself, you can download it from this link: ReadURL.exe

A webinar based on the content of this article can be viewed at: Calling-External-Applications.wmv (requires Windows Media Player)

Minimal Requirements

  • Analytica Enterprise 4.0
  • .NET 2.0 framework
  • Internet access (to Yahoo finance)
  • Optional: Microsoft Visual Studio 2005

Getting Started

First, you need to obtain ReadURL.exe. You can either download it from the link, or you can compile it yourself using the source code provided in the section at the end of this article. You should take note of the full path of your executable. In my case, after compiling it, the executable was at:

C:\Src\ReadURL\Debug\ReadURL.exe

Understanding Historical Stock Data on Yahoo Finance

The Yahoo Finance web site provides historical stock data for free, downloadable for each stock as a CSV file. Before actually downloading the data, take a few minutes to understand how you can access the data as a CSV file.

In a web browser, go to: http://finance.yahoo.com. In the Stock Symbol box enter: MSFT (the symbol for Microsoft) and press the Get Quotes button. On the page that results, enter a start date of Jan 3, 2007 and an end date of Aug 8, 2007 and press Get Prices.

Near the bottom of the page that results is a link that says "Download to spreadsheet". By hovering over the link, you can see the URL that loads the CSV file for MSFT between Jan 3, 2007 and Aug 8, 2007. It is:

http://ichart.finance.yahoo.com/table.csv?s=MSFT&a=0&b=1&c=2007&d=7&e=8&f=2007&g=d&ignore=.csv

Take a few moments to study the URL. The format of the URL should become clear fairly quickly. The «s» parameter is the stock symbol, the «a» parameter is one less than the day number in the start date, «b» is the month number of the start date, etc. We need to know this format because we want our Analytica model to construct this URL for a given stock symbol, start date, and end date.

Constructing the URL

Start Analytica. Create two variable nodes, Start_date and End_date, and set the number format for both to Date.

Set the definition of Start_date to: 3 Jan 2007

Set the definition of End_date to: Today()

Create a variable named Stock sym and define it to be MSFT

Now let's create a user-defined function (UDF) that constructs the necessary URL from the symbol, start date and end date. Drag a function object to your diagram and define it as shown:

Historical stock url.jpg

We can test this by creating another variable, named URL, with the definition:

Historical_Stock_URL(stock_sym, start_date, end_date)

Evaluating this variable returns the URL we saw earlier.

Reading the Data

Next, we'll create a UDF that reads the content from that URL as a long text string. Drag another function node to your diagram and define it as follows:

ReadFromUrl.jpg

You may need to modify the path in the first parameter to point to your copy of ReadURL.exe.

The file we'll read will be a CSV file, so to parse it we'll want to make use of the Flat File Library. Select FileAdd Library... and select Flat File Library.ana. You can either link or embed.

Now, create a variable node titled Textual Price Data defined as:

ParseCsvText(ReadFromUrl(Url))

Evaluate this node and you should the historical price data imported from Yahoo Finance.

Textual Price Data.jpg

Coercing the Data

The data just imported is currently all text, which means we cannot treat the cell contents numerically with arithmetic operations. Also, it might be preferable to use Date as an index (although this depends on your own needs). Here we will use Date as an index, but we'll name it Trading_Date.

If you plan to do much analysis using this data, it is more convenient to break each column out as a separate variable. This allows you to use an identifier such as Open_price in expressions rather than a Subscript operator like Data[Field = 'Open']. We'll demonstrate both methods here.

We'll use a variable node for our Trading_date index, rather than an index node. The reason for this is that Analytica evaluates Index nodes pro-actively (e.g., immediately when they are defined, or when the model is loaded), but evaluates variables nodes on-demand. A variable node can serve as an index just fine, with this former behavior being the key functional distinction.

Create a variable node, titled Trading Date, defined as:

CopyIndex(ParseDate(Textual_Price_Data[.Field = 'Date']))

Set its number format to Date.

Create an variable titled Fields, defined as:

subset( Textual_price_data.Field <> "Date")

this will serve as our field index in the full numeric table.

Now create a variable titled Stock Data defined as

ParseNum(Textual_price_data[@.Row = @Trading_date,.Field = Fields])

Here we are re-indexing by the two indexes defined previously, and converting the data to numeric. The result here is the stock data as numbers in a table form.

It may be more convenient to break out each column as a separate variable. For example, you can create a variable Open_Price defined as

ParseNum(Textual_Price_Data[.Field = 'Open'])

and repeat for High_Price, Low_Price, Close_Price, Volume, and Adj_Close.

Creating ReadURL.exe

In the above example, you have the option of using the ReadURL.exe provided, or you can compile it yourself if you have Visual Studio 2005. Going through the process of creating and compiling this simple application is a highly worthwhile exercise if you think you may need to create your own custom integrations or custom external computations.

To read data from the web, a simple C++/CLR application was created in Microsoft Visual Studio 2005. Select File → New → Project/Solution → C++ → CLR/.NET Console Application. We named our project ReadURL and placed it in C:\Src.

In the file ReadURL.cpp, we placed the following code:

// ReadURL.cpp : main project file.

#include "stdafx.h"

using namespace System;
using namespace System::Net;
using namespace System::IO;

int main(array<String ^> ^args)
{
	if (args->Length != 1) {
		Console::Error->WriteLine(L"Usage: ReadURL <url>");
		return -1;
	}

	String^ url = args[0];
	WebRequest^ request = WebRequest::Create(url);
	WebResponse^ response = nullptr;
	StreamReader^ reader = nullptr;
	try {
		response = request->GetResponse();
		Stream^ stream = response->GetResponseStream();
		reader = gcnew StreamReader(stream,Text::Encoding::UTF8);
		Console::Write(reader->ReadToEnd());

	} catch (WebException^ x) {
		Console::Error->WriteLine(x->Message);
		return -2;
	}

	if (response!=nullptr) response->Close();
	if (reader!=nullptr)   reader->Close();

	return 0;
}

This program accepts one command line parameter, a URL, and then reads from that URL and echos the text to StdOut.

See Also

Comments


You are not allowed to post comments.