I spend a lot of time trawling the internet for data, particularly economic and financial data. Yahoo Finance and Google Finance are handy for market data, and “FRED”, from the St. Louis Fed, is an excellent, albeit US-centric, resource for a broad range of financial aggregates. While these sites make it very easy to automate data downloads, most other sites (including, unfortunately, the Australian Bureau of Statistics) provide data in Excel format or other inconvenient forms. At times this has become sufficiently frustrating that I have periodically entertained vague plans to build my own time-series data website that would source data from across the world and the web, making it available in a consistent, useful way.
Needless to say, I never got around to it, but it seems that someone else has. Today I stumbled across Quandl, which aggregates and re-publishes over 5 million time-series. The data can be presented as charts on their website, downloaded or accessed programmatically through their application programming interface (API). There is even an R package available to make it easy to load data directly into my favourite statistical package, R.
Here is an example of how it all works. Quandl has data on the Australian All Ordinaries index. To read this data into R, you will first need to register with Quandl and obtain an authentication key for the API. This key is a random string, which looks something like this: jEGfHz9HF7C3zTus6ZuK (this one is not a real key!). Once you have your key, you can fire up R, then install and load the R package by entering the following commands:
install.packages("Quandl")
library(Quandl)
Once this is done, you will need to find the Quandl code for the data you are interested in. Near the bottom of the Quandl page, there is a pane showing the data-set information, including the provenance of the data.
Armed with the text labelled “Quandl Code”, in this case “YAHOO/INDEX_AORD”, you now have everything you need. I will assume you already have the ggplot2 and scales packages installed. To plot the history of the All Ordinaries, simply enter the following code (replacing the string in the third line with your own authentication key).
library(ggplot2)
library(scales)
Quandl.auth("jEGfHz9HF7C3zTus6ZuK")
aord <- Quandl("YAHOO/INDEX_AORD")
ggplot(aord, aes(x=Date, y=Close)) + geom_line() + labs(x="")
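If you only want to look at part of the history, one option is to filter the data frame before plotting. The following is a minimal sketch rather than anything provided by Quandl: plot_close is a hypothetical helper name of my own, and it assumes only that the data frame has Date and Close columns, as the plotting code above does for aord.

```r
library(ggplot2)

# Hypothetical helper (not part of the Quandl package): plots the Close
# column of a data frame that has Date and Close columns, optionally
# keeping only the rows on or after the date given in `from`.
plot_close <- function(prices, from = NULL) {
  if (!is.null(from)) {
    prices <- prices[prices$Date >= as.Date(from), ]
  }
  ggplot(prices, aes(x = Date, y = Close)) + geom_line() + labs(x = "")
}

# Usage, assuming aord has been loaded from Quandl as above:
# plot_close(aord, from = "2000-01-01")
```

Filtering locally like this still downloads the whole series first, which is fine for a single index.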
I can see I am going to have fun with Quandl. It even has Bitcoin price history. But that is a subject for another post.