Friday, August 26, 2011

25+ more ways to bring data into R

The rdatamarket post on the Revolutions blog and this post on Decision Stats reminded me of my list of data APIs/feeds available as R packages, which I keep on Cross Validated (a great site that you all should use). Many of these packages are from Omegahat, which is an awesome site.

This is the most comprehensive list I'm aware of, so please alert me to any omissions:

Because it's Friday: Spurious correlation edition

If Flight of the Conchords taught me anything, it's that you can't trust Australians. This morning I was poking around the DataMarket site when I noticed something suspicious about Australian sheep production:

Tuesday, August 23, 2011

Graphically analyzing variable interactions in R

I studied ecology as an undergraduate, which meant I spent a lot of time gathering and analyzing field data. Two of the basic tools we used to look for relationships in a large set of variables were correlation matrices and scatterplot matrices. Each of these requires a single line of code in R:
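The post's own snippet isn't shown in this excerpt, so here is a minimal sketch of what those two one-liners might look like. It assumes the built-in `mtcars` dataset as a stand-in for field data; the choice of columns is arbitrary and not from the original post.

```r
# Stand-in data: four numeric columns from the built-in mtcars dataset
# (an assumption; the original post's data is not shown here).
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Correlation matrix: pairwise Pearson correlations between all columns.
cor_mat <- cor(vars)
print(round(cor_mat, 2))

# Scatterplot matrix: one panel for every pair of variables.
pairs(vars)
```

Both `cor` and `pairs` ship with base R, so nothing needs to be installed to try this.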




Monday, August 22, 2011

Recession forecasting II: Assessing Hussman's Accuracy

In my last post on recessions, I implemented John Hussman's Recession Warning Composite in R. In this post I will examine how well this index performs and discuss how we might improve it. If you would like to follow along at home, be sure to run the code from the last post before running anything from this post.

Wednesday, August 10, 2011

Using the google prediction API from R

Google has a "black box" prediction API that they provide for creating recommender systems or filtering spam. They also provide an R package for interfacing with that API, but try as I might I cannot get it to work under Windows. Here are the instructions for setting up the API to run in R under Linux. I haven't tried this out yet, so let me know in the comments if it works, or if you can get it to run on Windows.

Scraping web data in R

In my last post, I went through a lot of effort to scrape the PMI index off the ISM website. It turns out that was unnecessary, as commenter "senne" pointed out that this index is available from FRED under the symbol NAPM. I've updated my code, which now pulls all the data straight from FRED.

However, it was surprisingly easy to scrape web data into R using the readHTMLTable function in the XML package. I thought I'd keep the code I used on my blog, as it's a good example of how easily you can pull web data into R.
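The original scraping code isn't reproduced in this excerpt. As a rough sketch of the general approach, the example below parses a small inline HTML table rather than the live ISM page, so it runs offline. The table contents and the names `html`, `doc`, and `pmi` are made up for illustration; a real page would usually be passed to `readHTMLTable` as a URL or parsed document instead.

```r
library(XML)

# A tiny, made-up HTML page standing in for a real website
# (hypothetical values, not actual PMI data).
html <- "<html><body><table>
<tr><th>month</th><th>pmi</th></tr>
<tr><td>Jan</td><td>50.1</td></tr>
<tr><td>Feb</td><td>49.5</td></tr>
</table></body></html>"

# Parse the HTML text, then extract every <table> as a data frame.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Take the first table and convert the scraped text column to numbers.
pmi <- tables[[1]]
pmi$pmi <- as.numeric(pmi$pmi)
print(pmi)
```

Scraped cells come back as text, so converting columns with `as.numeric` (or passing `colClasses`) is usually the first cleanup step.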



Tuesday, August 9, 2011

Forecasting recessions

John Hussman has a Recession Warning Composite that I am attempting to replicate and improve. The underlying data seems easy enough to get from FRED using the quantmod package in R. I don't quite understand the index Hussman is using for commercial paper, so I used the '3-month AA financial commercial paper index' from FRED.
