Blogging a race with R

I’ve been blogging extensively about this year’s World Solar Challenge, and my main tool for doing so has been R. Having put together a database of teams and team data, I used a set of R scripts to generate web pages (like this one and that one) from the database. Using R made it easy to incorporate graphs and analyses of team data within those pages. For example:


Also useful were R scripts to extract additional structured data from the World Solar Challenge web site using XML parsing (with the RCurl and XML libraries), R scripts to scan Twitter feeds of race teams (interfacing to a Python script which did the actual downloading, because of weaknesses in the R interface to Twitter), and R scripts to generate various maps (primarily using the raster package). Examples of such maps include this temperature map of Australia in October:

A data-parsing R script was used to produce (and regularly update) this calendar:

Additional R scripts were used to generate a number of infographics, such as these:


During the race itself, serious data quality problems presented themselves. Official timing data contained multiple errors, while GPS tracking data suffered from time lags greater than the gaps between teams. This created a need for code to sanity-check the data, clean it, and extrapolate car positions. R was very useful for writing such tools on the fly. The map below shows raw GPS data for car positions (overlaid on a NASA raster image), and was produced using code written during the race:
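To give a flavour of the sanity-checking and extrapolation described above, here is a minimal sketch in base R. It is not the code used during the race: the data frame layout, the function names `clean_fixes` and `extrapolate_position`, the sample numbers, and the 130 km/h plausibility threshold are all assumptions made for illustration.

```r
# Hypothetical GPS fixes for one car: time of fix (seconds) and distance
# along the route (km). The fourth row is a deliberately bad reading.
fixes <- data.frame(
  time_s  = c(0, 600, 1200, 1800, 2400),
  dist_km = c(100, 113, 126, 400, 152)
)

# Sanity check: drop any fix that implies an implausible speed (or
# backwards travel) relative to the last accepted fix.
# The max_kmh threshold is an assumption, not an official figure.
clean_fixes <- function(fixes, max_kmh = 130) {
  keep <- logical(nrow(fixes))
  keep[1] <- TRUE
  last <- 1
  for (i in seq_len(nrow(fixes))[-1]) {
    hours <- (fixes$time_s[i] - fixes$time_s[last]) / 3600
    kmh <- (fixes$dist_km[i] - fixes$dist_km[last]) / hours
    if (hours > 0 && kmh >= 0 && kmh <= max_kmh) {
      keep[i] <- TRUE
      last <- i
    }
  }
  fixes[keep, ]
}

# Extrapolation: assume the car keeps the average speed of its most
# recent fixes. Needs at least two cleaned fixes with distinct times.
extrapolate_position <- function(fixes, at_time_s, n_recent = 3) {
  n <- nrow(fixes)
  recent <- fixes[seq(max(1, n - n_recent + 1), n), ]
  m <- nrow(recent)
  km_per_s <- (recent$dist_km[m] - recent$dist_km[1]) /
    (recent$time_s[m] - recent$time_s[1])
  recent$dist_km[m] + km_per_s * (at_time_s - recent$time_s[m])
}

cleaned <- clean_fixes(fixes)
estimate <- extrapolate_position(cleaned, at_time_s = 3000)
print(cleaned)
print(estimate)  # about 165 km for this sample data
```

A constant-speed assumption is the simplest possible choice. It ignores control stops, terrain, and weather, so estimates of this kind are only reasonable over short time lags.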

The chart below summarises official timing data, and was produced using code modified from that used to report on the 2013 race:

This chart of official Cruiser class results was also produced using code modified from that used in 2013:

New code was used to produce this chart of Cruiser class cars that partially trailered:

Many other charts and web pages were produced during race coverage. In each case, R provided useful facilities for acquiring, visualising, and organising data. Generating HTML from R scripts and a database also proved very successful. In hindsight, virtually all blog posts should have been generated this way.
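As a rough illustration of generating HTML from R, the sketch below turns a small data frame into an HTML table using only base R. The `teams` data frame stands in for the database, and its contents, the function names `html_escape` and `team_table_html`, and the output file are all hypothetical rather than taken from the scripts used for the race coverage.

```r
# Hypothetical team table standing in for the real database
teams <- data.frame(
  number = c(3, 8),
  name   = c("Example Team A", "Example Team B & Co"),
  class  = c("Challenger", "Cruiser"),
  stringsAsFactors = FALSE
)

# Escape the characters that are special in HTML text
html_escape <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  gsub(">", "&gt;", x, fixed = TRUE)
}

# Build one <tr> per data frame row, plus a header row of column names
team_table_html <- function(teams) {
  header <- paste0("<th>", html_escape(names(teams)), "</th>", collapse = "")
  cells <- lapply(teams, function(col) {
    paste0("<td>", html_escape(as.character(col)), "</td>")
  })
  rows <- do.call(paste0, unname(cells))  # paste the columns together row-wise
  c("<table>",
    paste0("<tr>", header, "</tr>"),
    paste0("<tr>", rows, "</tr>"),
    "</table>")
}

html <- team_table_html(teams)
out_file <- tempfile(fileext = ".html")
writeLines(html, out_file)  # write the fragment to a temporary file
cat(html, sep = "\n")
```

Because the page is rebuilt from the data each time, updating a team's details only means changing the data and rerunning the script.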

Finally, a few small touches of humour do not go astray. This image, for example, proved quite popular:


2 thoughts on “Blogging a race with R”

  1. I really loved all of your posts, and adding the graphic twist to the analysis made it all the more visible and fun to study. Thanks a lot once again, Wizard of Oz of R!
