My last post described how I’m using the Twitter API to receive tweets off the live stream.
Since then, I’ve used the API to filter for tweets containing the #runkeeper hashtag, and used those to scrape the user’s activity from the RunKeeper site (including the GPS points from the user’s exercise). I’ve stored that information in a MongoDB database, which has allowed me to do some simple visualisation:
https://www.youtube.com/watch?v=b8njarJC6qE
The above video (best played at 720p) shows activities plotted against time of day (the sun overhead in the video indicates midday for that region).
I haven’t charted this to confirm, but to my eyes it looks like the amount of exercise activity peaks around sunrise and sunset, with almost none at night (which isn’t really a surprise).
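One way to chart this would be to bucket the activities by approximate local solar hour. The following is only a sketch: the function names and the (UTC time, longitude) tuple shape are made up for illustration, the sample values are invented, and it ignores the equation of time, relying only on the sun moving 15 degrees of longitude per hour.

```fsharp
open System

/// Approximate local solar hour (0.0 up to 24.0) from a UTC timestamp and a
/// longitude in degrees east. Assumption: 15 degrees of longitude per hour.
let localSolarHour (utc: DateTime) (longitude: float) =
    let hours = utc.TimeOfDay.TotalHours + longitude / 15.0
    // Wrap into the range [0, 24), handling negative values too
    ((hours % 24.0) + 24.0) % 24.0

/// Count activities per whole local solar hour, sorted by hour.
/// Each activity is a hypothetical (UTC time, longitude) pair.
let histogramByHour (activities: (DateTime * float) list) =
    activities
    |> List.countBy (fun (utc, longitude) -> int (localSolarHour utc longitude))
    |> List.sortBy fst

// Invented sample data, purely to show the call
let sample =
    [ DateTime(2014, 1, 1, 12, 0, 0), 0.0      // midday on the Greenwich meridian
      DateTime(2014, 1, 1, 12, 0, 0), -75.0    // same instant, 75 degrees west
      DateTime(2014, 1, 1, 12, 30, 0), -75.0 ]

for (hour, count) in histogramByHour sample do
    printfn "%02d:00  %d" hour count
```

With real data, peaks near sunrise and sunset should show up as two humps in the counts.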
For the curious, this is what the colours of the dots indicate:
let createSphere (latitude:float) (longitude:float) (activityType:string) =
    let color =
        match activityType with
        | "RUN" -> "Red"
        | "WALK" -> "Green"
        | "BIKE" -> "Blue"
        | "MOUNTAINBIKE" -> "Orange"
        | "SKATE" -> "Yellow"
        | "ROWING" -> "Cyan"
        | "HIKE" -> "Brown"
        | "OTHER" -> "Black"
        | _ -> "Black"
    // `sphere` is a format string defined elsewhere (not shown here)
    String.Format(sphere, latitude, longitude, color)
On a local scale, the actual GPS traces are of interest:
Activities are present in the more-populated regions of the UK. The blue and orange traces indicating cycling and mountain biking activities are clearly visible.
NOTE: if you think the map looks weird, that’s because it doesn’t have the usual aspect ratio. I’m not using a Mercator projection to plot the points, but am simply plotting the longitude and latitude linearly.
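To illustrate the difference, here is a minimal sketch of the two mappings. The function names are made up for this example and are not from my plotting code; the Mercator formula is the standard spherical one, scaled back to degrees so both mappings agree at the equator.

```fsharp
open System

/// Linear mapping: longitude and latitude (in degrees) are used directly
/// as the x and y plot coordinates.
let plotLinear (latitude: float) (longitude: float) =
    (longitude, latitude)

/// Spherical Mercator mapping for comparison: x is still linear in
/// longitude, but y is stretched increasingly towards the poles.
let plotMercator (latitude: float) (longitude: float) =
    let phi = latitude * Math.PI / 180.0
    let y = Math.Log(Math.Tan(Math.PI / 4.0 + phi / 2.0))
    (longitude, y * 180.0 / Math.PI)

// London is at roughly 51.5 N, 0.13 W
let (_, yLinear) = plotLinear 51.5 (-0.13)
let (_, yMercator) = plotMercator 51.5 (-0.13)
printfn "linear y = %.1f, Mercator y = %.1f" yLinear yMercator
```

At the UK’s latitude the Mercator y value is noticeably larger than the linear one, which is why the linearly plotted map looks squashed vertically compared with the maps most people are used to.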
On an even smaller scale, landmarks around London are visible:
The GPS data contains altitude information, so there are more interesting visualisations that could be done, including generating contour plots. Also, the above only contains three days’ worth of data - with a larger data set it would be possible to determine whether activities peak at the weekend, etc.
Acknowledgements:
The 3D plot of the globe was drawn using POV-Ray. The 3D globe model was from GrabCAD, converted to a POV-Ray model using PoseRay.
The UK outline was obtained from the NOAA.
The code was implemented in F# (which was a pleasure to get back to after the C++ I’ve been doing recently). I did try the MongoDB.FSharp library to store my data records, but they failed to deserialise from the database. In any case, I wanted more control over the data types saved (I wanted to store the GPS data as GeoJson3DGeographicCoordinates, with the first point stored separately as a GeoJson2DGeographicCoordinates with an index on this value). I could have created .NET classes and used the BSON serializer, but it seemed more effort than writing directly to the DB (and this is about the only time I’ve seen the benefit of C# implicit conversions, but I can live without them in F#).
Why use the Twitter API rather than scrape the RunKeeper site directly? The reason is timestamps: the RunKeeper website displays the activity time, but in local time, and it’s not clear whether that’s the user’s timezone or the timezone of the activity. It seems cleaner to assume instead that the tweet was posted as soon as the activity finished, and to store that time as UTC (this assumption may of course not be true, but the results seem realistic).
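As a rough sketch of the timestamp handling, assuming the tweet carries a created_at field in the classic Twitter format (for example "Wed Aug 27 13:08:45 +0000 2008"), it can be converted to a UTC DateTime as follows. The function name and format string are illustrative assumptions rather than my actual code.

```fsharp
open System
open System.Globalization

/// Parse a tweet's created_at string and return the time as UTC.
/// Assumption: the classic "ddd MMM dd HH:mm:ss +ZZZZ yyyy" layout.
let parseTweetTime (createdAt: string) : DateTime =
    let parsed =
        DateTimeOffset.ParseExact(
            createdAt,
            "ddd MMM dd HH:mm:ss zzz yyyy",
            CultureInfo.InvariantCulture)
    // UtcDateTime applies the offset and yields a DateTime with Kind = Utc
    parsed.UtcDateTime

let finishedAt = parseTweetTime "Wed Aug 27 13:08:45 +0000 2008"
printfn "%s" (finishedAt.ToString("u"))
```

Going through DateTimeOffset rather than DateTime keeps the offset explicit, so the stored value doesn’t depend on the timezone of the machine doing the parsing.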
Show me the Code!
The code’s not in bad shape, but I would like to tidy it up a little before releasing it into the world. I’m busy with other things at the moment, but if I get much interest I can go ahead and do that…