The Big Data Exploratorium: Data Mining, from Patents to Memes

Accepted Session
Short Form
Scheduled: Wednesday, June 22, 2011 from 2:30 – 3:15pm in B302/03


Learn to use simple natural language processing and graph analysis tools in Python and R to explore the structure of the dataverse. From Reddit to the USPTO to Google Books, come try some data hacks!


Dive deep into the suffocating postmodern ocean of data and come out alive on this interactive tour of low-hanging-fruit data mining tools. Learn to make R graphs both pretty and pretty informative; crunch numbers quickly in high-level languages, like a boss; even get your Google on with some maps and reduces. It is widely known that data scientists have all the fun, so join us!

As part of the talk, we plan to release a data set of historical Reddit front pages, so you can plumb the very depths of nerd humor evolution.

Speaking experience