Firefox Crash Reporting: Using Big Data in Your Open Source Project

Accepted Session
Short Form
Scheduled: Tuesday, June 26, 2012 from 1:30 – 2:15pm in B204


Learn how Mozilla collects and analyzes three million crash reports a day with Python, PHP, PostgreSQL and HBase.


Every day, Mozilla collects three million Firefox crash reports from around the world. The data in these reports drives the bug-fixing priorities of Firefox engineers, and is critical to understanding the stability of our platform. In this case study, I’ll describe the challenges we’ve faced, the types of questions the system can be used to answer, and the architecture and infrastructure we use to process, store, and analyze approximately 110TB of crash reports using Python, PHP, Hadoop, PostgreSQL, and a few other things thrown in for good measure.

All the software we use in our stack is Open Source, including the Breakpad client embedded in the browser and the Socorro collection and reporting system. Other projects and companies are now using the Breakpad/Socorro combination.

Speaking experience

I gave this talk this year at the Open Browser miniconf. I have given more than 50 conference and user group talks, and taught in a Master's program for six years.