A day in the life of Facebook Operations

Accepted Session
Short Form
Scheduled: Thursday, June 3, 2010 from 3:45 – 4:30pm in Morrison

Excerpt

A look at the tools and practices used at Facebook to support the #2 site in the world.

Description

Facebook is now the #2 global website, responsible for billions of photos, conversations, and interactions between people all around the world. It runs on tens of thousands of servers spread across multiple geographically separated datacenters. When problems arise in the infrastructure behind the scenes, they directly affect people's ability to connect and share with those they care about around the world.

Facebook’s Technical Operations team has to balance this need for constant availability with a fast-moving and experimental engineering culture. We release code every day. Additionally, we are supporting exponential user growth while still managing an exceptionally high ratio of users per employee within engineering and operations.

This talk will go into how Facebook is “run” day-to-day, with particular focus on the actual tools in use (configuration management systems, monitoring, automation, etc.), how we detect anomalies and respond to them, and the processes we use internally for rapidly pushing out changes while still keeping a handle on site stability.

Speaking experience

Speaker