Democratization of infrastructure: Monitoring with nagios and graphite

*
Proposal
Short Form
Intermediate

Excerpt

Git is cool. Configuration is code. The simplicity of a monitoring check or metrics collector enables junior system administrators to learn in small, contained parts. Jr. admins can go from not knowing what monitoring is to having a check in production in a manner of hours.

Description

Nagios and Graphite are configured in small bits. A nagios check or a graphite metric collector is a very small and self-contained piece of a larger whole. New system administrators can write one of these in a few minutes, without any previous knowledge. This lowers the barrier to entry of systems administration.

Sysadmins are often unable to make any improvements to the system until they understand the entire system, but setting up a single check or collector requires a much smaller set of knowledge. By managing the configuration of these utilities in git and using the standard branch, commit, pull request pattern, we can expand the ‘configuration is code’ goal into monitoring and metrics collection.

The CAT has a large body of student volunteers who want to contribute to the infrastructure, but shouldn’t necessarily have root access to do that. We needed to create a system for them to test and add new checks and metrics, but still make it easy enough for a volunteer to accomplish meaningful contributions without having to dedicate their life to learning several new systems.

Speaking experience

Cascadia IT Conf, Beaver Bar Camp

Speakers