Cat-herd's Crook: Enforcing Standards in 10 Programming Languages

Accepted Session
Short Form
Intermediate
Scheduled: Tuesday, June 23, 2015 from 2:30 – 3:15pm in B201

Excerpt

At MongoDB we write open source database drivers in ten programming languages. Ideally, all behave the same. We also help developers in the MongoDB community replicate our libraries’ behavior in even more (and more exotic) languages. How can we herd these cats along the same track? For years we failed, but we’ve recently gained momentum on standardizing our libraries. Testable, machine-readable specs prove which code conforms and which does not.

Description

We are two engineers who help build an open-source database called MongoDB and its language-specific drivers. These drivers allow coders to use MongoDB in their programming languages of choice. Some drivers are maintained by MongoDB Inc., some are maintained by the community, and some have mixed histories.

MongoDB drivers are as idiosyncratic as the people who maintain them. Historically we had no specs, so driver authors wrote ever more divergent code, with different features (and different bugs) making it difficult for users to transition from one language to another. As a company, we couldn’t define what we offered in a language-neutral way. Our support team couldn’t learn how to support drivers because they were too inconsistent. Because there was no standardization and little communication, individual authors could not learn from each other’s mistakes or successes.

Clearly we had a problem, one that persisted for years. Why didn’t we fix it sooner? Reviewing implementations of a feature across all drivers was too difficult for any individual. Spec authors faced driver authors’ stubborn attachments to their specific designs. Drivers written outside MongoDB Inc. were hard to regulate, and we struggled to balance an open-source anarchic spirit with company goals.

We had several false starts in solving this problem. First, we wrote loose specs with no tests: authors ignored them and we couldn’t hold their implementations accountable. Next we tried writing reference implementations, but again, there was no way to prove driver adherence. Then we wrote tests in English, but they remained vague, unmaintainable, and unenforceable. Cucumber, a testing tool from the Ruby world, was our best attempt, but it was loudly rejected by authors who didn’t care for its aesthetics or couldn’t support it in their languages.

Last year we found something that works: we write specs in English and tests in YAML, a simple language that describes data. Driver authors write harnesses for each suite of YAML tests, translating them into actions to perform in their drivers. Some YAML tests are specific enough to act as unit tests; others are broad enough to function as integration tests. Each driver runs the same tests and proves it’s up to spec.
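To make the harness idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the test layout (`description`, `operations`, `outcome`), the `ToyCollection` class, and the function names are hypothetical and are not taken from the actual MongoDB specs or any real driver. The suite is written as the Python data a YAML parser would produce, so the sketch runs without a YAML library.

```python
# Hypothetical spec test, shown as the data a YAML parser would produce.
# The equivalent YAML might read:
#
#   tests:
#     - description: insert then count
#       operations:
#         - {name: insert, arguments: {document: {x: 1}}}
#         - {name: count, arguments: {}}
#       outcome: {result: 1}
SUITE = {
    "tests": [
        {
            "description": "insert then count",
            "operations": [
                {"name": "insert", "arguments": {"document": {"x": 1}}},
                {"name": "count", "arguments": {}},
            ],
            "outcome": {"result": 1},
        },
        {
            "description": "count on empty collection",
            "operations": [{"name": "count", "arguments": {}}],
            "outcome": {"result": 0},
        },
    ]
}


class ToyCollection:
    """Stand-in for a driver's collection class (not a real driver API)."""

    def __init__(self):
        self._docs = []

    def insert(self, document):
        self._docs.append(document)

    def count(self):
        return len(self._docs)


def run_test(test):
    """Translate one test's operations into driver calls; return pass/fail."""
    collection = ToyCollection()
    result = None
    for op in test["operations"]:
        # Map the operation name in the test data onto a driver method.
        method = getattr(collection, op["name"])
        result = method(**op["arguments"])
    # The outcome is compared against the result of the last operation.
    return result == test["outcome"]["result"]


def run_suite(suite):
    """Run every test; map each description to True (pass) or False (fail)."""
    return {t["description"]: run_test(t) for t in suite["tests"]}


if __name__ == "__main__":
    for description, passed in run_suite(SUITE).items():
        print("PASS" if passed else "FAIL", description)
```

In this sketch the test data stays language-neutral, and only `run_test` knows how to turn an operation name into a call. A harness in another language would replace that one function and reuse the same test files.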

This solution ends the eternal debate over spec compliance. It encourages future specs, because now they work! It overcomes barriers to communication between internal and external driver authors, and makes disputes over specs less personal.

We are beginning to standardize our drivers and to invite third-party drivers to do the same. With this system, we can validate third-party drivers and prove them to be as trustworthy as internally developed ones, an open source ideal.

Tags

mongodb, specifications, cross-language, community

Speaking experience

Samantha is a first-time conference speaker, and this is a new talk. She has given the following talks at smaller events:

- “We can write better YAML” talk at MongoDB Engineering Offsite, 2014
- “Distinguish between () and {} when creating objects” technical presentation at MongoDB C++ Reading Group, 2014
- MongoDB Training Session, with between 6 and 12 students over two days, 2013
- “Getting to Know MongoDB: an Introduction to the API” at HackPrinceton, 2013
- “Basic Marionette Manipulation,” a puppet workshop at Princeton University, 2013
- “Edda: a log visualizer for MongoDB” tech talk at MongoDB, 2012

Jesse is a seasoned conference speaker. He recently gave talks at PyCon, PyGotham, WindyCityDB, MongoDB Chicago, various MongoDB Meetups, and various Python Meetups.

Speakers

  • Profile

    Biography

    Staff Engineer at MongoDB in New York City. Author of Motor, an async MongoDB driver for Tornado, and of Toro, a library of locks and queues for Tornado coroutines. Contributor to Python, PyMongo, MongoDB, Tornado, and asyncio.

    Sessions

      • Title: Cat-herd's Crook: Enforcing Standards in 10 Programming Languages
      • Track: Cooking
      • Room: B201
      • Time: 2:30 – 3:15pm
      • Speakers: Samantha Ritter, A. Jesse Jiryu Davis
      • Title: How Do Python Coroutines Work?
      • Track: Chemistry
      • Room: B201
      • Time: 1:30 – 2:15pm
      • Excerpt:

        Asynchronous I/O frameworks like Node, Twisted, Tornado, and Python 3.4’s new “asyncio” can efficiently scale past tens of thousands of concurrent connections. But async coding with callbacks is painful and error-prone. Programmers increasingly use coroutines in place of callbacks to get the best of both worlds: efficiency plus a natural and robust coding style. I’ll explain how asyncio’s coroutines work. They are built using Python generators, the “yield from” statement, and the Future and Task classes. You will gain a deep understanding of this miraculous new programming idiom in the Python standard library.

      • Speakers: A. Jesse Jiryu Davis
  • Profile

    Samantha Ritter

    MongoDB Inc.

    Biography

    Samantha is a software engineer at MongoDB, where she works in assorted languages including C, C++, and Ruby. At work, she enjoys systems programming, learning the finer points of YAML, and the satisfaction that comes with a good refactoring of bad code. In life, she cooks a lot, eats a lot, and sings in a dream-pop band.

    Sessions