Serialist: lazy web-crawling in Haskell

Accepted Session
Short Form
Scheduled: Wednesday, June 2, 2010 from 3:45 – 4:30pm in Fremont

Excerpt

Serialist (http://serialist.net/) provides a way to find, track and read serialized content (e.g., web comics). It's implemented entirely in Haskell and demonstrates functional web application development, crawling, scraping and distributed architecture. Serialist uses interesting graph algorithms to add and step through content lazily.

Description

We’ll present Serialist, our site for keeping track of the webcomics and stories that we read.

We implemented Serialist entirely in Haskell. Serialist demonstrates functional web-application development, web crawling and scraping, distributed architecture in Haskell, and interesting graph algorithms.

Other sites exist for tracking webcomic updates, but they require manual intervention from a moderator or administrator, which often means writing new page-scraping code for each serial. Our graph algorithms let us accept user submissions of new serials to crawl and make them available immediately. Haskell let us express our graph analyses concisely and run them over a lazy link graph of the Internet.
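To give a feel for what a lazy link graph can look like in Haskell, here is a minimal sketch. It is not Serialist's actual code: the names `toyWeb`, `toyLinks`, `linkTree` and `reachable` are hypothetical, and a small in-memory map stands in for fetching and scraping real pages. The graph is represented by a function from a page to its outgoing links, so a page is only expanded when a traversal actually asks for it.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set
import Data.Tree (Tree (..), unfoldTree)

type Url = String

-- Assumption: a tiny in-memory "web" standing in for real HTTP fetches.
-- The comic pages link to each other in a cycle, plus an about page.
toyWeb :: Map.Map Url [Url]
toyWeb = Map.fromList
  [ ("/comic/1", ["/comic/2", "/about"])
  , ("/comic/2", ["/comic/3", "/comic/1", "/about"])
  , ("/comic/3", ["/comic/1", "/about"])
  , ("/about",   ["/comic/1"])
  ]

-- Hypothetical link source: the outgoing links of a page.
-- A real crawler would download and scrape the page here.
toyLinks :: Url -> [Url]
toyLinks url = Map.findWithDefault [] url toyWeb

-- The link graph unfolded from a start page as a (possibly infinite)
-- tree. Laziness means children exist only once they are inspected.
linkTree :: (a -> [a]) -> a -> Tree a
linkTree links = unfoldTree (\page -> (page, links page))

-- Distinct pages reachable from a start page, in breadth-first order.
-- The result list is produced lazily, so `take n` expands only as
-- many pages as are needed to find n distinct ones.
reachable :: Ord a => (a -> [a]) -> a -> [a]
reachable links start = go Set.empty [start]
  where
    go _ [] = []
    go seen (page : queue)
      | page `Set.member` seen = go seen queue
      | otherwise = page : go (Set.insert page seen) (queue ++ links page)
```

Loaded in GHCi, `reachable toyLinks "/comic/1"` lists each page of the toy site once despite the cycles, and `take 3 (reachable (\n -> [n + 1]) 0)` terminates even though that graph is infinite, because only the first few nodes are ever expanded.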

Speaking experience

Speakers