NOSQL Meetup San Francisco – Cassandra

Presentation by Avinash Lakshman of Facebook. The presenter was one of the designers of Amazon’s Dynamo, and brought a lot of those ideas to Cassandra, a Bigtable clone used for Facebook’s mail search.

*Highly available; eventual consistency only
*Replication knobs available (like Dynamo)
*For performance, they rely on application specific behavior — when a user positions their mouse in their inbox search box, their index gets pulled into memory before the search term is entered.
*Does not like pluggable storage models; says that app-specific optimizations are too valuable to generalize away. Specific example was a zero-copy streaming network copy enabled by a Linux call — data can be streamed from/to disks over the network.
*180 node/50 TB system
*Messaging includes failure simulations (like dropping random messages)
*Future directions: ACLs (very familiar for Yahoos), transactions, compression. communicative ops, pluggable (app-specific) inconsistency reconciliation
*Simple designs are better designs.
*Test by tee-ing network traffic

Cassandra home page

Notes taken by Toby Negrin of Yahoo

Slides of the presentation (PDF)