ApacheCon Europe 2012

Rhein-Neckar-Arena, Sinsheim, Germany

5–8 November 2012

Cassandra concepts, patterns and anti-patterns

Dave Gardner

Audience level: Beginner
Track: Big Data

Wednesday 9 a.m.–9:45 a.m. in Rhein-Neckar

Description

An introduction to the fundamental concepts behind Apache Cassandra. This talk explains the engineering principles that make Cassandra an attractive choice for building highly resilient and available systems, then explains how to use it, covering basic data modelling patterns and anti-patterns.

Abstract

This talk starts with some fundamental trade-offs made when choosing NoSQL. It covers what you give up by using Cassandra (ACID, ad-hoc querying) and what you get in return (proven horizontal scalability, availability, performance, operational simplicity and a rich data model).

To explain how these features are delivered, we go back to the fundamentals of the Amazon Dynamo paper and the Google BigTable paper, explaining concepts such as eventual consistency, tuneable consistency, hinted handoff, read repair, columns, Memtables, SSTables and conflict resolution.
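
As a rough illustration of two of these concepts, the sketch below models tuneable consistency and timestamp-based conflict resolution in plain Python. It is not taken from the talk and does not use any Cassandra client API; the names `replicas_overlap`, `quorum`, `Column` and `resolve` are hypothetical. It assumes the usual Dynamo-style rule that a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor, and it assumes last-write-wins resolution by timestamp, with tie handling simplified.

```python
from typing import NamedTuple


def quorum(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def replicas_overlap(n: int, r: int, w: int) -> bool:
    """True when any r read replicas must share a node with any w write replicas.

    n is the replication factor, r and w are the replica counts that a read
    and a write wait for. Overlap is guaranteed exactly when r + w > n.
    """
    return r + w > n


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int  # writer-supplied, e.g. microseconds since the epoch


def resolve(a: Column, b: Column) -> Column:
    """Last-write-wins: keep the version with the higher timestamp.

    Ties go to the first argument here, which is a simplification.
    """
    return a if a.timestamp >= b.timestamp else b


# With three replicas, quorum reads plus quorum writes overlap,
# while reading and writing a single replica each does not.
n = 3
strong = replicas_overlap(n, quorum(n), quorum(n))
weak = replicas_overlap(n, 1, 1)

# Two replicas return different versions of the same column.
stale = Column("email", "old@example.com", timestamp=100)
fresh = Column("email", "new@example.com", timestamp=200)
winner = resolve(stale, fresh)
```

Lowering `r` or `w` trades that overlap guarantee for lower latency and higher availability, which is the choice the term "tuneable" refers to.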

We then approach these principles from a more practical angle, covering the basics of data modelling in Cassandra. We use practical examples to explain commonly used patterns for storing data and some of the Cassandra features that can help, such as counters and expiring columns. Finally, we cover some anti-patterns that should be avoided.
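
To give a feel for expiring columns, here is a minimal in-memory model of the behaviour: a column written with a time-to-live stops being returned once that time has passed. This is an assumption-laden sketch, not Cassandra's API or storage mechanism; the class `ExpiringColumns` and its `put` and `get` methods are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ExpiringColumns:
    """Toy row of columns where each column may carry a TTL in seconds."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._cols: Dict[str, Tuple[str, Optional[float]]] = {}

    def put(self, name: str, value: str, ttl: Optional[float] = None) -> None:
        """Store a column; with a ttl it expires that many seconds from now."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._cols[name] = (value, expires_at)

    def get(self, name: str) -> Optional[str]:
        """Return the column value, or None if it is missing or has expired."""
        entry = self._cols.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            return None
        return value


# Drive the model with a fake clock: a session token that lives for 60 seconds.
now = [0.0]
row = ExpiringColumns(clock=lambda: now[0])
row.put("session_token", "abc123", ttl=60)
row.put("username", "dave")

before_expiry = row.get("session_token")
now[0] = 61.0
after_expiry = row.get("session_token")
still_there = row.get("username")
```

The appeal of the pattern is that the application never has to schedule its own clean-up of short-lived data such as sessions or caches.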