
1 General

1.1 Row-oriented

  • Partitioned row store, in which data is stored in sparse multidimensional hash tables.
    • sparse = any given row can have one or more columns, but rows don't all need to have the same columns
    • partitioned = each row has a unique key that makes its data accessible
      • Keys distribute the rows across multiple data stores.
  • Cassandra stores data in a multidimensional, sorted hash table.
  • The data in each column is stored as a separate entry in the hash table.
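
The two properties above (sparse rows, key-based distribution) can be pictured with a small Python sketch. This is only a toy model, not how Cassandra is implemented: the node names, the `SparseRowStore` class and the `node_for` helper are made up for illustration, and MD5 with a modulo stands in for Cassandra's real partitioner and token ring.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def node_for(partition_key: str) -> str:
    """Map a key to a node with a stable hash (MD5 is a stand-in, an assumption)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return NODES[token % len(NODES)]


class SparseRowStore:
    """Toy partitioned row store: node -> row key -> {column name: value}."""

    def __init__(self):
        self.nodes = {name: {} for name in NODES}

    def put(self, key, column, value):
        # Each column value is its own entry in the row's hash table.
        row = self.nodes[node_for(key)].setdefault(key, {})
        row[column] = value

    def get(self, key):
        return self.nodes[node_for(key)].get(key, {})


store = SparseRowStore()
store.put("alice", "email", "alice@example.com")
store.put("alice", "city", "Oslo")
store.put("bob", "email", "bob@example.com")  # sparse: bob has no "city" column

print(store.get("alice"))
print(store.get("bob"))
```

The same key always hashes to the same node, which is what makes a row reachable by its key alone.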

2 Data model

  • Column: a name/value pair
  • Row: a container for columns, referenced by a primary key (row key)
  • Table: a container for rows
  • Keyspace: a container for tables
  • Cluster: a container for keyspaces that spans one or more nodes
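
The containment hierarchy can be written down as nested containers. A minimal sketch, assuming plain Python dataclasses; the class and field names mirror the terms above but are otherwise invented for illustration and are not part of any Cassandra API.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    key: str
    columns: dict = field(default_factory=dict)    # column name -> value


@dataclass
class Table:
    name: str
    rows: dict = field(default_factory=dict)       # row key -> Row


@dataclass
class Keyspace:
    name: str
    tables: dict = field(default_factory=dict)     # table name -> Table


@dataclass
class Cluster:
    name: str
    nodes: list = field(default_factory=list)      # one or more nodes
    keyspaces: dict = field(default_factory=dict)  # keyspace name -> Keyspace


# Hypothetical example data
cluster = Cluster("demo", nodes=["node-a", "node-b"])
shop = cluster.keyspaces.setdefault("shop", Keyspace("shop"))
users = shop.tables.setdefault("users", Table("users"))
users.rows["alice"] = Row("alice", {"email": "alice@example.com"})

# Walk the hierarchy: cluster -> keyspace -> table -> row -> column
value = cluster.keyspaces["shop"].tables["users"].rows["alice"].columns["email"]
print(value)
```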

2.1 Clusters

Cassandra is designed to be distributed over several machines that operate together and appear as a single instance. This cluster, also called a ring, is the outermost structure.

2.2 Keyspaces

  • Outermost container for data
  • Container for tables
  • Defined by a name and set of attributes

2.3 Tables

  • Container for an ordered collection of rows
    • Each row is a container of columns
    • Ordering is determined by the columns that are identified as keys
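
To make the ordering idea concrete, here is a toy table that keeps its rows sorted by the columns declared as keys. It is a sketch under assumptions: `OrderedTable`, the `sensor`/`hour` key columns and the sample readings are hypothetical, and Python's `bisect` module stands in for whatever Cassandra does internally.

```python
import bisect


class OrderedTable:
    """Toy table whose rows stay sorted by the declared key columns."""

    def __init__(self, key_columns):
        self.key_columns = key_columns
        self._keys = []   # sorted list of key tuples
        self._rows = {}   # key tuple -> row (dict of columns)

    def insert(self, row):
        key = tuple(row[c] for c in self.key_columns)
        if key not in self._rows:
            bisect.insort(self._keys, key)  # keep keys in sorted order
        self._rows[key] = row               # same key overwrites the row

    def scan(self):
        return [self._rows[k] for k in self._keys]


readings = OrderedTable(key_columns=("sensor", "hour"))
readings.insert({"sensor": "s2", "hour": 1, "temp": 20.5})
readings.insert({"sensor": "s1", "hour": 2, "temp": 19.0})
readings.insert({"sensor": "s1", "hour": 1})  # sparse: no "temp" column

ordered_keys = [(r["sensor"], r["hour"]) for r in readings.scan()]
print(ordered_keys)  # [('s1', 1), ('s1', 2), ('s2', 1)]
```

Rows come back in key order regardless of insertion order, and non-key columns play no part in the ordering.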


3 Testing

The cassandra-stress tool can be used together with user-defined YAML files. It is quite flexible and works well for quickly testing schemas.

4 Noteworthy

4.1 A deep look into Cassandra's where clause

5 Resources