NoSQL designs trade the ACID (Atomicity, Consistency, Isolation, Durability) [2] guarantees of more traditional RDBMS solutions for scalability and high availability. Cassandra follows the BASE (Basically Available, Soft-state, Eventual consistency) [3] principles, which places it on the "available" and "partition tolerant" side of the CAP theorem [4] triangle, though such classifications are mostly meant to help answer the question "what is the default behaviour of the distributed system when a partition happens".
A Cassandra cluster is called a ring. Each node hosts multiple virtual nodes (vnodes), and each vnode is responsible for a single contiguous range of token values (a token is a hash of the row key). Cassandra is a peer-to-peer system where data is distributed among all nodes in the ring. Each node exchanges state information across the cluster every second using a gossip protocol. A partitioner determines how the data is distributed across the nodes in the cluster and which node holds the first copy of the data.
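The token ring can be sketched in a few lines of Python. This is only an illustration of the idea, not Cassandra's implementation: MD5 stands in for the Murmur3 hash (which is not in the standard library), and the `Ring` class, `token_for` function and the choice of 8 vnodes per node are assumptions made for the example.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Map a key to a signed 64-bit token.

    MD5 is a stand-in here; the Murmur3Partitioner uses a different hash.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class Ring:
    def __init__(self, nodes, vnodes_per_node=8):
        # Each physical node owns several tokens (vnodes) scattered around the ring.
        self._tokens = sorted(
            (token_for(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )

    def owner(self, row_key: str) -> str:
        """Return the node that holds the first copy of row_key."""
        token = token_for(row_key)
        # First vnode whose token is >= the key's token, wrapping around the ring.
        idx = bisect.bisect_left(self._tokens, (token, ""))
        return self._tokens[idx % len(self._tokens)][1]


ring = Ring(["10.176.64.41", "10.176.65.71"])
for key in ("alice", "bob", "carol"):
    print(key, "->", ring.owner(key))
```

Because a key's owner depends only on its token, any node can compute where a row lives without asking a central coordinator.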
When a client sends a write request it can connect to any node in the ring. That node is called the coordinator node. In turn it delegates the write request to the StorageProxy service, which determines which nodes are responsible for that data. It identifies the nodes using the replication strategy and a mechanism called a Snitch. A Snitch defines the groups of machines that the replication strategy uses to place replicas of the data. Once the replica nodes are identified, the coordinator node sends a RowMutation message to them and then waits for confirmation that the data was written. It only waits for as many nodes as the configured consistency level requires. If the nodes are in multiple datacenters, the message is sent to one replica in each datacenter with a special header telling it to forward the request to the other nodes in that datacenter. A node that receives the RowMutation message first appends it to the commit log and then writes it to a MemTable. When the MemTable fills up it is flushed to disk in a structure called an SSTable. Periodically the SSTables are merged in a process called compaction.
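The order of operations on a replica (commit log, MemTable, SSTable, compaction) can be illustrated with a toy, in-memory model. Everything here is a simplification for the sake of the example: the `ToyNode` class and its `memtable_limit` parameter are hypothetical, and real SSTables are on-disk files rather than Python lists.

```python
import time


class ToyNode:
    """Commit log -> MemTable -> SSTable, heavily simplified and in memory only."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every mutation
        self.memtable = {}     # key -> (value, timestamp)
        self.sstables = []     # immutable, key-sorted lists of (key, (value, timestamp))
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp=None):
        ts = time.time_ns() if timestamp is None else timestamp
        self.commit_log.append((key, value, ts))       # 1. log the mutation first
        current = self.memtable.get(key)
        if current is None or ts >= current[1]:
            self.memtable[key] = (value, ts)           # 2. then update the MemTable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                               # 3. flush once it fills up

    def flush(self):
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def compact(self):
        """Merge all SSTables, keeping the cell with the newest timestamp per key."""
        merged = {}
        for table in self.sstables:
            for key, (value, ts) in table:
                if key not in merged or ts > merged[key][1]:
                    merged[key] = (value, ts)
        self.sstables = [sorted(merged.items())] if merged else []


node = ToyNode()
node.write("k1", "v1", timestamp=1)
node.write("k2", "v2", timestamp=2)    # MemTable full: first SSTable written
node.write("k1", "v1b", timestamp=3)
node.write("k3", "v3", timestamp=4)    # MemTable full: second SSTable written
node.compact()
print(node.sstables)
```

After compaction the two SSTables collapse into one, and the older value of `k1` is gone.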
When a client needs to read data back, it again connects to any node. The StorageProxy gets the list of nodes containing the requested key based on the replication strategy, and the proxy node sorts the candidate nodes by proximity using the (configurable) Snitch function. Once a node is selected, the read request is forwarded to it for execution. That node first attempts to read the data from its MemTable. If the data is not in memory, Cassandra looks in the SSTables on disk, using a bloom filter to skip files that cannot contain the key. At the same time, other nodes that are responsible for storing the same data respond with just a digest, without the actual data. If the digest does not match on some of the nodes, a data repair process is started and those nodes will eventually get the latest data and become consistent. For further information on the architecture of Cassandra refer to [5].
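The digest comparison and the repair that follows can be sketched as below. This is a minimal model under stated assumptions: each replica is a plain dict, every replica already holds some version of the key, MD5 is just a convenient digest for the example, and the function names are made up for illustration.

```python
import hashlib


def digest(cell):
    """Hash of a (value, timestamp) pair, standing in for the digest a replica returns."""
    value, ts = cell
    return hashlib.md5(f"{value}:{ts}".encode("utf-8")).hexdigest()


def read_with_repair(key, replicas):
    """Read key from a list of replicas (dicts mapping key -> (value, timestamp)).

    The first replica is treated as the closest one and returns the full data;
    the others only return a digest.
    """
    data = replicas[0][key]
    digests = [digest(r[key]) for r in replicas[1:]]
    if all(d == digest(data) for d in digests):
        return data
    # Mismatch: compare the full cells, keep the newest one,
    # and write it back to the replicas that are out of date.
    newest = max((r[key] for r in replicas), key=lambda cell: cell[1])
    for r in replicas:
        if r[key] != newest:
            r[key] = newest
    return newest


a = {"user:1": ("old@example.com", 100)}
b = {"user:1": ("new@example.com", 200)}
print(read_with_repair("user:1", [a, b]))
print(a == b)
```

Sending digests instead of full rows keeps the common case cheap: the extra data transfer only happens when replicas actually disagree.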
The data in Cassandra is stored in a nested hash map (a hash map whose values are themselves hash maps of key-value pairs). Its levels are as follows:
The keyspace is similar to a database and it stores the column families, along with other properties like the replication factor and replica placement strategies. The properties of the keyspace apply to all tables contained within the keyspace.
The column family is similar to a table and contains a collection of rows, where each row contains cells.
A cell is the smallest data unit, a triplet that holds data in the form "key:value:time". The timestamp is used to resolve discrepancies when inconsistent digests trigger a data repair.
The row key uniquely identifies a row. Since each node in a ring contains only a subset of rows (the rows are distributed among the nodes) the row keys are sharded as well.
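The nesting described above maps naturally onto Python dictionaries. The sketch below is only a mental model: the keyspace, column family, row and column names are made-up sample data, and the `put` helper is hypothetical.

```python
from collections import namedtuple

# A cell is the "key:value:time" triplet.
Cell = namedtuple("Cell", ["name", "value", "timestamp"])

# keyspace -> column family -> row key -> column name -> Cell
store = {
    "demo_keyspace": {
        "users": {
            "alice": {
                "email": Cell("email", "alice@example.com", 1464548000),
                "city": Cell("city", "Paris", 1464548000),
            },
        },
    },
}


def put(keyspace, column_family, row_key, name, value, timestamp):
    """Insert a cell, keeping the existing one if it carries a newer timestamp."""
    row = store[keyspace][column_family].setdefault(row_key, {})
    current = row.get(name)
    if current is None or timestamp >= current.timestamp:
        row[name] = Cell(name, value, timestamp)


put("demo_keyspace", "users", "alice", "city", "Berlin", 1464548100)
put("demo_keyspace", "users", "bob", "email", "bob@example.com", 1464548200)
print(store["demo_keyspace"]["users"]["alice"]["city"].value)
print(sorted(store["demo_keyspace"]["users"]))
```

Note that rows in the same column family do not need to carry the same set of columns: `bob` has no `city` cell at all.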
With all this in mind let's deploy a two-node, single-DC cluster. First let's install the prerequisites:
[cassandra-nodes]$ lsb_release -d
Description: Ubuntu 14.04.4 LTS
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys F758CE318D77295D
[cassandra-nodes]$ gpg --export --armor F758CE318D77295D | sudo apt-key add -
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys 2B5C1B00
[cassandra-nodes]$ gpg --export --armor 2B5C1B00 | sudo apt-key add -
[cassandra-nodes]$ gpg --keyserver pgp.mit.edu --recv-keys 0353B12C
[cassandra-nodes]$ gpg --export --armor 0353B12C | sudo apt-key add -
[cassandra-nodes]$ add-apt-repository ppa:webupd8team/java
[cassandra-nodes]$ apt-get update
[cassandra-nodes]$ apt-get install oracle-java8-installer
[cassandra-nodes]$ apt-get install libjna-java
[cassandra-nodes]$ update-alternatives --list java
/usr/lib/jvm/java-8-oracle/jre/bin/java
[cassandra-nodes]$ java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
Then let's install Cassandra:
[cassandra-nodes]$ echo "deb http://www.apache.org/dist/cassandra/debian 35x main" > /etc/apt/sources.list.d/cassandra.list
[cassandra-nodes]$ apt-get update
[cassandra-nodes]$ apt-get install cassandra
The main configuration file is very well documented and most of the defaults are quite sensible. The few changes for the purpose of this blog are as follows:
[cassandra-nodes]$ cat /etc/cassandra/cassandra.yaml
cluster_name: 'TestCluster'
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
    - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.176.64.41"
disk_optimization_strategy: spinning
listen_interface: eth1
rpc_interface: eth1
endpoint_snitch: GossipingPropertyFileSnitch
[cassandra-nodes]$ cat /etc/cassandra/cassandra-rackdc.properties
dc=iad3
rack=rack1
[cassandra-nodes]$ /etc/init.d/cassandra start
[cassandra-node-1]$ cat /var/log/cassandra/system.log | |
INFO [main] 2016-05-29 18:53:46,794 CassandraDaemon.java:428 - Par Survivor Space Heap memory: init = 20971520(20480K) used = 0(0K) committed = 20971520(20480K) max = 20971520(20480K) | |
INFO [main] 2016-05-29 18:53:46,795 CassandraDaemon.java:428 - CMS Old Gen Heap memory: init = 836763648(817152K) used = 0(0K) committed = 836763648(817152K) max = 836763648(817152K) | |
INFO [main] 2016-05-29 18:53:46,795 CassandraDaemon.java:430 - Classpath: /etc/cassandra:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/cassandra/hs_err_1464548024.log] | |
INFO [main] 2016-05-29 18:53:46,853 CLibrary.java:126 - JNA mlockall successful | |
WARN [main] 2016-05-29 18:53:46,854 StartupChecks.java:118 - jemalloc shared library could not be preloaded to speed up memory allocations | |
WARN [main] 2016-05-29 18:53:46,854 StartupChecks.java:150 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info. | |
INFO [main] 2016-05-29 18:53:46,856 SigarLibrary.java:44 - Initializing SIGAR library | |
WARN [main] 2016-05-29 18:53:46,869 SigarLibrary.java:174 - Cassandra server running in degraded mode. Is swap disabled? : true, Address space adequate? : true, nofile limit adequate? : true, nproc limit adequate? : false | |
INFO [main] 2016-05-29 18:53:47,696 ColumnFamilyStore.java:395 - Initializing system.IndexInfo | |
INFO [main] 2016-05-29 18:53:49,147 ColumnFamilyStore.java:395 - Initializing system.batches | |
INFO [main] 2016-05-29 18:53:49,157 ColumnFamilyStore.java:395 - Initializing system.paxos | |
INFO [main] 2016-05-29 18:53:49,172 ColumnFamilyStore.java:395 - Initializing system.local | |
INFO [SSTableBatchOpen:2] 2016-05-29 18:53:49,194 BufferPool.java:226 - Global buffer pool is enabled, when pool is exahusted (max is 512 mb) it will allocate on heap | |
INFO [main] 2016-05-29 18:53:49,234 CacheService.java:113 - Initializing key cache with capacity of 48 MBs. | |
INFO [main] 2016-05-29 18:53:49,244 CacheService.java:135 - Initializing row cache with capacity of 0 MBs | |
INFO [main] 2016-05-29 18:53:49,246 CacheService.java:164 - Initializing counter cache with capacity of 24 MBs | |
INFO [main] 2016-05-29 18:53:49,247 CacheService.java:175 - Scheduling counter cache save to every 7200 seconds (going to save all keys). | |
INFO [main] 2016-05-29 18:53:49,272 ColumnFamilyStore.java:395 - Initializing system.peers | |
INFO [main] 2016-05-29 18:53:49,297 ColumnFamilyStore.java:395 - Initializing system.peer_events | |
INFO [main] 2016-05-29 18:53:49,308 ColumnFamilyStore.java:395 - Initializing system.range_xfers | |
INFO [main] 2016-05-29 18:53:49,316 ColumnFamilyStore.java:395 - Initializing system.compaction_history | |
INFO [main] 2016-05-29 18:53:49,331 ColumnFamilyStore.java:395 - Initializing system.sstable_activity | |
INFO [main] 2016-05-29 18:53:49,344 ColumnFamilyStore.java:395 - Initializing system.size_estimates | |
INFO [main] 2016-05-29 18:53:49,355 ColumnFamilyStore.java:395 - Initializing system.available_ranges | |
INFO [main] 2016-05-29 18:53:49,365 ColumnFamilyStore.java:395 - Initializing system.views_builds_in_progress | |
INFO [main] 2016-05-29 18:53:49,371 ColumnFamilyStore.java:395 - Initializing system.built_views | |
INFO [main] 2016-05-29 18:53:49,377 ColumnFamilyStore.java:395 - Initializing system.hints | |
INFO [main] 2016-05-29 18:53:49,384 ColumnFamilyStore.java:395 - Initializing system.batchlog | |
INFO [main] 2016-05-29 18:53:49,390 ColumnFamilyStore.java:395 - Initializing system.schema_keyspaces | |
INFO [main] 2016-05-29 18:53:49,395 ColumnFamilyStore.java:395 - Initializing system.schema_columnfamilies | |
INFO [main] 2016-05-29 18:53:49,401 ColumnFamilyStore.java:395 - Initializing system.schema_columns | |
INFO [main] 2016-05-29 18:53:49,406 ColumnFamilyStore.java:395 - Initializing system.schema_triggers | |
INFO [main] 2016-05-29 18:53:49,411 ColumnFamilyStore.java:395 - Initializing system.schema_usertypes | |
INFO [main] 2016-05-29 18:53:49,417 ColumnFamilyStore.java:395 - Initializing system.schema_functions | |
INFO [main] 2016-05-29 18:53:49,423 ColumnFamilyStore.java:395 - Initializing system.schema_aggregates | |
INFO [main] 2016-05-29 18:53:50,136 StorageService.java:600 - Populating token metadata from system tables | |
INFO [main] 2016-05-29 18:53:50,298 StorageService.java:607 - Token metadata: Normal Tokens: | |
/10.176.64.41:[-9093342176872828671, ...] | |
INFO [main] 2016-05-29 18:53:50,317 ColumnFamilyStore.java:395 - Initializing system_schema.keyspaces | |
INFO [main] 2016-05-29 18:53:50,367 ColumnFamilyStore.java:395 - Initializing system_schema.tables | |
INFO [main] 2016-05-29 18:53:50,416 ColumnFamilyStore.java:395 - Initializing system_schema.columns | |
INFO [main] 2016-05-29 18:53:50,449 ColumnFamilyStore.java:395 - Initializing system_schema.triggers | |
INFO [main] 2016-05-29 18:53:50,497 ColumnFamilyStore.java:395 - Initializing system_schema.dropped_columns | |
INFO [main] 2016-05-29 18:53:50,515 ColumnFamilyStore.java:395 - Initializing system_schema.views | |
INFO [main] 2016-05-29 18:53:50,534 ColumnFamilyStore.java:395 - Initializing system_schema.types | |
INFO [main] 2016-05-29 18:53:50,552 ColumnFamilyStore.java:395 - Initializing system_schema.functions | |
INFO [main] 2016-05-29 18:53:50,565 ColumnFamilyStore.java:395 - Initializing system_schema.aggregates | |
INFO [main] 2016-05-29 18:53:50,578 ColumnFamilyStore.java:395 - Initializing system_schema.indexes | |
INFO [main] 2016-05-29 18:53:50,788 ColumnFamilyStore.java:395 - Initializing system_distributed.parent_repair_history | |
INFO [main] 2016-05-29 18:53:50,793 ColumnFamilyStore.java:395 - Initializing system_distributed.repair_history | |
INFO [main] 2016-05-29 18:53:50,803 ColumnFamilyStore.java:395 - Initializing system_auth.resource_role_permissons_index | |
INFO [main] 2016-05-29 18:53:50,830 ColumnFamilyStore.java:395 - Initializing system_auth.role_members | |
INFO [main] 2016-05-29 18:53:50,838 ColumnFamilyStore.java:395 - Initializing system_auth.role_permissions | |
INFO [main] 2016-05-29 18:53:50,852 ColumnFamilyStore.java:395 - Initializing system_auth.roles | |
INFO [main] 2016-05-29 18:53:50,866 ColumnFamilyStore.java:395 - Initializing system_traces.events | |
INFO [main] 2016-05-29 18:53:50,870 ColumnFamilyStore.java:395 - Initializing system_traces.sessions | |
INFO [pool-2-thread-1] 2016-05-29 18:53:50,875 AutoSavingCache.java:189 - reading saved cache /var/lib/cassandra/saved_caches/KeyCache-d.db | |
INFO [pool-2-thread-1] 2016-05-29 18:53:50,890 AutoSavingCache.java:165 - Completed loading (17 ms; 25 keys) KeyCache cache | |
INFO [main] 2016-05-29 18:53:50,906 CommitLog.java:171 - Replaying /var/lib/cassandra/commitlog/CommitLog-6-1464374244502.log, /var/lib/cassandra/commitlog/CommitLog-6-1464374244503.log | |
INFO [ScheduledTasks:1] 2016-05-29 18:53:51,729 TokenMetadata.java:448 - Updating topology for all endpoints that have changed | |
INFO [main] 2016-05-29 18:53:52,088 CommitLog.java:173 - Log replay complete, 43 replayed mutations | |
INFO [main] 2016-05-29 18:53:52,093 StorageService.java:600 - Populating token metadata from system tables | |
INFO [main] 2016-05-29 18:53:52,114 StorageService.java:607 - Token metadata: Normal Tokens: | |
/10.176.64.41:[-9093342176872828671, ...] | |
INFO [main] 2016-05-29 18:53:52,197 StorageService.java:618 - Cassandra version: 3.5 | |
INFO [main] 2016-05-29 18:53:52,198 StorageService.java:619 - Thrift API version: 20.1.0 | |
INFO [main] 2016-05-29 18:53:52,199 StorageService.java:620 - CQL supported versions: 3.4.0 (default: 3.4.0) | |
INFO [main] 2016-05-29 18:53:52,274 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 48 MB and a resize interval of 60 minutes | |
INFO [main] 2016-05-29 18:53:52,276 StorageService.java:639 - Loading persisted ring state | |
INFO [main] 2016-05-29 18:53:52,291 StorageService.java:828 - Starting up server gossip | |
INFO [main] 2016-05-29 18:53:52,366 TokenMetadata.java:429 - Updating topology for /10.176.65.71 | |
INFO [main] 2016-05-29 18:53:52,368 TokenMetadata.java:429 - Updating topology for /10.176.65.71 | |
INFO [main] 2016-05-29 18:53:52,404 MessagingService.java:557 - Starting Messaging Service on /10.176.65.71:7000 (eth1) | |
INFO [MessagingService-Incoming-/10.176.64.41] 2016-05-29 18:53:52,459 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds | |
INFO [main] 2016-05-29 18:53:52,480 StorageService.java:1003 - Using saved tokens [-1101707182484276762, ...] | |
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 18:53:52,518 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 18:53:52,577 Gossiper.java:1028 - Node /10.176.64.41 has restarted, now UP | |
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 18:53:52,580 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 18:53:52,589 StorageService.java:2081 - Node /10.176.64.41 state jump to NORMAL | |
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,593 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP | |
INFO [main] 2016-05-29 18:53:52,640 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL | |
INFO [GossipStage:1] 2016-05-29 18:53:52,655 TokenMetadata.java:429 - Updating topology for /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 18:53:52,657 TokenMetadata.java:429 - Updating topology for /10.176.64.41 | |
INFO [main] 2016-05-29 18:53:52,677 AuthCache.java:172 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000) | |
INFO [main] 2016-05-29 18:53:52,682 CassandraDaemon.java:639 - Waiting for gossip to settle before accepting client requests... | |
WARN [GossipTasks:1] 2016-05-29 18:53:53,363 FailureDetector.java:287 - Not marking nodes down due to local pause of 5620946758 > 5000000000 | |
INFO [main] 2016-05-29 18:54:00,685 CassandraDaemon.java:670 - No gossip backlog; proceeding | |
INFO [main] 2016-05-29 18:54:00,764 NativeTransportService.java:70 - Netty using native Epoll event loop | |
INFO [main] 2016-05-29 18:54:00,813 Server.java:161 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c] | |
INFO [main] 2016-05-29 18:54:00,813 Server.java:162 - Starting listening for CQL clients on /10.176.65.71:9042 (unencrypted)... | |
INFO [main] 2016-05-29 18:54:00,850 CassandraDaemon.java:471 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it | |
[cassandra-node-2]$ cat /var/log/cassandra/system.log | |
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 18:53:52,420 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71 | |
INFO [GossipStage:1] 2016-05-29 18:53:52,562 Gossiper.java:1028 - Node /10.176.65.71 has restarted, now UP | |
INFO [GossipStage:1] 2016-05-29 18:53:52,576 TokenMetadata.java:429 - Updating topology for /10.176.65.71 | |
INFO [GossipStage:1] 2016-05-29 18:53:52,577 TokenMetadata.java:429 - Updating topology for /10.176.65.71 | |
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 18:53:52,589 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71 | |
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,626 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP | |
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP | |
INFO [SharedPool-Worker-2] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP | |
INFO [SharedPool-Worker-1] 2016-05-29 18:53:52,679 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP | |
INFO [GossipStage:1] 2016-05-29 18:53:52,864 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL |
The important options are the cluster name, the authentication and authorization settings, the IP of the seed node and the Snitch. In this example I am using the GossipingPropertyFileSnitch, where you specify the DC and rack each node is in (in cassandra-rackdc.properties). As long as the nodes are configured with the same cluster name and can reach the seed node, they'll discover each other and form the ring using the gossip protocol.
A Cassandra cluster can be managed with the nodetool utility. Here are a few examples of inspecting the cluster, removing a live node and then adding it back to the cluster:
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.65.71 283.62 KB 256 100.0% 50a36028-391b-4aad-a4b4-3cf2625f2211 rack1
UN 10.176.64.41 243.39 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
[cassandra-node-2]$ nodetool info
ID : 50a36028-391b-4aad-a4b4-3cf2625f2211
Gossip active : true
Thrift active : false
Native Transport active: true
Load : 253.09 KB
Generation No : 1464548032
Uptime (seconds) : 1445
Heap Memory (MB) : 156.59 / 978.00
Off Heap Memory (MB) : 0.00
Data Center : iad3
Rack : rack1
Exceptions : 0
Key Cache : entries 20, size 1.66 KB, capacity 48 MB, 81 hits, 115 requests, 0.704 recent hit rate, 14400 save period in seconds
Row Cache : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache : entries 0, size 0 bytes, capacity 24 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Token : (invoke with -T/--tokens to see all 256 tokens)
[cassandra-node-2]$ nodetool describecluster
Cluster Information:
Name: TestCluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
34762ade-095c-3c37-8a04-4b6546170c78: [10.176.65.71, 10.176.64.41]
[cassandra-node-2]$ nodetool -h localhost decommission
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.64.41 263.69 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
[cassandra-node-2]$ /etc/init.d/cassandra stop
[cassandra-node-2]$ rm -rf /var/lib/cassandra/data/*
[cassandra-node-2]$ vim /etc/cassandra/jvm.options
-Dcassandra.replace_address=10.176.65.71
[cassandra-node-2]$ /etc/init.d/cassandra start
[cassandra-node-1]$ tail -50 /var/log/cassandra/system.log | |
INFO [GossipStage:1] 2016-05-29 19:30:57,526 Gossiper.java:1009 - InetAddress /10.176.65.71 is now DOWN | |
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 19:30:57,531 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71 | |
INFO [HANDSHAKE-/10.176.65.71] 2016-05-29 19:30:58,126 OutboundTcpConnection.java:514 - Handshaking version with /10.176.65.71 | |
INFO [STREAM-INIT-/10.176.65.71:39575] 2016-05-29 19:31:29,617 StreamResultFuture.java:114 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb ID#0] Creating new streaming plan for Bootstrap | |
INFO [STREAM-INIT-/10.176.65.71:39575] 2016-05-29 19:31:29,618 StreamResultFuture.java:121 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Received streaming plan for Bootstrap | |
INFO [STREAM-INIT-/10.176.65.71:57654] 2016-05-29 19:31:29,618 StreamResultFuture.java:121 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Received streaming plan for Bootstrap | |
INFO [STREAM-IN-/10.176.65.71] 2016-05-29 19:31:29,702 StreamResultFuture.java:185 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Session with /10.176.65.71 is complete | |
INFO [STREAM-IN-/10.176.65.71] 2016-05-29 19:31:29,702 StreamResultFuture.java:217 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] All sessions completed | |
INFO [SharedPool-Worker-1] 2016-05-29 19:31:30,162 Gossiper.java:994 - InetAddress /10.176.65.71 is now UP | |
[cassandra-node-2]$ tail -50 /var/log/cassandra/system.log | |
INFO [pool-2-thread-1] 2016-05-29 19:30:54,741 AutoSavingCache.java:165 - Completed loading (41 ms; 4 keys) KeyCache cache | |
INFO [main] 2016-05-29 19:30:54,754 CommitLog.java:171 - Replaying /var/lib/cassandra/commitlog/CommitLog-6-1464548029096.log, /var/lib/cassandra/commitlog/CommitLog-6-1464548029097.log | |
INFO [main] 2016-05-29 19:30:55,744 CommitLog.java:173 - Log replay complete, 168 replayed mutations | |
INFO [main] 2016-05-29 19:30:55,745 StorageService.java:600 - Populating token metadata from system tables | |
INFO [main] 2016-05-29 19:30:55,776 StorageService.java:607 - Token metadata: Normal Tokens: | |
/10.176.64.41:[-9093342176872828671, ... ] | |
INFO [main] 2016-05-29 19:30:55,877 StorageService.java:618 - Cassandra version: 3.5 | |
INFO [main] 2016-05-29 19:30:55,894 StorageService.java:619 - Thrift API version: 20.1.0 | |
INFO [main] 2016-05-29 19:30:55,894 StorageService.java:620 - CQL supported versions: 3.4.0 (default: 3.4.0) | |
INFO [main] 2016-05-29 19:30:55,967 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 48 MB and a resize interval of 60 minutes | |
INFO [main] 2016-05-29 19:30:55,968 StorageService.java:639 - Loading persisted ring state | |
INFO [main] 2016-05-29 19:30:55,990 StorageService.java:522 - Gathering node replacement information for /10.176.65.71 | |
INFO [main] 2016-05-29 19:30:55,995 MessagingService.java:557 - Starting Messaging Service on /10.176.65.71:7000 (eth1) | |
INFO [MessagingService-Incoming-/10.176.64.41] 2016-05-29 19:30:56,041 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds | |
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 19:30:56,053 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 19:30:56,097 Gossiper.java:1028 - Node /10.176.64.41 has restarted, now UP | |
INFO [GossipStage:1] 2016-05-29 19:30:56,103 Gossiper.java:1009 - InetAddress /10.176.65.71 is now DOWN | |
INFO [SharedPool-Worker-1] 2016-05-29 19:30:56,107 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP | |
INFO [ScheduledTasks:1] 2016-05-29 19:30:56,280 TokenMetadata.java:448 - Updating topology for all endpoints that have changed | |
INFO [main] 2016-05-29 19:30:57,052 StorageService.java:828 - Starting up server gossip | |
INFO [main] 2016-05-29 19:30:57,195 StorageService.java:1323 - JOINING: waiting for ring information | |
INFO [GossipStage:1] 2016-05-29 19:30:58,119 Gossiper.java:1030 - Node /10.176.64.41 is now part of the cluster | |
INFO [GossipStage:1] 2016-05-29 19:30:58,121 StorageService.java:2081 - Node /10.176.64.41 state jump to NORMAL | |
INFO [SharedPool-Worker-1] 2016-05-29 19:30:58,135 Gossiper.java:994 - InetAddress /10.176.64.41 is now UP | |
INFO [HANDSHAKE-/10.176.64.41] 2016-05-29 19:30:58,142 OutboundTcpConnection.java:514 - Handshaking version with /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 19:30:58,146 TokenMetadata.java:429 - Updating topology for /10.176.64.41 | |
INFO [GossipStage:1] 2016-05-29 19:30:58,147 TokenMetadata.java:429 - Updating topology for /10.176.64.41 | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,510 ColumnFamilyStore.java:395 - Initializing system_traces.events | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,517 ColumnFamilyStore.java:395 - Initializing system_traces.sessions | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,534 ColumnFamilyStore.java:395 - Initializing system_distributed.parent_repair_history | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,540 ColumnFamilyStore.java:395 - Initializing system_distributed.repair_history | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,557 ColumnFamilyStore.java:395 - Initializing system_auth.resource_role_permissons_index | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,563 ColumnFamilyStore.java:395 - Initializing system_auth.role_members | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,570 ColumnFamilyStore.java:395 - Initializing system_auth.role_permissions | |
INFO [InternalResponseStage:1] 2016-05-29 19:30:58,584 ColumnFamilyStore.java:395 - Initializing system_auth.roles | |
WARN [GossipTasks:1] 2016-05-29 19:30:59,118 FailureDetector.java:287 - Not marking nodes down due to local pause of 6834867440 > 5000000000 | |
INFO [main] 2016-05-29 19:30:59,196 StorageService.java:1323 - JOINING: waiting for schema information to complete | |
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: schema complete, ready to bootstrap | |
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: waiting for pending range calculation | |
INFO [main] 2016-05-29 19:30:59,197 StorageService.java:1323 - JOINING: calculation complete, ready to bootstrap | |
INFO [main] 2016-05-29 19:31:29,198 StorageService.java:1323 - JOINING: Replacing a node with token(s): [-1101707182484276762, ... ] | |
INFO [main] 2016-05-29 19:31:29,236 StorageService.java:1323 - JOINING: Starting to bootstrap... | |
INFO [main] 2016-05-29 19:31:29,596 StreamResultFuture.java:88 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Executing streaming plan for Bootstrap | |
INFO [StreamConnectionEstablisher:1] 2016-05-29 19:31:29,607 StreamSession.java:237 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Starting streaming to /10.176.64.41 | |
INFO [StreamConnectionEstablisher:1] 2016-05-29 19:31:29,619 StreamCoordinator.java:264 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb, ID#0] Beginning stream session with /10.176.64.41 | |
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,697 StreamResultFuture.java:185 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] Session with /10.176.64.41 is complete | |
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,726 StreamResultFuture.java:217 - [Stream #f0f8c060-25d3-11e6-935c-d55c35b994bb] All sessions completed | |
INFO [STREAM-IN-/10.176.64.41] 2016-05-29 19:31:29,756 StorageService.java:1376 - Bootstrap completed! for the tokens [-1101707182484276762, ... ] | |
INFO [main] 2016-05-29 19:31:29,775 StorageService.java:2081 - Node /10.176.65.71 state jump to NORMAL | |
WARN [main] 2016-05-29 19:31:29,776 StorageService.java:2093 - Not updating token metadata for /10.176.65.71 because I am replacing it | |
INFO [main] 2016-05-29 19:31:29,788 AuthCache.java:172 - (Re)initializing CredentialsCache (validity period/update interval/max entries) (2000/2000/1000) | |
INFO [main] 2016-05-29 19:31:29,792 CassandraDaemon.java:639 - Waiting for gossip to settle before accepting client requests... | |
INFO [main] 2016-05-29 19:31:37,794 CassandraDaemon.java:670 - No gossip backlog; proceeding | |
INFO [main] 2016-05-29 19:31:37,855 NativeTransportService.java:70 - Netty using native Epoll event loop | |
INFO [main] 2016-05-29 19:31:37,903 Server.java:161 - Using Netty Version: [netty-buffer=netty-buffer-4.0.23.Final.208198c, netty-codec=netty-codec-4.0.23.Final.208198c, netty-codec-http=netty-codec-http-4.0.23.Final.208198c, netty-codec-socks=netty-codec-socks-4.0.23.Final.208198c, netty-common=netty-common-4.0.23.Final.208198c, netty-handler=netty-handler-4.0.23.Final.208198c, netty-transport=netty-transport-4.0.23.Final.208198c, netty-transport-rxtx=netty-transport-rxtx-4.0.23.Final.208198c, netty-transport-sctp=netty-transport-sctp-4.0.23.Final.208198c, netty-transport-udt=netty-transport-udt-4.0.23.Final.208198c] | |
INFO [main] 2016-05-29 19:31:37,903 Server.java:162 - Starting listening for CQL clients on /10.176.65.71:9042 (unencrypted)... | |
INFO [main] 2016-05-29 19:31:37,943 CassandraDaemon.java:471 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it | |
[cassandra-node-2]$ nodetool -h localhost status
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.65.71 176.59 KB 256 100.0% 50a36028-391b-4aad-a4b4-3cf2625f2211 rack1
UN 10.176.64.41 263.69 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
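If you want to check the ring state from a script rather than by eye, the status output can be parsed. The following is a rough sketch that assumes the column layout shown in this post (other Cassandra versions may print the table differently); the `parse_status` and `all_up_normal` helpers are hypothetical names, and the sample text is copied from the output above.

```python
import re

# One data row of `nodetool status`: state, address, load, tokens, owns, host ID, rack.
ROW = re.compile(
    r"^(?P<state>[UD][NLJM])\s+(?P<address>\S+)\s+(?P<load>.+?)\s+"
    r"(?P<tokens>\d+)\s+(?P<owns>\S+)\s+(?P<host_id>[0-9a-f-]{36})\s+(?P<rack>\S+)$"
)


def parse_status(text):
    """Return one dict per node row found in the output of `nodetool status`."""
    nodes = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            nodes.append(match.groupdict())
    return nodes


def all_up_normal(text, expected_nodes):
    """True if exactly expected_nodes rows were found and all are Up/Normal."""
    nodes = parse_status(text)
    return len(nodes) == expected_nodes and all(n["state"] == "UN" for n in nodes)


sample = """\
Datacenter: iad3
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 10.176.65.71 283.62 KB 256 100.0% 50a36028-391b-4aad-a4b4-3cf2625f2211 rack1
UN 10.176.64.41 243.39 KB 256 100.0% 4b8377bf-4f85-415f-acae-575ee2cd69dd rack1
"""
print(all_up_normal(sample, expected_nodes=2))
```

In practice the text would come from running `nodetool status` on a node, for example through `subprocess.run`, instead of a hard-coded sample.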
To take a snapshot of the data for backup, run the following:
[cassandra-node-2]$ nodetool -h localhost snapshot
Requested creating snapshot(s) for [all keyspaces] with snapshot name [1464548620814] and options {skipFlush=false}
Snapshot directory: 1464548620814
[cassandra-node-2]$ ls -la /var/lib/cassandra/data/system_schema/indexes-0feb57ac311f382fba6d9024d305702f/snapshots/
total 12
drwxr-xr-x 3 cassandra cassandra 4096 May 29 19:03 .
drwxr-xr-x 4 cassandra cassandra 4096 May 29 19:03 ..
drwxr-xr-x 2 cassandra cassandra 4096 May 29 19:03 1464548620814
[cassandra-node-2]$ nodetool -h localhost clearsnapshot
Requested clearing snapshot(s) for [all keyspaces]
[cassandra-node-2]$ ls -la /var/lib/cassandra/data/system_schema/types-5a8b1ca866023f77a0459273d308917a/snapshots/
ls: cannot access /var/lib/cassandra/data/system_schema/types-5a8b1ca866023f77a0459273d308917a/snapshots/: No such file or directory
Once the snapshot is complete, you can copy it to a secure location. To enable incremental backups, edit cassandra.yaml and set "incremental_backups: true". To restore the data, stop Cassandra, clean up the data directories (delete only the *.db files) and the commitlog directory, and then copy the snapshot files back over.
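Copying a snapshot off the node can be scripted. The sketch below is one possible approach, not a prescribed procedure: the function name `copy_snapshot` and the backup layout are hypothetical, and it assumes the on-disk layout seen in the listing above, namely `<data_dir>/<keyspace>/<table>-<id>/snapshots/<snapshot_name>/`.

```python
import shutil
from pathlib import Path
from typing import List


def copy_snapshot(data_dir: Path, snapshot_name: str, backup_dir: Path) -> List[Path]:
    """Copy every table's snapshot called `snapshot_name` into `backup_dir`.

    The keyspace and table directory names are preserved under
    <backup_dir>/<snapshot_name>/ so the files can be matched back to
    their tables when restoring.
    """
    copied = []
    # Assumed layout: <data_dir>/<keyspace>/<table>-<id>/snapshots/<name>/
    for snapshot in sorted(data_dir.glob(f"*/*/snapshots/{snapshot_name}")):
        if not snapshot.is_dir():
            continue
        table_dir = snapshot.parent.parent
        target = backup_dir / snapshot_name / table_dir.parent.name / table_dir.name
        # copytree creates missing parent directories and raises
        # FileExistsError if the target is already there, so an existing
        # backup is never silently overwritten.
        shutil.copytree(snapshot, target)
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Paths and snapshot name taken from the session above; the backup
    # destination is a placeholder.
    done = copy_snapshot(Path("/var/lib/cassandra/data"),
                         "1464548620814",
                         Path("/mnt/backup/cassandra"))
    print(f"copied {len(done)} table snapshot(s)")
```

The copy has to happen before `nodetool clearsnapshot` is run, since that command removes the snapshot directories, as the second `ls` in the session shows.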
Finally, let's manipulate some data: add a new user, change the default cassandra password, create a keyspace and a table, and insert a record:
[cassandra-node-1]$ cqlsh 10.176.65.71 -ucassandra -pcassandra
Connected to TestCluster at 10.176.65.71:9042.
[cqlsh 5.0.1 | Cassandra 3.5 | CQL spec 3.4.0 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
cassandra@cqlsh> list users;
 name      | super
-----------+-------
 cassandra |  True
(1 rows)
cassandra@cqlsh> CREATE USER root WITH PASSWORD 'supersecretpassword' SUPERUSER;
cassandra@cqlsh> list users;
 name      | super
-----------+-------
 cassandra |  True
      root |  True
(2 rows)
cassandra@cqlsh> ALTER USER cassandra WITH PASSWORD 'supersecretpassword';
cassandra@cqlsh> describe KEYSPACES ;
system_auth  system  system_distributed  system_traces  system_schema
cassandra@cqlsh> CREATE KEYSPACE IF NOT EXISTS test_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'iad3':2};
cassandra@cqlsh> use test_keyspace ;
cassandra@cqlsh:test_keyspace> CREATE TABLE test_table (id int PRIMARY KEY, name varchar, enabled boolean);
cassandra@cqlsh:test_keyspace> DESCRIBE table test_table;
CREATE TABLE test_keyspace.test_table (
    id int PRIMARY KEY,
    enabled boolean,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
cassandra@cqlsh:test_keyspace> INSERT INTO test_table (id, name, enabled) VALUES ( 1, 'Konstantin', true );
cassandra@cqlsh:test_keyspace> select * from test_table ;
 id | enabled | name
----+---------+------------
  1 |    True | Konstantin
(1 rows)
cassandra@cqlsh:test_keyspace> describe schema;
CREATE KEYSPACE test_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'iad3': '2'} AND durable_writes = true;
CREATE TABLE test_keyspace.test_table (
    id int PRIMARY KEY,
    enabled boolean,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
cassandra@cqlsh:test_keyspace> quit
[cassandra-node-1]$ ls -lah /var/lib/cassandra/data/test_keyspace/test_table-59271a3025d711e6935cd55c35b994bb/
total 12K
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 .
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 ..
drwxr-xr-x 2 cassandra cassandra 4.0K May 29 19:55 backups
[cassandra-node-1]$ nodetool flush
[cassandra-node-1]$ ls -lah /var/lib/cassandra/data/test_keyspace/test_table-59271a3025d711e6935cd55c35b994bb/
total 48K
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 20:07 .
drwxr-xr-x 3 cassandra cassandra 4.0K May 29 19:55 ..
drwxr-xr-x 2 cassandra cassandra 4.0K May 29 19:55 backups
-rw-r--r-- 1 cassandra cassandra   43 May 29 20:07 ma-1-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra   44 May 29 20:07 ma-1-big-Data.db
-rw-r--r-- 1 cassandra cassandra    9 May 29 20:07 ma-1-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra   16 May 29 20:07 ma-1-big-Filter.db
-rw-r--r-- 1 cassandra cassandra    8 May 29 20:07 ma-1-big-Index.db
-rw-r--r-- 1 cassandra cassandra 4.6K May 29 20:07 ma-1-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra   56 May 29 20:07 ma-1-big-Summary.db
-rw-r--r-- 1 cassandra cassandra   92 May 29 20:07 ma-1-big-TOC.txt
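The table directory is empty until `nodetool flush` writes the MemTable out as an SSTable, which appears as a set of component files sharing the `ma-1-big-` prefix. As a rough illustration, the sketch below splits such a name into its parts. The field names (`version`, `generation`, `sstable_format`, `component`) are my reading of the `<version>-<generation>-<format>-<component>` naming pattern, not something stated in this post, and `parse_component` is a hypothetical helper.

```python
from typing import NamedTuple, Optional


class SSTableComponent(NamedTuple):
    version: str         # on-disk format version, "ma" in the listing above
    generation: int      # counter that grows with each new SSTable of the table
    sstable_format: str  # "big" in the listing above
    component: str       # e.g. "Data.db", "Index.db", "TOC.txt"


def parse_component(filename: str) -> Optional[SSTableComponent]:
    """Split a name such as 'ma-1-big-Data.db'; return None for anything else."""
    parts = filename.split("-", 3)
    if len(parts) != 4 or not parts[1].isdigit():
        return None
    version, generation, sstable_format, component = parts
    return SSTableComponent(version, int(generation), sstable_format, component)


if __name__ == "__main__":
    for name in ["ma-1-big-Data.db", "ma-1-big-TOC.txt", "backups"]:
        print(name, "->", parse_component(name))
```

Grouping files by `generation` gives one group per SSTable, which is handy when checking that a copied snapshot contains every component named in that SSTable's `TOC.txt`.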
Resources:
[1]. http://cassandra.apache.org/
[2]. https://en.wikipedia.org/wiki/ACID
[3]. https://en.wikipedia.org/wiki/Eventual_consistency
[4]. https://en.wikipedia.org/wiki/CAP_theorem
[5]. https://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureIntro_c.html