Replacing a node

Once in a while, a dead node needs to be replaced; that is, you want an exact replacement rather than just a removal. Here are the steps:

  1. Note down the dead node's token.
  2. Set the new node's initial_token to the dead node's token minus one (see the sketch after these steps). The new node is going to own the dead node's data, so you must make sure that its data directories are empty to avoid any conflict.
  3. Configure cassandra.yaml appropriately, similar to the way we did when adding a new node. (Refer to the Adding nodes to a cluster section in this chapter.)
  4. Let the bootstrap complete and see the node appear in the nodetool ring listing.
  5. Perform a nodetool repair for each keyspace to ensure data integrity.
  6. Perform nodetool removetoken for the old node.
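For step 2, the minus-one arithmetic on a 128-bit token is easiest with a quick one-liner. A minimal sketch, assuming Python is available on the machine (the token shown is the dead node's token from the example that follows):

# Compute the replacement node's initial_token (dead node's token minus one)
$ python -c 'print(113427455640312821154458202477256070484 - 1)'
113427455640312821154458202477256070483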

Let us see this in action. The example cluster here has three nodes and a replication factor of 2. One of the nodes is down. We will replace it with a new node.

Here is what the current ring looks like:

$ bin/nodetool -h 10.99.9.67 ring 

Address          Status   EO*       Token 
                                    113427455640312821154458202477256070484 
10.99.9.67       Up       66.67%    0
10.147.171.159   Up       66.67%    56713727820156410577229101238628035242 
10.114.189.54    Down     66.67%    113427455640312821154458202477256070484

# * EO stands for Effective-ownership

As you can see, we need to replace the third node, 10.114.189.54. We fired up a new machine, installed Cassandra, altered cassandra.yaml to match the cluster specifications, and set up the listen address and data directory. We also made sure that the data directories (commit log, saved caches, and data) are blank. Since this node is going to replace a node with token 113427455640312821154458202477256070484, we are setting the new node's initial_token as 113427455640312821154458202477256070483. By default auto_bootstrap is true, which is good.
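To blank the directories, something like the following can be used. This is a sketch; the /mnt/cassandra-data prefix matches the data path seen in this node's logs, but the commitlog and saved_caches locations are assumptions, so adjust all three to your cassandra.yaml settings:

# Wipe data, commit log, and saved caches on the replacement node (paths assumed)
$ rm -rf /mnt/cassandra-data/data/* /mnt/cassandra-data/commitlog/* /mnt/cassandra-data/saved_caches/*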

# cassandra.yaml on replacement node
initial_token: 113427455640312821154458202477256070483
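Beyond initial_token, here is a minimal sketch of the other settings mentioned above (listen_address is the replacement node's own IP from this example; the cluster_name and seed values are illustrative and must match your cluster):

# cassandra.yaml on replacement node (continued; values illustrative)
cluster_name: 'Test Cluster'
listen_address: 10.166.54.134
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.99.9.67"
# auto_bootstrap defaults to true, so it need not be set explicitly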

We start the Cassandra service on the node. Looking at the logs, the node seems to have got all the information it needed:

# Cassandra log, joining the cluster with the dead node
…
 INFO 08:30:34,841 JOINING: schema complete 
 INFO 08:30:34,842 JOINING: waiting for pending range calculation 
 INFO 08:30:34,842 JOINING: calculation complete, ready to bootstrap 
 INFO 08:30:34,843 JOINING: getting bootstrap token 
 INFO 08:30:34,849 Enqueuing flush of Memtable-LocationInfo@2039197494(36/45 serialized/live bytes, 1 ops) 
 INFO 08:30:34,849 Writing Memtable-LocationInfo@2039197494(36/45 serialized/live bytes, 1 ops)
 INFO 08:30:34,873 Completed flushing /mnt/cassandra-data/data/system/LocationInfo/system-LocationInfo-hf-6-Data.db (87 bytes) for commitlog position ReplayPosition(segmentId=1371630629459, position=30209) 
 INFO 08:30:34,875 JOINING: sleeping 30000 ms for pending range setup 
 INFO 08:30:43,856 InetAddress /10.114.189.54 is now dead. 
 INFO 08:31:04,875 JOINING: Starting to bootstrap...
 INFO 08:31:05,685 Finished streaming session 4 from /10.147.171.159 
 INFO 08:31:09,108 Finished streaming session 3 from /10.99.9.67 
 INFO 08:31:10,613 Finished streaming session 1 from /10.99.9.67 
 INFO 08:31:10,622 Finished streaming session 2 from /10.147.171.159 
…

Even though the new node has joined, the nodetool ring output still does not look right:

$ bin/nodetool -h 10.99.9.67 ring  

Address     Status  EO*     Token 
                                113427455640312821154458202477256070484 
10.99.9.67      Up      33.33%  0 
10.147.171.159  Up      66.67%  56713727820156410577229101238628035242 
10.166.54.134   Up      66.67%  113427455640312821154458202477256070483 
10.114.189.54   Down    33.33%  113427455640312821154458202477256070484 

# * EO stands for Effective-ownership

This means we need to remove the dead node from the cluster. But before we go ahead and remove the node, let's just repair the keyspaces to make sure that the nodes are consistent.

$ bin/nodetool -h 10.99.9.67 repair Keyspace1

[2013-06-19 08:40:55,336] Starting repair command #1, repairing 2 ranges for keyspace Keyspace1
[2013-06-19 08:41:01,297] Repair session f4e1f830-d8bb-11e2-0000-23f6cbfa94fd for range (113427455640312821154458202477256070484,0] finished 
[2013-06-19 08:41:01,298] Repair session f86bbb30-d8bb-11e2-0000-23f6cbfa94fd for range (113427455640312821154458202477256070483,113427455640312821154458202477256070484] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/10.114.189.54) is dead: session failed 
[2013-06-19 08:41:01,298] Repair command #1 finished 

$ bin/nodetool -h 10.99.9.67 repair mytest

[2013-06-19 08:41:25,867] Starting repair command #2, repairing 2 ranges for keyspace mytest
[2013-06-19 08:41:26,377] Repair session 0712f3b0-d8bc-11e2-0000-23f6cbfa94fd for range (113427455640312821154458202477256070484,0] finished 
[2013-06-19 08:41:26,377] Repair session 075ea2b0-d8bc-11e2-0000-23f6cbfa94fd for range (113427455640312821154458202477256070483,113427455640312821154458202477256070484] failed with error java.io.IOException: Cannot proceed on repair because a neighbor (/10.114.189.54) is dead: session failed 
[2013-06-19 08:41:26,377] Repair command #2 finished

Notice that the repair session for the range replicated on the dead node fails because the dead node is still treated as a neighbor; this is expected and goes away once the node is removed. So now, let's remove the dead node. We will use the same nodetool removetoken technique as we did in the Removing a dead node section in this chapter. The cluster looks good after removal.

# Remove dead node
$ bin/nodetool -h 10.99.9.67 removetoken 113427455640312821154458202477256070484 

# Ring status
$ bin/nodetool -h 10.99.9.67 ring 

Address      Status  EO*      Token  
                                 113427455640312821154458202477256070483  
10.99.9.67      Up      66.67%   0  
10.147.171.159  Up      66.67%   56713727820156410577229101238628035242  
10.166.54.134   Up      66.67%   113427455640312821154458202477256070483 

# * EO stands for Effective-ownership

If, for some reason, you are unable to perform the replacement using the previous method, there is an alternative approach. Here are the steps to replace a node with a new one:

  1. Set the new (replacement) node's IP address to the same as that of the node that died.
  2. Configure cassandra.yaml appropriately.
  3. Set initial_token to the same token that was assigned to the dead node, and start the substitute node.

The cluster will assume that the dead node has come back alive. Run nodetool repair on all the keyspaces; this will stream the data to the new node, as sketched below.
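A minimal sketch of this alternative, reusing the token and keyspaces from our example (the substitute node is brought up with the dead node's IP, 10.114.189.54):

# cassandra.yaml on the substitute node (same IP as the dead node)
initial_token: 113427455640312821154458202477256070484

# Once the node is up, repair each keyspace to stream the data in
$ bin/nodetool -h 10.114.189.54 repair Keyspace1
$ bin/nodetool -h 10.114.189.54 repair mytest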

In Cassandra 1.0 and onwards, a dead node can be replaced with a new node using the cassandra.replace_token=<Token> property. Set this property with the -D option while starting Cassandra. Make sure that the data directories are empty on the new node, and run a nodetool repair after the node is up.
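A sketch of this option, using the dead node's token from our example (the repair host placeholder stands for whatever IP the new node gets):

# Start the new, empty node with the dead node's token
$ bin/cassandra -Dcassandra.replace_token=113427455640312821154458202477256070484

# After the node is up, repair the keyspaces
$ bin/nodetool -h <new_node_ip> repair Keyspace1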

The two approaches (inserting a node with a new token versus replacing the node with the same token) fix the problem from different perspectives. The former inserts a node between the dead node and its predecessor, leaving the dead node with just one token; that token gets assigned to the next node in the ring when we remove the dead node. The latter, however, is like saying the dead node came back to life but lost its memory, so the replica nodes fill it in. Neither method is specifically preferred over the other; choose whichever is more convenient for you.
