Recovering from a complete monitor failure

In the unlikely event that you lose all of your monitors, all is not lost. You can rebuild the monitor database from the contents of the OSDs by using the ceph-objectstore-tool utility.

To set the scene, we will assume that an event has occurred that corrupted all three monitors, effectively leaving the Ceph cluster inaccessible. To recover the cluster, we will shut down two of the monitors and leave a single failed monitor running. We will then rebuild the monitor database, overwrite the corrupted copy, and restart the monitor to bring the Ceph cluster back online.

The objectstore tool needs access to every OSD in the cluster to rebuild the monitor database; in this example, we will use a script that connects via ssh to access the OSD data. As the OSD data is not readable by every user, we will log in to the OSD hosts as root. By default, most Linux distributions will not allow remote, password-based root logins, so ensure you have copied your public ssh key to the root user on every OSD node (for example, with ssh-copy-id root@osd1).

The following script will connect to each of the OSD nodes specified in the hosts variable and extract the data required to rebuild the monitor database:

#!/bin/bash
hosts="osd1 osd2 osd3"
ms=/tmp/mon-store/
mkdir $ms
# Collect the cluster map from the OSDs on each host in turn
for host in $hosts; do
  echo $host
  # Push the store accumulated so far to the next host
  rsync -avz $ms root@$host:$ms
  rm -rf $ms
  # Note the escaped \$osd: it must be expanded by the remote
  # shell, not locally, because the heredoc is unquoted
  ssh root@$host <<EOF
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path \$osd --op update-mon-db --mon-store-path $ms
done
EOF
  # Pull the updated store back before moving to the next host
  rsync -avz root@$host:$ms $ms
done
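One subtle point in the script above is the unquoted heredoc: a variable such as $ms is expanded locally before the commands are sent to the remote host, whereas a variable meant for the remote shell (such as the loop's $osd) must be escaped as \$osd or it will be expanded, usually to an empty string, on the local side. A minimal local sketch of the difference, with no Ceph involved:

```shell
#!/bin/bash
# Unquoted heredocs expand $vars locally; escaped \$vars survive
# as literal text, to be expanded by whichever shell finally runs it.
local_path=/tmp/mon-store/
script=$(cat <<EOF
echo $local_path
echo \$osd
EOF
)
echo "$script"
# prints:
#   echo /tmp/mon-store/
#   echo $osd
```

The first line was substituted locally; the second keeps $osd intact for the remote shell.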

This will generate the rebuilt monitor database files in the /tmp/mon-store directory.

We also need to create a new keyring with the appropriate capabilities:

sudo ceph-authtool /etc/ceph/ceph.client.admin.keyring --create-keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'

sudo ceph-authtool /etc/ceph/ceph.client.admin.keyring --gen-key -n mon. --cap mon 'allow *'
sudo cat /etc/ceph/ceph.client.admin.keyring

This keyring is then used to rebuild the monitor database from the collected data:

sudo ceph-monstore-tool /tmp/mon-store rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring

Now that the monitor database is rebuilt, we can copy it to the monitor directory, but before we do so, let's take a backup of the existing corrupted database:

sudo mv /var/lib/ceph/mon/ceph-mon1/store.db /var/lib/ceph/mon/ceph-mon1/store.bak

Now, copy the rebuilt version:

sudo mv /tmp/mon-store/store.db /var/lib/ceph/mon/ceph-mon1/store.db
sudo chown -R ceph:ceph /var/lib/ceph/mon/ceph-mon1
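The two mv commands above follow a simple backup-then-replace pattern: keep the corrupted store under a new name, then drop the rebuilt one into its place. As a sanity check of the pattern itself, here is a dry run against throwaway directories (mon and src are stand-ins for the real monitor and /tmp/mon-store paths):

```shell
#!/bin/bash
# Dry run of the backup-then-replace pattern on scratch directories;
# mon stands in for /var/lib/ceph/mon/ceph-mon1, src for /tmp/mon-store.
set -e
mon=$(mktemp -d)
src=$(mktemp -d)
mkdir "$mon/store.db" "$src/store.db"
echo corrupt > "$mon/store.db/marker"
echo rebuilt > "$src/store.db/marker"
mv "$mon/store.db" "$mon/store.bak"   # keep the corrupted copy for safety
mv "$src/store.db" "$mon/store.db"    # drop in the rebuilt database
cat "$mon/store.db/marker"            # prints "rebuilt"
```

Keeping the store.bak copy means the swap can be reverted if the rebuilt database turns out to be unusable.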

If you try to start the monitor now, it will get stuck in a probing state as it tries to probe for the other monitors. This is Ceph trying to avoid a split-brain scenario; however, in this case, we want to force it to form a quorum on its own and come fully online. To do this, we need to extract the monmap, remove the other monitors from it, and then inject it back into the monitor's database:

sudo ceph-mon -i mon1 --extract-monmap /tmp/monmap

Check the contents of the monmap:

sudo monmaptool /tmp/monmap --print

You will see that there are three mons present, so let's remove two of them:

sudo monmaptool /tmp/monmap --rm noname-b
sudo monmaptool /tmp/monmap --rm noname-c

Now, check again to make sure they are completely gone:

sudo monmaptool /tmp/monmap --print

Finally, inject the modified monmap back into the monitor:

sudo ceph-mon -i mon1 --inject-monmap /tmp/monmap

Start the monitor service (for example, systemctl start ceph-mon@mon1) and restart all of your OSDs so that they rejoin the cluster; you will then be able to successfully query the cluster status and see that your data is still there.

This concludes the section of this chapter on how to recover from a complete monitor failure.
