The ceph-objectstore-tool

One of the key features of Ceph is its self-repairing and self-healing quality. Ceph achieves this by keeping multiple copies of placement groups across different OSDs, which makes the probability of losing data very low. In rare cases, however, multiple OSDs may fail; if one or more PG replicas are on the failed OSDs, the PG state becomes incomplete, which leads to errors in the cluster health. For granular recovery, Ceph provides a low-level PG and object data recovery tool known as ceph-objectstore-tool.

How to do it…

Using the ceph-objectstore-tool is a risky operation, and the commands need to be run either as root or with sudo. Do not attempt this on a production cluster without engaging Red Hat Ceph Storage Support, unless you are sure of what you are doing: it can cause irreversible data loss in your cluster.

  1. Find incomplete PGs on your Ceph cluster. Using this command, you can get the PG id and its acting set:
    # ceph health detail | grep incomplete
    
  2. Using the acting set, locate the OSD host:
    # ceph osd find <osd_number>
    
  3. Log in to the OSD node and stop the OSD that you intend to work on (a combined example of these steps follows):
    # service ceph stop <osd_ID>
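
As a combined sketch of these three steps, assume a hypothetical incomplete PG 1.23 whose acting set includes osd.1, hosted on a node named ceph-node1; the PG ID, OSD number, and node name are placeholders for whatever your cluster reports:

    # ceph health detail | grep incomplete
    # ceph osd find 1
    # ssh ceph-node1
    # service ceph stop osd.1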
    

The following steps describe the OSD and placement group functions that you can use with the ceph-objectstore-tool; a worked example with sample paths follows these steps:

  1. To identify the objects within an OSD, execute the following. The tool will output all objects, irrespective of their placement groups:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op list
    
  2. To identify the objects within a placement group, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pgid> --op list
    
  3. To list the placement groups stored on an OSD, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op list-pgs
    
  4. If you know the object ID that you are looking for, specify it to find the PG ID:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op list <object-id>
    
  5. Retrieve information about a particular placement group:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pg-id> --op info
    
  6. Retrieve a log of operations on a placement group:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pg-id> --op log
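
As a concrete illustration, assume a FileStore OSD whose data is mounted at the hypothetical path /var/lib/ceph/osd/ceph-1 with its journal at /var/lib/ceph/osd/ceph-1/journal, and a hypothetical PG 1.23 stored on it. Listing the OSD's PGs and inspecting that PG would then look like this:

    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --op list-pgs
    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op list
    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op info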
    

Removing a placement group is a risky operation and may cause data loss; use this feature with caution. If a corrupt placement group on an OSD prevents the OSD service from peering or starting, then before removing the placement group, ensure that you have a valid copy of it on another OSD. As a precaution, you can also back up the PG by exporting it to a file before removing it (a combined export-then-remove sequence is sketched after the following steps):

  1. To remove a placement group, execute the following command:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pg-id> --op remove
    
  2. To export a placement group to a file, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pg-id> --file /path/to/file --op export
    
  3. To import a placement group from a file, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --file </path/to/file> --op import
    
  4. An OSD may have objects marked as "lost." To list the "lost" or "unfound" objects, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op list-lost
    
  5. To find objects marked as lost for a single placement group, specify pgid:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pgid> --op list-lost
    
  6. The ceph-objectstore-tool can also clear the "lost" setting from objects that an OSD has marked as lost. To remove the "lost" setting for all lost objects on an OSD, execute the following:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op fix-lost
    
  7. To fix lost objects for a particular placement group, specify pgid:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --pgid <pg-id> --op fix-lost
    
  8. If you know the identity of the lost object you want to fix, specify the object ID:
    # ceph-objectstore-tool --data-path </path/to/osd> --journal-path </path/to/journal> --op fix-lost <object-id>
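
Putting the earlier precaution into practice, a typical sequence (reusing the hypothetical OSD path and PG 1.23 from the previous example, with a placeholder backup file) is to export the placement group to a file first and only then remove it:

    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op export --file /root/pg.1.23.backup
    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op remove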
    

How it works…

The syntax for the ceph-objectstore-tool is as follows: ceph-objectstore-tool <options>

The values for <options> can be as follows:

  • --data-path: The path to the OSD
  • --journal-path: The path to the journal
  • --op: The operation
  • --pgid: The Placement Group ID
  • --skip-journal-replay: Use this when the journal is corrupted (see the example after this list)
  • --skip-mount-omap: Use this when the leveldb data store is corrupted and cannot be mounted
  • --file: The path to the file, used with the import/export operation
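
For example, if the journal of the hypothetical OSD used above were corrupted, you could still export a placement group by skipping journal replay; the paths, PG ID, and output file are again placeholders:

    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op export --file /root/pg.1.23.export --skip-journal-replay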

To understand this tool better, let's take an example: a pool keeps two copies of each object, and the PGs are located on osd.1 and osd.2. If a failure happens, the following sequence may occur:

  1. osd.1 goes down.
  2. osd.2 handles all the write operations in a degraded state.
  3. osd.1 comes up and peers with osd.2 for data replication.
  4. Suddenly, osd.2 goes down before replicating all the objects to osd.1.
  5. At this point, you have data on osd.1, but it's stale.

After troubleshooting, you find that you can still read the osd.2 data from the filesystem, but its OSD service will not start. In such a situation, you should use the ceph-objectstore-tool to export and retrieve data from the failed OSD. The ceph-objectstore-tool gives you enough capability to examine, modify, and retrieve object data and metadata.
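
A minimal sketch of that recovery, assuming the failed osd.2 is still mountable at the hypothetical path /var/lib/ceph/osd/ceph-2 and holds the up-to-date copy of a hypothetical PG 1.23, would be to export the PG from osd.2, remove the stale copy from the stopped osd.1, and then import the exported file into osd.1 before starting it again:

    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --journal-path /var/lib/ceph/osd/ceph-2/journal --pgid 1.23 --op export --file /root/pg.1.23.export
    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --pgid 1.23 --op remove
    # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --journal-path /var/lib/ceph/osd/ceph-1/journal --op import --file /root/pg.1.23.export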

Note

You should avoid using Linux tools such as cp and rsync for recovering data from a failed OSD, as these tools do not take all the necessary metadata into account, and the recovered object might be unusable.

Finally, we have reached the end of this chapter, and of the book as well. I hope your journey with the Ceph Cookbook has been informative. You should have learned several concepts around Ceph that will give you enough confidence to operate a Ceph cluster in your environment. Congratulations! You have now attained the next level in Ceph.

Keep Learning, Keep Exploring, Keep Sharing…

Cheers!
