CRUSH map internals

To see what's inside a CRUSH map, we need to extract it from the cluster and decompile it into a human-readable, editable form. All the necessary changes can be made at this stage; for them to take effect, we then compile the modified map and inject it back into the Ceph cluster. CRUSH map changes are applied dynamically, that is, once the new CRUSH map is injected into the Ceph cluster, it takes effect immediately on the fly. We will now take a look at the CRUSH map of the Ceph cluster that we deployed in this book.

Extract the CRUSH map from any of the monitor nodes:

# ceph osd getcrushmap -o crushmap_compiled_file

Once you have the CRUSH map, decompile it to make it human readable and editable:

# crushtool -d crushmap_compiled_file -o crushmap_decompiled_file

At this point, the output file, crushmap_decompiled_file, can be viewed/edited in your favorite editor.

In the next section, we will learn how to perform changes to a CRUSH map.

Once your changes are done, compile the modified file with the -c option:

# crushtool -c crushmap_decompiled_file -o newcrushmap
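
Before injecting the new map, you can optionally test it offline with crushtool's test mode to check how placements would be computed. The rule number and replica count used below (rule 0, two replicas) are only example values and should be adjusted to match your pools:

# crushtool -i newcrushmap --test --show-statistics --rule 0 --num-rep 2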

Finally, inject the newly compiled CRUSH map into the Ceph cluster with the -i option:

# ceph osd setcrushmap -i newcrushmap
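
To confirm that the cluster has picked up the new map, you can inspect the resulting CRUSH hierarchy, for example with the following commands:

# ceph osd tree
# ceph osd crush dump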

A CRUSH map file contains four sections, which are as follows (an illustrative example of a complete decompiled map appears after this list):

  • CRUSH map devices: The devices section contains a list of all the OSDs present in a Ceph cluster. Whenever an OSD is added to or removed from a Ceph cluster, the CRUSH map's devices section is updated automatically. Usually, you do not need to change this section; Ceph takes care of it. However, if you need to add a new device manually, add a new line at the end of the devices section with a unique device number followed by the OSD name. The following screenshot shows the devices section of the CRUSH map from our sandbox Ceph cluster:
    [Screenshot: devices section of the sandbox cluster's CRUSH map]
  • CRUSH map bucket types: This section defines the types of buckets that can be used in a CRUSH map. The default CRUSH map contains several bucket types, which are usually enough for most Ceph clusters. However, based on your requirements, you can add or remove bucket types from this section. To add a bucket type, add a new line to the bucket types section of the CRUSH map file with the keyword type, the next available numeric ID, and the bucket type name. The default bucket type list from our sandbox Ceph cluster looks like the following:
    [Screenshot: default bucket types section of the sandbox cluster's CRUSH map]
  • CRUSH map bucket definition: Once the bucket types are declared, the buckets themselves are defined for hosts and other failure domains. In this section, you can make hierarchical changes to your Ceph cluster architecture, for example, defining hosts, rows, racks, chassis, rooms, and datacenters. You can also define which algorithm a bucket should use. A bucket definition contains several parameters; you can use the following syntax to create one:
    [bucket-type] [bucket-name] {
            id [a unique negative numeric ID]
            weight [the relative capacity/capability of the item(s)]
            alg [the bucket algorithm: uniform | list | tree | straw ]
            hash [the hash type: 0 by default]
            item [item-name] weight [weight]
    }

    The following is the bucket definition section from our sandbox Ceph cluster:

    [Screenshot: bucket definition section of the sandbox cluster's CRUSH map]
  • CRUSH map rules: This section defines how Ceph selects an appropriate bucket for data placement in pools. In a larger Ceph cluster, there might be multiple pools, and each pool has its own CRUSH ruleset. CRUSH map rules take several parameters; you can use the following syntax to create a CRUSH ruleset:
    rule <rulename> {
            ruleset <ruleset>
            type [ replicated | raid4 ]
            min_size <min-size>
            max_size <max-size>
            step take <bucket-name>
            step [choose|chooseleaf] [firstn|indep] <N> <bucket-type>
            step emit
    }
    [Screenshot: rules section of the sandbox cluster's CRUSH map]
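
To put these sections together, the following is an illustrative sketch of a small decompiled CRUSH map with two OSD hosts. The host names (ceph-node1, ceph-node2), IDs, weights, and the exact list of bucket types here are made up for illustration and vary by Ceph release and cluster, so expect your decompiled file to differ in its details:

    # devices
    device 0 osd.0
    device 1 osd.1
    device 2 osd.2
    device 3 osd.3

    # types
    type 0 osd
    type 1 host
    type 2 rack
    type 3 row
    type 4 room
    type 5 datacenter
    type 6 root

    # buckets
    host ceph-node1 {
            id -2
            alg straw
            hash 0  # rjenkins1
            item osd.0 weight 0.010
            item osd.1 weight 0.010
    }
    host ceph-node2 {
            id -3
            alg straw
            hash 0  # rjenkins1
            item osd.2 weight 0.010
            item osd.3 weight 0.010
    }
    root default {
            id -1
            alg straw
            hash 0  # rjenkins1
            item ceph-node1 weight 0.020
            item ceph-node2 weight 0.020
    }

    # rules
    rule replicated_ruleset {
            ruleset 0
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type host
            step emit
    }

Editing any of these sections follows the same cycle described earlier: extract and decompile the map, make your changes, then compile and inject it back into the cluster.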