Understanding the CRUSH mechanism

Ceph stores and manages data with the CRUSH (Controlled Replication Under Scalable Hashing) algorithm, its intelligent data distribution mechanism. As we discussed in the last recipe, traditional storage systems use a central metadata/index table to know where the user's data is stored. Ceph, on the other hand, uses CRUSH to deterministically compute where data should be written to or read from. Instead of storing metadata, CRUSH computes it on demand, which removes the need for a centralized server, gateway, or broker. Ceph clients perform this computation themselves, a step known as a CRUSH lookup, and then communicate with the OSDs directly.

For a read or write operation on a Ceph cluster, clients first contact a Ceph monitor and retrieve a copy of the cluster map, which comprises five maps: the monitor, OSD, MDS, CRUSH, and PG maps; we will cover these maps later in this chapter. These maps tell clients the state and configuration of the Ceph cluster. Next, the data is converted to objects, each identified by an object name and a pool name/ID. The object name is then hashed, and the hash is reduced by the number of PGs (in effect, a modulo operation) to yield a PG within the required Ceph pool. This calculated PG then goes through a CRUSH lookup to determine the primary, secondary, and tertiary OSD locations used to store or retrieve the data.
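The two-step placement described above (object to PG, then PG to OSDs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ceph's implementation: `stable_hash`, `object_to_pg`, `pg_to_osds`, and `locate` are hypothetical helper names, SHA-256 stands in for Ceph's own hash function, and a simple rendezvous-hash ranking stands in for the real CRUSH lookup, which walks a hierarchy of buckets according to placement rules. What the sketch does preserve is the key property: any client holding the same inputs computes the same answer without asking a central server.

```python
import hashlib
from typing import List, Tuple


def stable_hash(*parts: str) -> int:
    """Deterministic 64-bit hash of the given strings.

    Assumption: SHA-256 is used here only as a stand-in for Ceph's own hash.
    """
    digest = hashlib.sha256("/".join(parts).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name: str, pool_id: int, pg_num: int) -> str:
    """Step 1: hash the object name and reduce it to one of pg_num PGs.

    The result is written as '<pool id>.<pg in hex>'.
    """
    pg = stable_hash(object_name) % pg_num
    return f"{pool_id}.{pg:x}"


def pg_to_osds(pg_id: str, osd_ids: List[int], replicas: int = 3) -> List[int]:
    """Step 2: toy stand-in for a CRUSH lookup (NOT the real algorithm).

    Every OSD gets a score derived from (PG, OSD); the highest-scoring
    OSDs win. The first entry plays the role of the primary OSD.
    """
    ranked = sorted(
        osd_ids,
        key=lambda osd: stable_hash(pg_id, str(osd)),
        reverse=True,
    )
    return ranked[:replicas]


def locate(
    object_name: str, pool_id: int, pg_num: int, osd_ids: List[int], replicas: int = 3
) -> Tuple[str, List[int]]:
    """Compute the PG and the ordered OSD list for an object, client side."""
    pg_id = object_to_pg(object_name, pool_id, pg_num)
    return pg_id, pg_to_osds(pg_id, osd_ids, replicas)


if __name__ == "__main__":
    osds = list(range(9))  # hypothetical cluster with OSDs 0..8
    pg_id, acting = locate("my-object", pool_id=1, pg_num=128, osd_ids=osds)
    print("PG:", pg_id, "OSDs (primary first):", acting)
```

Running `locate` twice with the same arguments always returns the same PG and OSD list, which is what lets clients skip a metadata lookup. The object name, pool ID, PG count, and OSD list in the usage example are made-up values.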

Once the client knows the exact OSD IDs, it contacts the OSDs directly and stores the data. All of these computations are performed by the client; hence, they do not affect the cluster's performance. The following diagram illustrates the entire process:

[Diagram: Understanding the CRUSH mechanism]