Different pools on different OSDs

Ceph runs seamlessly on heterogeneous commodity hardware. You can put your existing hardware to use for Ceph and build a storage cluster out of different hardware types. As a Ceph storage administrator, your use case might require creating multiple Ceph pools on different types of drives. The most common use case is to provide a fast storage pool backed by SSDs, from which you can get high performance out of your storage cluster, while data that does not require high I/O is stored on pools backed by slower magnetic drives.

Our next hands-on demonstration focuses on creating two Ceph pools: an ssd pool backed by faster SSD disks and a sata pool backed by slower SATA disks. To achieve this, we will edit the CRUSH map and make the necessary configuration changes.

The sandbox Ceph cluster that we deployed in the earlier chapters is hosted on virtual machines and is not backed by real SSD disks, so we will treat a few of its disks as SSDs for learning purposes. If you perform this exercise on a Ceph cluster backed by real SSDs, the steps remain exactly the same.

In the following demonstration, we assume that ceph-node1 is our SSD node hosting three SSDs, while ceph-node2 and ceph-node3 host SATA disks. We will modify the default CRUSH map and create two pools, namely ssd and sata. The ssd pool's primary copy will be hosted on ceph-node1, while its secondary and tertiary copies will be on the other nodes. Similarly, the sata pool's primary copy will be on either ceph-node2 or ceph-node3, as we have two nodes backing the sata pool. At any step of this demonstration, you can refer to the updated CRUSH map file provided with this book on the Packt Publishing website.

Extract the CRUSH map from any of the monitor nodes and decompile it:

# ceph osd getcrushmap -o crushmap-extract
# crushtool -d crushmap-extract -o crushmap-decompiled

Use your favorite editor to edit the default CRUSH map:

# vi crushmap-decompiled

Replace the root default bucket with root ssd and root sata buckets. Here, the root ssd bucket contains one item, ceph-node1, while the root sata bucket has two hosts, ceph-node2 and ceph-node3, defined as items; a sketch of these bucket definitions follows.

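The following is a minimal sketch of the edited buckets; the bucket IDs and item weights shown here are assumptions, so reuse the IDs and weights that already appear for the host buckets in your decompiled map:

root ssd {
        id -5                           # assumed bucket ID; keep it unique and negative
        alg straw
        hash 0                          # rjenkins1
        item ceph-node1 weight 0.030    # assumed weight; reuse your host's existing weight
}

root sata {
        id -6                           # assumed bucket ID
        alg straw
        hash 0                          # rjenkins1
        item ceph-node2 weight 0.030
        item ceph-node3 weight 0.030
}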

Adjust the existing rules to work with the new buckets. For this, change step take default to step take sata in the data, metadata, and rbd rules. This instructs these rules to use the sata root bucket instead of the root default bucket, which we removed in the previous step; the modified rbd rule is sketched after this paragraph.
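For reference, this is a minimal sketch of what the rbd rule looks like after the change; the ruleset number and size limits are the stock values from the default map and should be left as they already appear in yours:

rule rbd {
        ruleset 2                       # unchanged default ruleset number
        type replicated
        min_size 1
        max_size 10
        step take sata                  # previously: step take default
        step chooseleaf firstn 0 type host
        step emit
}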

Finally, add new rules for the ssd and sata pools, similar to the following example:

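This is a minimal sketch of the two new rules, written to match the placement described earlier (the ssd pool keeps its primary copy on the SSD host and its remaining copies on the SATA hosts, and the sata pool does the reverse). The rule names are illustrative, and the ruleset numbers 3 and 4 are assumptions chosen to match the crush_ruleset values we assign later in this exercise:

rule ssd-pool {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step chooseleaf firstn 1 type host      # one copy (the primary) from the ssd root
        step emit
        step take sata
        step chooseleaf firstn -1 type host     # the remaining copies from the sata root
        step emit
}

rule sata-pool {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take sata
        step chooseleaf firstn -1 type host     # the first copies, including the primary, from the sata root
        step emit
        step take ssd
        step chooseleaf firstn 1 type host      # the last copy from the ssd root
        step emit
}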

Once the changes are done, recompile the CRUSH map and inject it back into the Ceph cluster:

# crushtool -c crushmap-decompiled -o crushmap-compiled
# ceph osd setcrushmap -i crushmap-compiled

As soon as you inject the new CRUSH map into the Ceph cluster, it will undergo data reshuffling and recovery, but it should attain the HEALTH_OK status soon. Check the status of your cluster as follows:

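The exact output differs from cluster to cluster; run the standard status command and wait until it reports HEALTH_OK:

# ceph -s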

Once your cluster is healthy, create two pools for ssd and sata:

# ceph osd pool create sata 64 64
# ceph osd pool create ssd 64 64
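If you want to verify that both pools were created, you can list the pools; the new sata and ssd pools should appear alongside the default ones:

# ceph osd lspools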

Assign crush_ruleset values to the sata and ssd pools so that they use the corresponding rules defined in the CRUSH map:

# ceph osd pool set sata crush_ruleset 3
# ceph osd pool set ssd crush_ruleset 4
# ceph osd dump | egrep -i "ssd|sata"

To test these newly created pools, we will put some data in them and verify which OSDs the data gets stored on. Create some data files:

# dd if=/dev/zero of=sata.pool bs=1M count=32 conv=fsync
# dd if=/dev/zero of=ssd.pool bs=1M count=32 conv=fsync

Put these files into Ceph storage in their respective pools:

# rados -p ssd put ssd.pool.object ssd.pool
# rados -p sata put sata.pool.object sata.pool

Finally, check the OSD map for these pool objects:

# ceph osd map ssd ssd.pool.object
# ceph osd map sata sata.pool.object

Let's diagnose the output of the preceding commands. The first output, for the ssd pool, shows that the object's primary copy is located on osd.2, while the other copies are located on osd.5 and osd.6. This is because of the way we configured our CRUSH map: we defined the ssd pool to use ceph-node1, which contains osd.0, osd.1, and osd.2.
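If you want to confirm which OSDs belong to which host in your own cluster, the standard tree listing shows the CRUSH hierarchy along with each host's OSDs:

# ceph osd tree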

This is just a basic demonstration of custom CRUSH maps. You can do a lot more with CRUSH; it offers many possibilities for managing the data of your Ceph cluster effectively and efficiently.
