Ceph cache tiering

Like erasure coding, cache tiering was introduced in the Ceph Firefly release, and it has been one of the most talked-about features of that release. Cache tiering creates a Ceph pool constructed on top of faster disks, typically SSDs. This cache pool is placed in front of a regular replicated or erasure-coded pool so that all client I/O operations are handled by the cache pool first; the data is later flushed to the existing data pool.

Clients enjoy high performance from the cache pool, while their data is transparently written to the regular pool.


Generally, a cache tier is constructed on top of expensive, faster SSDs, so it provides clients with better I/O performance. The cache pool is backed by a storage tier made up of HDDs, of the replicated or erasure-coded type. In this type of setup, clients submit I/O requests to the cache pool and get instant responses, whether it's a read or a write; the faster cache tier serves the client request. After a while, the cache tier flushes its data to the backing storage tier so that it can cache new requests from clients. All data migration between the cache and storage tiers happens automatically and is transparent to clients. Cache tiering can be configured in two modes.

The writeback mode

When Ceph cache tiering is configured in writeback mode, a Ceph client writes data to the cache tier pool, that is, to the faster pool, and hence receives an acknowledgement instantly. Based on the flushing/evicting policy that you have set for your cache tier, the cache-tiering agent migrates data from the cache tier to the storage tier and eventually removes it from the cache tier. During a client read operation, the cache-tiering agent first migrates the data from the storage tier to the cache tier, and it is then served to the client. The data remains in the cache tier until it becomes inactive or cold.

The read-only mode

When Ceph cache tiering is configured in read-only mode, it works only for a client's read operations. Client write operations do not involve cache tiering; all client writes go directly to the storage tier. During client read operations, the cache-tiering agent copies the requested data from the storage tier to the cache tier. Based on the policy that you have configured for the cache tier, stale objects are removed from it. This approach is ideal when multiple clients need to read large amounts of similar data.
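
As a quick illustration, and assuming the pool names used later in this chapter (EC-pool as the storage tier and cache-pool as the cache tier), a read-only tier would be wired up roughly as follows. Unlike writeback mode, setting an overlay is typically not required, and depending on your Ceph release, the cache-mode command may additionally require the --yes-i-really-mean-it flag, since read-only mode is only safe for data that does not change:

    # ceph osd tier add EC-pool cache-pool
    # ceph osd tier cache-mode cache-pool readonly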

Implementing cache tiering

A cache tier is implemented on faster physical disks, generally SSDs, which provide a fast cache layer on top of slower regular pools made up of HDDs. In this section, we will create two separate pools, a cache pool and a regular pool, which will be used as the cache tier and the storage tier, respectively:


Creating a pool

In Chapter 7, Ceph Operations and Maintenance, we discussed the process of creating Ceph pools on top of specific OSDs by modifying a CRUSH map. Similarly, we will create a cache pool based on osd.0, osd.3, and osd.6. Since we do not have real SSDs in this setup, we will treat these OSDs as SSDs and create a cache pool on top of them. The following are the instructions to create a cache pool on osd.0, osd.3, and osd.6:

  1. Get the current CRUSH map and decompile it:
    # ceph osd getcrushmap -o crushmapdump
    # crushtool -d crushmapdump -o crushmapdump-decompiled
    
  2. Edit the decompiled CRUSH map file and add the following section after the root default section:
    # vim crushmapdump-decompiled
    root cache {
                id -5
                alg straw
                hash 0
                item osd.0 weight 0.010
                item osd.3 weight 0.010
                item osd.6 weight 0.010
    }
    

    Tip

    You should change the CRUSH map layout based on your environment; a host-based variant is sketched at the end of this procedure.

  3. Create the CRUSH rule by adding the following section under the rules section, generally at the end of the file. Finally, save and exit the CRUSH map file:
    rule cache-pool {
                     # ruleset 4 is the ID we will later assign to the pool via crush_ruleset
                     ruleset 4
                     type replicated
                     min_size 1
                     max_size 10
                     # start placement from the root cache bucket defined in the previous step
                     step take cache
                     # choose individual OSDs (type osd), since all three OSDs may share a host
                     step chooseleaf firstn 0 type osd
                     step emit
    }
    
  4. Compile and inject the new CRUSH map into the Ceph cluster:
    # crushtool -c crushmapdump-decompiled -o crushmapdump-compiled
    # ceph osd setcrushmap -i crushmapdump-compiled
    
  5. Once the new CRUSH map has been applied to the Ceph cluster, check the OSD status to view the new OSD arrangement. You will find a new bucket, root cache:
    # ceph osd tree
    
  6. Create a new pool and set crush_ruleset to 4 so that the new pool gets created on the SSD disks:
    # ceph osd pool create cache-pool 32 32
    # ceph osd pool set cache-pool crush_ruleset 4
    

    Tip

    We do not have real SSDs; we are assuming osd.0, osd.3, and osd.6 as SSDs for this demonstration.

  7. Make sure your pool has been created correctly; that is, it should always store all its objects on osd.0, osd.3, and osd.6:
    • List the contents of the cache-pool; since it's a new pool, it should not contain anything:
      # rados -p cache-pool ls
      
    • Add a temporary object to the cache-pool to make sure that it's storing the object on the correct OSD:
      # rados -p cache-pool put object1 /etc/hosts
      
    • List the contents of the cache-pool:
      # rados -p cache-pool ls
      
    • Check the OSD map for cache-pool and object1. If you configured the CRUSH map correctly, object1 should get stored on osd.0, osd.3, and osd.6 as its replica size is 3:
      # ceph osd map cache-pool object1
      
    • Remove the object:
      # rados -p cache-pool rm object1
      
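As noted in the tip under step 2, the CRUSH layout should reflect your environment. For example, if the three SSD OSDs lived on three different hosts, the cache hierarchy might be sketched as follows; the host bucket names and the IDs -6, -7, and -8 are hypothetical and must not clash with IDs already present in your map, and the rule from step 3 would then use step chooseleaf firstn 0 type host instead of type osd:

    host node1-ssd {
                id -6
                alg straw
                hash 0
                item osd.0 weight 0.010
    }
    host node2-ssd {
                id -7
                alg straw
                hash 0
                item osd.3 weight 0.010
    }
    host node3-ssd {
                id -8
                alg straw
                hash 0
                item osd.6 weight 0.010
    }
    root cache {
                id -5
                alg straw
                hash 0
                item node1-ssd weight 0.010
                item node2-ssd weight 0.010
                item node3-ssd weight 0.010
    }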

Creating a cache tier

In the previous section, we created a pool based on SSDs; we will now use this pool as a cache tier for an erasure-coded pool named EC-pool that we created earlier in this chapter.

The following instructions will guide you through creating a cache tier in writeback mode and setting up the overlay for the EC-pool:

  1. Set up the cache tier by associating the storage pool with the cache pool. The syntax for this command is ceph osd tier add <storage_pool> <cache_pool>:
    # ceph osd tier add EC-pool cache-pool
    
  2. Set the cache mode to either writeback or read-only. In this demonstration, we will use writeback; the syntax for this is ceph osd tier cache-mode <cache_pool> writeback:
    # ceph osd tier cache-mode cache-pool writeback
    
  3. To direct all client requests from the standard pool to the cache pool, set the pool overlay; the syntax for this is ceph osd tier set-overlay <storage_pool> <cache_pool>:
    # ceph osd tier set-overlay EC-pool cache-pool
    
  4. On checking the pool details, you will notice that the EC-pool has tier, read_tier, and write_tier set to 16, which is the pool ID of cache-pool.

    Similarly, for cache-pool, tier_of is set to 15 and cache_mode to writeback; all these settings imply that the cache pool is configured correctly:

    # ceph osd dump | egrep -i "EC-pool|cache-pool"
    
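If you ever need to detach the cache tier again, for example to rebuild it, the steps are essentially performed in reverse: stop new writes from being cached, flush and evict whatever the cache still holds, and then remove the overlay and the tier. A minimal sketch, assuming the same pool names; depending on your Ceph release, the cache-mode change may require the --yes-i-really-mean-it flag, and newer releases use the proxy mode in place of forward:

    # ceph osd tier cache-mode cache-pool forward
    # rados -p cache-pool cache-flush-evict-all
    # ceph osd tier remove-overlay EC-pool
    # ceph osd tier remove EC-pool cache-pool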

Configuring a cache tier

A cache tier has several configuration options that control when data is flushed and evicted; in this section, we will configure these cache tier policies:

  1. Enable hit set tracking for the cache pool; production-grade cache tiers use the bloom filter type:
    # ceph osd pool set cache-pool hit_set_type bloom
    
  2. Set hit_set_count, which is the number of hit sets to store for the cache pool:
    # ceph osd pool set cache-pool hit_set_count 1
    
  3. Set hit_set_period, which is the duration of each hit set period, in seconds, for the cache pool:
    # ceph osd pool set cache-pool hit_set_period 300
    
  4. Set target_max_bytes, which is the maximum number of bytes after which the cache-tiering agent starts flushing/evicting objects from the cache pool:
    # ceph osd pool set cache-pool target_max_bytes 1000000
    
  5. Set target_max_objects, which is the maximum number of objects after which the cache-tiering agent starts flushing/evicting objects from the cache pool:
    # ceph osd pool set cache-pool target_max_objects 10000
    
  6. Set cache_min_flush_age and cache_min_evict_age, which are the minimum ages, in seconds, that an object must reach before the cache-tiering agent flushes it to the storage tier or evicts it from the cache tier, respectively:
    # ceph osd pool set cache-pool cache_min_flush_age 300
    # ceph osd pool set cache-pool cache_min_evict_age 300
    
  7. Set cache_target_dirty_ratio, which is the percentage of the cache pool that may contain dirty (modified) objects before the cache-tiering agent flushes them to the storage tier:
    # ceph osd pool set cache-pool cache_target_dirty_ratio .01
    
  8. Set cache_target_full_ratio, which is the percentage of the cache pool that may contain unmodified (clean) objects before the cache-tiering agent evicts them from the cache tier:
    # ceph osd pool set cache-pool cache_target_full_ratio .02
    
  9. Create a temporary file of 500 MB that we will write to the EC-pool; because of the cache tier, it will actually be written to the cache-pool first:
    # dd if=/dev/zero of=/tmp/file1 bs=1M count=500
    

    Tip

    This is an optional step; you can use any other file to test cache pool functionality.

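To confirm that these values were applied as expected, you can read them back with ceph osd pool get; a minimal sketch, checking a few of the parameters set above:

    # ceph osd pool get cache-pool hit_set_type
    # ceph osd pool get cache-pool target_max_bytes
    # ceph osd pool get cache-pool cache_target_dirty_ratio
    # ceph osd pool get cache-pool cache_min_flush_age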

Testing the cache tier

So far, we have created and configured a cache tier; next, we will test it. As explained earlier, during a client write operation, data appears to be written to the regular pool, but it is actually written to the cache pool first, so clients benefit from faster I/O. Based on the cache tier policies, data is then transparently migrated from the cache pool to the storage pool. In this section, we will test our cache tiering setup by writing objects and observing them on the cache and storage tiers:

  1. In the previous section, we created a 500 MB test file named /tmp/file1; we will now put this file into the EC-pool:
    # rados -p EC-pool put object1 /tmp/file1
    
  2. Since the EC-pool is tiered with the cache-pool, object1 should not get written to the EC-pool initially; it should get written to the cache-pool. List the contents of each pool to get the object names, and use the date command to track the time:
    # rados -p EC-pool ls
    # rados -p cache-pool ls
    # date
    
  3. After 300 seconds (as we have configured cache_min_flush_age and cache_min_evict_age to 300 seconds), the cache-tiering agent will migrate object1 from the cache-pool to the EC-pool, and object1 will be removed from the cache-pool:
    # rados -p EC-pool ls
    # rados -p cache-pool ls
    # date
    

As shown in the preceding output, the data is migrated from the cache-pool to the EC-pool after a certain time.
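
If you would rather not wait for the age thresholds while experimenting, flushing and eviction can also be triggered by hand from the rados CLI; a minimal sketch, using the same object1 (note that an object can only be evicted once it is clean, that is, after it has been flushed):

    # rados -p cache-pool cache-flush object1
    # rados -p cache-pool cache-evict object1
    # rados -p cache-pool cache-flush-evict-all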
