Chapter 6. Storage Provisioning with Ceph

In this chapter, we will cover the following topics:

  • Setting up a Ceph block device
  • Setting up the Ceph filesystem
  • Setting up Ceph object storage using a RADOS gateway
  • Configuring S3 and Swift with a Ceph RADOS gateway

Storage provisioning is the primary and most important task of a storage system administrator. It is the process of assigning storage space or capacity to physical and virtual servers in the form of blocks, files, or objects. Typical computer systems and servers come with a limited local storage capacity that is not enough for your data storage needs. Storage solutions such as Ceph provide virtually unlimited storage capacity to these servers, making them capable of storing all your data and ensuring that you do not run out of space.

In addition to providing extra storage, there are numerous benefits of having a centralized storage system.

Ceph can provision storage capacity in a unified way, which includes block storage, a filesystem, and object storage. Depending on your use case, you can select one or more of these storage types, as shown in the following diagram. Now, let's discuss these storage types in detail and implement them on our test cluster.


The RADOS block device

The RADOS block device (RBD), also known as the Ceph block device, provides block-based persistent storage to Ceph clients, which they use as an additional disk. The client has the flexibility to use the disk as required, either as a raw device or by formatting it with a filesystem and mounting it. An RBD makes use of the librbd library and stores blocks of data in sequential form, striped over multiple OSDs in a Ceph cluster. RBD is backed by the RADOS layer of Ceph; thus, every block device is spread over multiple Ceph nodes, delivering high performance and excellent reliability. RBD is rich with enterprise features such as thin provisioning, dynamic resizing, snapshots, copy-on-write cloning, and caching, among others. The RBD protocol is fully supported by Linux as a mainline kernel driver; it is also supported by virtualization platforms such as KVM, Qemu, and libvirt, allowing virtual machines to take advantage of a Ceph block device. All these features make RBD an ideal candidate for cloud platforms such as OpenStack and CloudStack. We will now learn how to create a Ceph block device and make use of it:

  1. To create a Ceph block device, log in to any of the Ceph monitor nodes, or to an admin host that has admin access to the Ceph cluster. You can also create a Ceph RBD image from any node that is configured as a Ceph client. For security reasons, you should not store Ceph admin keys on hosts other than the Ceph nodes and the admin host.
  2. The following command will create a RADOS block device named ceph-client1-rbd1 with a size of 10240 MB (10 GB):
    # rbd create ceph-client1-rbd1 --size 10240
    
  3. To list rbd images, issue the following command:
    # rbd ls 
    
  4. To check details of an rbd image, use the following command:
    # rbd --image ceph-client1-rbd1 info
    

    Have a look at the following screenshot to see the preceding command in action:

  5. By default, RBD images are created in the rbd pool of the Ceph cluster. You can specify another pool with the -p parameter of the rbd command. The following command gives the same output as the previous one, except that the pool name is specified explicitly with -p. Similarly, you can create RBD images in other pools using the -p parameter, as shown in the example after this list:
    # rbd --image ceph-client1-rbd1 info -p rbd
    
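If you want to keep block device images in a dedicated pool, a minimal sketch looks like the following; the pool name rbd-pool1, the placement group count of 128, and the image name ceph-client1-rbd-alt are assumptions for illustration only:

# ceph osd pool create rbd-pool1 128
# rbd create ceph-client1-rbd-alt --size 10240 -p rbd-pool1
# rbd ls -p rbd-pool1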

Setting up your first Ceph client

Ceph is a storage system; to store your data on a Ceph cluster, you will require a client machine. Once storage space is provisioned from the Ceph cluster, the client maps or mounts the Ceph block device or filesystem and stores data on the Ceph cluster. For object storage, clients access the Ceph cluster over HTTP to store data. A typical production-class Ceph cluster consists of two different networks: a frontend network and a backend network, also known as the public network and the cluster network, respectively.

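As an illustration, the two networks are defined in the [global] section of ceph.conf. This is a minimal sketch: the public subnet below is based on the host-only addresses used in this book, while the cluster subnet is an assumption added for illustration:

[global]
public network = 192.168.57.0/24
cluster network = 192.168.58.0/24
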
The frontend network is the client-facing network through which Ceph serves data to its clients. All Ceph clients interact with the cluster over the frontend network; clients do not have access to the backend network, which Ceph mainly uses for replication and recovery. We will now set up our first Ceph client virtual machine, which we will use throughout this book. During the setup process, we will create a new client virtual machine as we did in Chapter 2, Ceph Instant Deployment:

  1. Create a new VirtualBox virtual machine for the Ceph client:
    # VBoxManage createvm --name ceph-client1 --ostype RedHat_64 --register
    # VBoxManage modifyvm ceph-client1 --memory 1024 --nic1 nat  --nic2 hostonly --hostonlyadapter2 vboxnet1
    
    # VBoxManage storagectl ceph-client1 --name "IDE Controller" --add ide --controller PIIX4 --hostiocache on --bootable on
    # VBoxManage storageattach ceph-client1 --storagectl "IDE Controller" --type dvddrive --port 0 --device 0 --medium /downloads/CentOS-6.4-x86_64-bin-DVD1.iso 
    # VBoxManage storagectl ceph-client1 --name "SATA Controller" --add sata --controller IntelAHCI --hostiocache on --bootable on
    # VBoxManage createhd --filename OS-ceph-client1.vdi --size 10240
    # VBoxManage storageattach ceph-client1 --storagectl "SATA Controller" --port 0 --device 0 --type hdd --medium OS-ceph-client1.vdi
    # VBoxManage startvm ceph-client1 --type gui
    
  2. Once the virtual machine is created and started, install the CentOS operating system by following the OS installation documentation at https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Installation_Guide/index.html. During the installation process, provide the hostname as ceph-client1.
  3. Once you have successfully installed the operating system, edit the network configuration of the machine as described in the following steps and restart the network services; a quick connectivity check is shown after these steps:
    • Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file and add the following:
      ONBOOT=yes
      BOOTPROTO=dhcp
    • Edit the /etc/sysconfig/network-scripts/ifcfg-eth1 file and add the following:
      ONBOOT=yes
      BOOTPROTO=static
      IPADDR=192.168.57.200
      NETMASK=255.255.255.0
    • Edit the /etc/hosts file and add the following:
      192.168.57.101 ceph-node1
      192.168.57.102 ceph-node2
      192.168.57.103 ceph-node3
      192.168.57.200 ceph-client1
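
The following commands are a simple sanity check rather than part of the original steps; they assume the interfaces are up and the host entries above are in place:

# service network restart
# ping -c 2 ceph-node1
# ping -c 2 ceph-node2
# ping -c 2 ceph-node3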

Mapping the RADOS block device

Earlier in this chapter, we created an RBD image on a Ceph cluster; in order to use this block device image, we need to map it to the client machine. Let's see how the mapping operation works.

Ceph support was added to the Linux kernel starting with version 2.6.32. For client machines that need native access to Ceph block devices and filesystems, it is recommended to use Linux kernel release 2.6.34 or later.

Check the Linux kernel version with the uname command and RBD support with the modprobe command. Since this client runs an older Linux kernel release, it does not support Ceph natively:

# uname -r
# modprobe rbd

Have a look at the following screenshot:


In order to add support for Ceph, we need to upgrade the Linux kernel version.

Note

Note that this kernel upgrade is for demonstration purposes and is required only for this chapter. In a production environment, you should plan a kernel upgrade well in advance and weigh the decision carefully before performing these steps.

  1. Install ELRepo rpm as follows:
    # rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
    
  2. Install a new kernel using the following command:
    # yum --enablerepo=elrepo-kernel install kernel-ml
    
  3. Edit /etc/grub.conf, set default=0 so that the newly installed kernel is booted, and then gracefully reboot the machine.

Once the machine is rebooted, check the Linux kernel version and its RBD support, as we did earlier:

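These are the same commands used before the upgrade:

# uname -r
# modprobe rbd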

To grant a client permission to access the Ceph cluster, we need to add the keyring and the Ceph configuration file to it. Authentication between the client and the Ceph cluster is based on this keyring. The admin user of Ceph has full access to the Ceph cluster, so for security reasons, you should not distribute the admin keyring to hosts that do not need it. As a best practice, you should create separate users with limited capabilities to access the Ceph cluster, and use their keyrings to access RBD. We will discuss Ceph users and keyrings in more detail in the upcoming chapters; for now, we will make use of the admin user keyring.
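
As a sketch of that best practice, a limited user could be created as follows; the client name client.rbd and its capabilities are illustrative assumptions, and this chapter itself continues with the admin keyring:

# ceph auth get-or-create client.rbd mon 'allow r' osd 'allow rwx pool=rbd' -o /etc/ceph/ceph.client.rbd.keyring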

From ceph-node1 (our ceph-deploy admin node), install the Ceph binaries on ceph-client1 and push ceph.conf and ceph.client.admin.keyring to it:

# ceph-deploy install ceph-client1
# ceph-deploy admin ceph-client1

Once the Ceph configuration file and the admin keyring are in place on the ceph-client1 node, you can query the Ceph cluster for RBD images:

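For example, the listing and info commands shown earlier in this chapter can now be run from the client:

# rbd ls
# rbd --image ceph-client1-rbd1 info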

Map the RBD image ceph-client1-rbd1 to the ceph-client1 machine. Since RBD is now natively supported by the Linux kernel, you can execute the following command from the ceph-client1 machine to map the image:

# rbd map --image ceph-client1-rbd1

Alternatively, you can use the following command to specify the pool name of the RBD image and can achieve the same results. In our case, the pool name is rbd, as explained earlier in this chapter:

# rbd map rbd/ceph-client1-rbd1

You can find out the operating system device name used for this mapping, as follows:

# rbd showmapped

The following screenshot shows this command in action:


Once the RBD image is mapped to the OS, we should create a filesystem on it to make it usable; it can then be used as an additional disk or a block device:

# fdisk -l /dev/rbd0
# mkfs.xfs /dev/rbd0
# mkdir /mnt/ceph-vol1
# mount /dev/rbd0 /mnt/ceph-vol1

Have a look at the following screenshot:


Put some data on Ceph RBD:

# dd if=/dev/zero of=/mnt/ceph-vol1/file1 count=100 bs=1M
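
As a quick check that is not part of the original example, confirm that the file was written and see the space consumed on the mounted filesystem:

# ls -lh /mnt/ceph-vol1
# df -h /mnt/ceph-vol1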

Resizing Ceph RBD

Ceph supports thin-provisioned block devices, that is, the physical storage space is not occupied until you actually begin storing data on the block device. Ceph RADOS block devices are very flexible; you can increase or decrease the size of an RBD on the fly from the Ceph storage end. However, the underlying filesystem should support resizing. Advanced filesystems such as XFS, Btrfs, EXT, and ZFS support filesystem resizing to a certain extent. Refer to the filesystem-specific documentation to learn more about resizing.

To increase or decrease the size of a Ceph RBD image, use the --size <New_Size_in_MB> parameter with the rbd resize command; this sets the new size of the RBD image. The original size of the RBD image ceph-client1-rbd1 was 10 GB; the following command increases its size to 20 GB:

# rbd resize rbd/ceph-client1-rbd1 --size 20480
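
You can confirm the new size from the Ceph side with the info command used earlier:

# rbd --image ceph-client1-rbd1 info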

Now that the Ceph RBD image has been resized, you need to grow the filesystem from the client machine so that it can make use of the increased storage space. From the client's perspective, capacity resizing is a feature of the OS filesystem; you should read the filesystem documentation before resizing any partition. The XFS filesystem supports online resizing, so the mounted filesystem can be grown with the following command:

# xfs_growfs -d /mnt/ceph-vol1

Ceph RBD snapshots

Ceph extends full support to snapshots, which are point-in-time, read-only copies of an RBD image. You can preserve the state of a Ceph RBD image by creating snapshots and restore them later to retrieve the original data.

To test the snapshot functionality of Ceph RBD, let's create a file on RBD:

# echo "Hello Ceph This is snapshot test" > /mnt/ceph-vol1/snaptest_file

Now our filesystem has two files. Let's create a snapshot of the Ceph RBD image using the rbd snap create <pool-name>/<image-name>@<snap-name> syntax, as follows:

# rbd snap create rbd/ceph-client1-rbd1@snap1

To list the snapshots of an image, use the rbd snap ls <pool-name>/<image-name> syntax, as follows:

# rbd snap ls rbd/ceph-client1-rbd1

To test the snapshot restore functionality of Ceph RBD, let's delete files from the filesystem:

# cd /mnt/ceph-vol1
# rm -f file1 snaptest_file

We will now restore the Ceph RBD snapshot to get back the files that we deleted in the last step.

Note

The rollback operation will overwrite the current version of an RBD image and its data with the snapshot version. You should perform this operation carefully.

The syntax for this is rbd snap rollback <pool-name>/<image-name>@<snap-name>. The following is the command:

# rbd snap rollback rbd/ceph-client1-rbd1@snap1

Once the snapshot rollback operation is completed, remount the Ceph RBD filesystem to refresh its state. You should be able to get your deleted files back:

# umount /mnt/ceph-vol1
# mount /dev/rbd0 /mnt/ceph-vol1
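
As a quick check that is not part of the original steps, list the mount point; the deleted files should be present again:

# ls -l /mnt/ceph-vol1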

When you no longer need snapshots, you can remove a specific snapshot using the rbd snap rm <pool-name>/<image-name>@<snap-name> syntax. Deleting the snapshot will not delete your current data on the Ceph RBD image:

# rbd snap rm rbd/ceph-client1-rbd1@snap1

If you have multiple snapshots of an RBD image and you wish to delete all the snapshots in a single command, you can make use of the purge subcommand.

The syntax for it is rbd snap purge <pool-name>/<image-name>. The following is the command to delete all snapshots with a single command:

# rbd snap purge rbd/ceph-client1-rbd1

The rbd rm <RBD_image_name> -p <Image_pool_name> syntax is used to remove an RBD image, as follows:

# rbd rm ceph-client1-rbd1 -p rbd

Ceph RBD clones

The Ceph storage cluster is capable of creating Copy-on-write (COW) clones from RBD snapshots. This is also known as snapshot layering in Ceph. The layering feature allows clients to create multiple instant clones of a Ceph RBD image and is extremely useful for cloud and virtualization platforms such as OpenStack, CloudStack, and Qemu/KVM. These platforms usually protect a Ceph RBD image containing an OS/VM image in the form of a snapshot; this snapshot is later cloned multiple times to spin up new virtual machines/instances. Snapshots are read-only, but COW clones are fully writable; this gives Ceph greater flexibility and makes it extremely useful for cloud platforms. The following diagram shows the relationship between a RADOS block device, an RBD snapshot, and a COW snapshot clone. In the upcoming chapters of this book, we will learn more about using COW clones to spawn OpenStack instances.


Every cloned image (child image) stores a reference to its parent snapshot in order to read image data; hence, the parent snapshot must be protected before it can be used for cloning. When data is written to a COW-cloned image, the new data is stored in the clone itself. COW-cloned images are as capable as regular RBD images: they are writable, resizable, can have new snapshots taken of them, and can be cloned further.

The type of an RBD image defines the features it supports. In Ceph, an RBD image is of one of two types: format-1 and format-2. The RBD snapshot feature is available on both format-1 and format-2 images; however, the layering feature, that is, COW cloning, is available only for format-2 RBD images. Format-1 is the default RBD image format.
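
You can check the format of an existing image with the info command used earlier; the format field in its output reports 1 or 2:

# rbd --image ceph-client1-rbd1 info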

For demonstration purposes, we will first create a format-2 RBD image, create its snapshot, protect its snapshot, and finally, create COW clones out of it:

  1. Create a format-2 RBD image:
    # rbd create ceph-client1-rbd2 --size 10240 --image-format 2
    
  2. Create a snapshot of this RBD image:
    # rbd snap create rbd/ceph-client1-rbd2@snapshot_for_clone
    
  3. To create a COW clone, protect the snapshot. This is an important step; we should protect the snapshot because if the snapshot gets deleted, all the attached COW clones will be destroyed:
    # rbd snap protect rbd/ceph-client1-rbd2@snapshot_for_clone
    
  4. Cloning the snapshot requires the parent pool, RBD image, and snapshot names. For a child, it requires the pool and RBD image names.

    The syntax for this is rbd clone <pool-name>/<parent-image>@<snap-name> <pool-name>/<child-image-name>. The command to be used is as follows:

    # rbd clone rbd/ceph-client1-rbd2@snapshot_for_clone rbd/ceph-client1-rbd3
    
  5. Creating a clone is a quick process. Once it is completed, check the new image information; you will notice that its parent pool, image, and snapshot information is displayed:
    # rbd --pool rbd --image ceph-client1-rbd3 info
    

At this point, you have a cloned RBD image, which is dependent upon its parent image snapshot. To make the cloned RBD image independent of its parent, we need to flatten the image, which involves copying the data from a parent snapshot to a child image. The time it takes to complete the flattening process depends upon the size of data present in the parent snapshot. Once the flattening process is completed, there is no dependency between the cloned RBD image and its parent snapshot. Let's perform this flattening process practically:

  1. To initiate the flattening process, use the following command:
    # rbd flatten rbd/ceph-client1-rbd3
    

    After the flattening process is completed, if you check the image information, you will notice that the parent image/snapshot reference is gone, which makes the cloned image independent.

  2. You can also remove the parent image snapshot if you no longer require it. Before removing the snapshot, you first have to unprotect it using the following command; a related check is shown after these steps:
    # rbd snap unprotect rbd/ceph-client1-rbd2@snapshot_for_clone
    
  3. Once the snapshot is unprotected, you can remove it using the following command:
    # rbd snap rm rbd/ceph-client1-rbd2@snapshot_for_clone
    
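The unprotect operation will fail if the snapshot still has unflattened clones attached to it. As a precaution that is not part of the original steps, you can list a snapshot's clones with the rbd children command before removing it:

# rbd children rbd/ceph-client1-rbd2@snapshot_for_clone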