Ceph data management

Data management inside a Ceph cluster involves all the components that we have discussed so far. The coordination between these components enables Ceph to provide a reliable and robust storage system. Data management starts as soon as a client writes data to a Ceph pool. The data is first written to a primary OSD, and the pool replication size determines how many copies are kept. The primary OSD replicates the same data to its secondary and tertiary OSDs and waits for their acknowledgement. As soon as the secondary and tertiary OSDs complete the data write, they send an acknowledgement signal to the primary OSD, and finally, the primary OSD returns an acknowledgement to the client, confirming the completion of the write operation.
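The number of copies involved in this write path is controlled by the pool's size attribute, while min_size defines the minimum number of replicas that must be available for the pool to keep serving I/O. As a quick, optional check, you can query both attributes for any existing pool; the pool name rbd used here is only an example and may not exist on your cluster:

    # ceph osd pool get rbd size
    # ceph osd pool get rbd min_size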

In this way, Ceph consistently stores every client write and can serve the data from its replicas in the event of a failure. Let's now see how data is stored in a Ceph cluster:

  1. We will first create a test file, a Ceph pool, and set the pool replication to 3 copies:
    # echo "Hello Ceph, You are Awesome like MJ" > /tmp/helloceph
    # ceph osd pool create HPC_Pool 128 128
    # ceph osd pool set HPC_Pool size 3
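
    To confirm that the pool was created with the intended settings, you can grep it out of the OSD map; the line for HPC_Pool should report replicated size 3 and pg_num 128:
    # ceph osd dump | grep -i HPC_Pool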
    
  2. Put some data in this pool and verify its contents:
    # rados -p HPC_Pool put object1 /tmp/helloceph
    # rados -p HPC_Pool ls
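
    You can also stat the object to confirm the write; this prints the object's size and modification time:
    # rados -p HPC_Pool stat object1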
    
  3. The file has now been stored in a Ceph pool. As you know, everything in Ceph gets stored in the form of objects; each object belongs to a placement group, and each placement group is mapped to multiple OSDs. Now, let's see this concept in practice:
    # ceph osd map HPC_Pool object1
    

    This command will show you the OSD map for object1, which is stored inside HPC_Pool:

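    A representative output line looks like the following; the values match the field-by-field discussion below, and your epoch, pool ID, PG ID, and OSD IDs will differ:
    osdmap e566 pool 'HPC_Pool' (10) object 'object1' -> pg 10.bac5debc (10.3c) -> up [0,6,3] acting [0,6,3]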

    Let's discuss the output of this command:

    • osdmap e566: This is the OSD map version ID, also known as the OSD epoch, which is 566 here.
    • pool 'HPC_Pool' (10): This is a Ceph pool name and pool ID.
    • object 'object1': This is an object name.
    • pg 10.bac5debc (10.3c): This is the placement group number; that is, object1 belongs to PG 10.3c.
    • up [0,6,3]: This is the OSD up set, which contains osd.0, osd.6, and osd.3. Since the pool has a replication size of 3, each PG is stored on three OSDs. This also means that all the OSDs holding PG 10.3c are up. The up set is the ordered list of OSDs responsible for a particular PG at a particular epoch, as per the CRUSH map. It is usually the same as the acting set.
    • acting [0,6,3]: osd.0, osd.6, and osd.3 are in the acting set, where osd.0 is the primary OSD, osd.6 is the secondary OSD, and osd.3 is the tertiary OSD. The acting set is the ordered list of OSDs that is currently responsible for a particular PG.
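
    If you prefer to query this mapping by PG ID rather than by object name, the same up and acting sets can be retrieved directly; the PG ID 10.3c comes from the example above and will differ on your cluster:
    # ceph pg map 10.3c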
  4. Check the physical location of each of these OSDs. You will find that OSDs 0, 6, and 3 are physically separated on the ceph-node1, ceph-node3, and ceph-node2 hosts, respectively.
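    A convenient way to verify this is to print the CRUSH hierarchy, which lists every OSD under the host it runs on; ceph osd find shows the location of a single OSD. These commands are suggestions for verification rather than part of the original walkthrough:
    # ceph osd tree
    # ceph osd find 3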
  5. Now, log in to any of these nodes and check where the real data resides on the OSD. You will observe that object1 is stored in PG 10.3c on ceph-node2, on the partition sdb1 that backs osd.3; note that the PG ID and OSD ID might differ in your setup:
    # ssh ceph-node2
    # df -h | grep -i ceph-3
    # cd /var/lib/ceph/osd/ceph-3/current
    # ls -l | grep -i 10.3c
    # cd 10.3c_head/
    # ls -l
    
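    Because these OSDs use the FileStore backend, the object's data is kept as an ordinary file inside the PG directory, with a filename that starts with the object name (the exact suffix will vary). As a final, optional check, you can read the data back directly and see the string written in step 1:
    # ls -l | grep object1
    # cat object1*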

In this way, Ceph stores each data object in a replicated manner across different failure domains. This intelligent mechanism is the core of Ceph's data management.
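The failure domain itself is defined by the pool's CRUSH rule; with the default rule, replicas are placed on different hosts, which is why the three copies of object1 ended up on three separate nodes. If you want to inspect the rule in use, the following generic command dumps all CRUSH rules:

    # ceph osd crush rule dump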
