Understanding Ceph Filesystem and MDS

The Ceph Filesystem (CephFS) is a POSIX-compliant distributed filesystem of any size that uses Ceph RADOS to store its data. To implement the Ceph Filesystem, you need a running Ceph storage cluster and at least one Ceph Metadata Server (MDS) to manage its metadata and keep it separate from the data, which reduces complexity and improves reliability. The following diagram depicts the architectural view of CephFS and its interfaces:

[Figure: architectural view of CephFS and its interfaces]

The libcephfs libraries play an important role in supporting the multiple client implementations of CephFS. CephFS has native Linux kernel driver support, so clients can use native filesystem mounting, for example, with the mount command. It integrates tightly with Samba and supports CIFS and SMB. CephFS also supports Filesystem in Userspace (FUSE) through the ceph-fuse module, and applications can interact with the RADOS cluster directly through the libcephfs libraries. CephFS is gaining popularity as a replacement for Hadoop HDFS. Earlier versions of HDFS supported only a single name node, which limited scalability and created a single point of failure; however, this has changed in current versions of HDFS. Unlike HDFS, CephFS can be implemented over multiple MDSes in an active-active state, which makes it highly scalable and high performing, with no single point of failure.

Ceph MDS stands for Metadata Server and is required only for the Ceph Filesystem (CephFS); the other storage methods, block-based and object-based storage, do not require MDS services. Ceph MDS operates as a daemon, which allows clients to mount a POSIX filesystem of any size. The MDS does not serve any data directly to the client; data is served only by the OSDs. The MDS provides a shared, coherent filesystem with a smart caching layer, which drastically reduces reads and writes. Its benefits extend to dynamic subtree partitioning, with a single MDS responsible for any given piece of metadata. It is dynamic in nature: daemons can join and leave, and takeover from failed nodes is quick.
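The split between metadata and data described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ceph's implementation: the ToyMDS class, the 4 MiB object size, the starting inode number, and the "<inode hex>.<object index>" naming are all chosen here for illustration. The point it shows is that the metadata service answers "which inode is this path?", after which a client can work out by itself which object on the OSDs holds a given byte.

```python
from dataclasses import dataclass

# Assumed object size for this sketch (4 MiB); real file layouts are configurable.
OBJECT_SIZE = 4 * 1024 * 1024


@dataclass
class Inode:
    number: int  # inode number handed out by the toy MDS
    size: int    # file size in bytes


class ToyMDS:
    """Hypothetical metadata-only service: maps paths to inodes, stores no file data."""

    def __init__(self, first_inode=0x10000000000):
        self._inodes = {}
        self._next = first_inode

    def create(self, path, size):
        inode = Inode(self._next, size)
        self._next += 1
        self._inodes[path] = inode
        return inode

    def lookup(self, path):
        return self._inodes[path]


def object_name(inode, offset, object_size=OBJECT_SIZE):
    """Return the assumed name of the object that holds byte `offset` of the file."""
    if not 0 <= offset < inode.size:
        raise ValueError("offset is outside the file")
    index = offset // object_size
    return f"{inode.number:x}.{index:08x}"


mds = ToyMDS()
inode = mds.create("/data/report.bin", size=10 * 1024 * 1024)

# The client asks the MDS for metadata only, then locates the data object itself.
found = mds.lookup("/data/report.bin")
print(object_name(found, offset=5 * 1024 * 1024))  # prints 10000000000.00000001
```

In this model the MDS never touches file contents, which mirrors the statement that data serving is done only by the OSDs.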

The MDS does not store local data, which is quite useful in some scenarios. If an MDS daemon dies, we can start it up again on any system that has cluster access. The Metadata Server daemons are configured as active or passive. The primary MDS node becomes active and the rest go into standby. If the primary MDS fails, the second node takes charge and is promoted to active. For even faster recovery, you can specify that a standby node should follow one of your active nodes, which keeps the same data in memory to pre-populate the cache.
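The active/standby behavior can be sketched as a tiny state model. This is a simplified illustration, not how Ceph's monitors actually track MDS daemons: the ToyMDSCluster class and the daemon names are hypothetical, and the model assumes a single active daemon with standbys promoted in the order they joined.

```python
class ToyMDSCluster:
    """Hypothetical model: one active MDS, the rest waiting in standby."""

    def __init__(self, names):
        if not names:
            raise ValueError("at least one MDS daemon is required")
        self.active = names[0]            # the first daemon becomes active
        self.standbys = list(names[1:])   # the remaining daemons stand by

    def fail(self, name):
        """Remove a failed daemon; promote the next standby if the active one died."""
        if name in self.standbys:
            self.standbys.remove(name)
        elif name == self.active:
            self.active = self.standbys.pop(0) if self.standbys else None

    def join(self, name):
        """A daemon (re)joins: it becomes active only if no active daemon exists."""
        if self.active is None:
            self.active = name
        else:
            self.standbys.append(name)


cluster = ToyMDSCluster(["mds-a", "mds-b", "mds-c"])
cluster.fail("mds-a")    # the active daemon dies
print(cluster.active)    # prints mds-b
cluster.join("mds-a")    # restarted daemon comes back as a standby
print(cluster.standbys)  # prints ['mds-c', 'mds-a']
```

Because the daemons hold no local data in this model, a restarted daemon can rejoin from any host and simply wait at the end of the standby queue.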

CephFS is not production ready at the moment, as it lacks robust fsck check/repair functions, multiple active MDS support, and snapshots. Its development is moving at a very fast pace, and we can expect it to be production ready starting with Ceph Jewel. For your noncritical workloads, you can consider using CephFS with a single MDS and no snapshots.
