Preface

We are part of a digital world that produces an enormous amount of data each second. The data growth is unimaginable, and it's predicted that humankind will possess 40 zettabytes of data by 2020. Well, that's not too much, but how about 2050? Should we guesstimate a yottabyte? The obvious question arises: do we have any way to store this gigantic amount of data, and are we prepared for the future? To me, Ceph is the ray of hope and the technology that can be a possible answer to the data storage needs of the next decade. Ceph is the future of storage.

It's a great saying that "software is eating the world." Well, that's true. However, from another angle, software is the feasible way to go for various computing needs, such as computing weather, networking, storage, datacenters, and burgers, ummm…well, not burgers currently. As you already know, the idea behind a software-defined solution is to build all the intelligence into the software itself and use commodity hardware to solve your greatest problem. I think this software-defined approach should be the answer to the future's computing problems.

Ceph is a true open source, software-defined storage solution, purpose-built to handle unprecedented data growth with linear performance improvement. It provides a unified storage experience for file, object, and block storage interfaces from the same system. The beauty of Ceph is its distributed, scalable nature and its performance; reliability and robustness come along with these attributes. Furthermore, it is pocket friendly, that is, economical, providing you more value for each dollar you spend.

Ceph is the next big thing that has happened to the storage industry. Its enterprise-class features, such as scalability, reliability, erasure coding, cache tiering, and more, have brought it to a maturity that has improved significantly in the last few years. To name a few, organizations such as CERN, Yahoo, and DreamHost have deployed multi-PB Ceph clusters that are running successfully.

It's been a while since the block and object interfaces of Ceph were introduced, and they are now fully developed. Until last year, CephFS was the only component that was lacking production readiness. This year, my bet is on CephFS, as it's going to be production-ready in Ceph Jewel. I can't wait to see CephFS production adoption stories. There are a few more areas where Ceph is gaining popularity, such as AFA (All Flash Array), database workloads, storage for containers, and Hyper-Converged Infrastructure. Well, Ceph has just begun; the best is yet to come.

In this book, we will take a deep dive into Ceph, covering its components, its architecture, and how it works. The Ceph Cookbook focuses on hands-on knowledge by providing you with step-by-step guidance with the help of recipes. Right from the first chapter, you will gain practical experience of Ceph by following the recipes. With each chapter, you will learn and play around with interesting concepts of Ceph. I hope that, by the end of this book, you will feel competent with Ceph, both conceptually and practically, and that you will be able to operate your Ceph storage infrastructure with confidence and success.

Happy Learning

Karan Singh

What this book covers

Chapter 1, Ceph – Introduction and Beyond, covers an introduction to Ceph, gradually moving towards RAID and its challenges, and an overview of the Ceph architecture. Finally, we will go through Ceph installation and configuration.

Chapter 2, Working with Ceph Block Device, covers an introduction to the Ceph Block Device and its provisioning. We will also go through RBD snapshots and clones, as well as storage options for OpenStack Cinder, Glance, and Nova.

Chapter 3, Working with Ceph Object Storage, deep dives into Ceph object storage, including RGW standard and federated setups, S3, and OpenStack Swift access. Finally, we will set up a file sync and share service using ownCloud.

Chapter 4, Working with the Ceph Filesystem, covers an introduction to CephFS, deploying MDS, and accessing CephFS via the kernel driver, FUSE, and NFS-Ganesha. You will also learn how to access CephFS via the Ceph-Dokan Windows client.

Chapter 5, Monitoring Ceph Clusters using Calamari, includes Ceph monitoring via the CLI, an introduction to Calamari, and setting up the Calamari server and clients. We will also cover monitoring a Ceph cluster via the Calamari GUI, as well as troubleshooting Calamari.

Chapter 6, Operating and Managing a Ceph Cluster, covers Ceph service management and scaling a Ceph cluster up and down. This chapter also includes failed disk replacement and upgrading the Ceph infrastructure.

Chapter 7, Ceph under the Hood, explores the Ceph CRUSH map and its internals, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and understanding Ceph PGs. Finally, we will create the specifics required for specific hardware.

Chapter 8, Production Planning and Performance Tuning for Ceph, covers planning a production cluster deployment, including hardware and software planning for Ceph. This chapter also includes Ceph recommendations and performance tuning. Finally, this chapter covers erasure coding and cache tiering.

Chapter 9, The Virtual Storage Manager for Ceph, is dedicated to the Virtual Storage Manager (VSM), covering its introduction and architecture. We will also go through deploying VSM, and then creating and managing a Ceph cluster with it.

Chapter 10, More on Ceph, the final chapter of the book, covers Ceph benchmarking and Ceph troubleshooting using the admin socket, the API, and the ceph-objectstore-tool. This chapter also covers the deployment of Ceph using Ansible and Ceph memory profiling.
