Preface

Apache Hadoop is an open source distributed computing technology that helps users process large volumes of data with relative ease and draw insights from that data. Cloudera, with its open source distribution of Hadoop, has made Big Data analytics possible and accessible to anyone interested.

This book fully prepares you to be a Hadoop administrator, with special emphasis on Cloudera. It provides step-by-step instructions on setting up and managing a robust Hadoop cluster running Cloudera's Distribution Including Apache Hadoop (CDH).

This book starts out by giving you a brief introduction to Apache Hadoop and Cloudera. You will then move on to learn about all the tools and techniques needed to set up and manage a production-standard Hadoop cluster using CDH and Cloudera Manager.

In this book, you will learn the Hadoop architecture by understanding the different features of HDFS and walking through the entire flow of a MapReduce process. With this understanding, you will start exploring the different applications packaged into CDH and will follow a step-by-step guide to set up HDFS High Availability (HA) and HDFS Federation.

You will learn to use Cloudera Manager, Cloudera's cluster management application. Using Cloudera Manager, you will walk through the steps to configure security using Kerberos, learn about events and alerts, and configure backups.

What this book covers

Chapter 1, Getting Started with Apache Hadoop, introduces you to Apache Hadoop and walks you through the different Apache Hadoop daemons.

Chapter 2, HDFS and MapReduce, provides you with an in-depth understanding of HDFS and MapReduce.

Chapter 3, Cloudera's Distribution Including Apache Hadoop, introduces you to Cloudera's Apache Hadoop Distribution and walks you through its installation steps.

Chapter 4, Exploring HDFS Federation and Its High Availability, introduces you to the steps to configure a federated HDFS and also provides step-by-step instructions to set up HDFS High Availability.

Chapter 5, Using Cloudera Manager, introduces you to Cloudera Manager, Cloudera's cluster management application, and walks you through the steps to install it.

Chapter 6, Implementing Security Using Kerberos, walks you through the steps to secure your cluster by configuring Kerberos.

Chapter 7, Managing an Apache Hadoop Cluster, introduces you to all the cluster management capabilities available within Cloudera Manager.

Chapter 8, Cluster Monitoring Using Events and Alerts, introduces you to the different events and alerts available within Cloudera Manager that will assist you in monitoring your cluster effectively.

Chapter 9, Configuring Backups, walks you through the steps to configure backups and snapshots using Cloudera Manager.
