Horizontal scaling with automatic sharding of HBase tables

Automatic sharding is a useful feature of HBase. Auto sharding is the capability by which HBase tables are dynamically divided into smaller parts and distributed across the region servers when they become too large.

This capability to shard the data and distribute the parts to different region servers helps HBase scale horizontally. The parts are called regions. Each region contains a subset of the table's data: a contiguous, sorted range of rows that are stored together.

As you can imagine, when you start with HBase and begin putting data into an HBase table, the table has only a single region. At some point, that region becomes too large and is split into two regions, each containing part of the original set of rows. This is depicted in the following diagram:
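In code terms, the split can be sketched roughly as follows. This is a toy in-memory model, not HBase code: the `Region` class, the `put` and `split` helpers, and the row-count threshold `MAX_ROWS_PER_REGION` are all hypothetical names introduced for illustration. Real HBase decides when to split based on the size of a region's store files (governed by settings such as `hbase.hregion.max.filesize`), not on a row count.

```python
import bisect
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical threshold for this sketch; real HBase splits on store file size.
MAX_ROWS_PER_REGION = 4


@dataclass
class Region:
    start_key: Optional[str]  # inclusive; None means "start of table"
    end_key: Optional[str]    # exclusive; None means "end of table"
    rows: List[str] = field(default_factory=list)  # row keys, kept sorted


def split(region: Region) -> List[Region]:
    """Split a region at its middle row key into two contiguous regions."""
    mid = len(region.rows) // 2
    split_key = region.rows[mid]
    left = Region(region.start_key, split_key, region.rows[:mid])
    right = Region(split_key, region.end_key, region.rows[mid:])
    return [left, right]


def put(regions: List[Region], row_key: str) -> None:
    """Add a row key to the region covering it; split that region if it grows too large."""
    for i, region in enumerate(regions):
        above_start = region.start_key is None or row_key >= region.start_key
        below_end = region.end_key is None or row_key < region.end_key
        if above_start and below_end:
            if row_key not in region.rows:
                bisect.insort(region.rows, row_key)  # keep rows sorted
            if len(region.rows) > MAX_ROWS_PER_REGION:
                regions[i:i + 1] = split(region)  # replace one region with two
            return
    raise ValueError(f"no region covers row key {row_key!r}")
```

Starting from a single region that covers the whole table (`[Region(None, None)]`) and inserting five row keys with this threshold leaves two regions whose key ranges meet at the middle key, so every row still belongs to exactly one region.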

You need to understand the difference between a region and a region server. A region is conceptually a part of an HBase table that contains a set of contiguous table rows. A region server is a physical node, the slave part of the master/slave architecture of HBase, and adding region servers is what lets HBase scale horizontally. A region server can host multiple regions. Thus, each region server is responsible for serving a set of regions, and one region (that is, one range of rows) is served by only one region server at a time.
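The relationship between regions and region servers can be illustrated with a small lookup table. The sketch below is an assumption-laden toy, not the HBase client API: the start keys, the server names, and the `locate` and `regions_hosted_by` helpers are all made up for illustration. It only shows the two properties stated above: a row key falls into exactly one region, and each region maps to exactly one region server, while a server may appear more than once.

```python
import bisect
from typing import Dict, List, Tuple

# Hypothetical layout: each region is identified by its start key
# ("" means "start of table"); the list must stay sorted.
REGION_START_KEYS: List[str] = ["", "row-300", "row-600"]

# Each region is assigned to exactly one region server;
# a region server may host several regions.
REGION_TO_SERVER: Dict[str, str] = {
    "": "regionserver-1",
    "row-300": "regionserver-2",
    "row-600": "regionserver-1",
}


def locate(row_key: str) -> Tuple[str, str]:
    """Return (region start key, region server) for the region covering row_key."""
    # The covering region is the one with the greatest start key <= row_key.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    start_key = REGION_START_KEYS[idx]
    return start_key, REGION_TO_SERVER[start_key]


def regions_hosted_by(server: str) -> List[str]:
    """Return the start keys of all regions assigned to the given region server."""
    return [key for key in REGION_START_KEYS if REGION_TO_SERVER[key] == server]
```

With this layout, `locate("row-150")` resolves to the first region on `regionserver-1`, `locate("row-300")` resolves to the second region on `regionserver-2`, and `regions_hosted_by("regionserver-1")` returns two regions.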

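HBase consists of a single active HBase master node (HMaster) and several slaves, that is, region servers. HMaster assigns regions to region servers and handles administrative operations such as creating tables. Client read and write requests do not pass through HMaster: the client looks up which region server hosts the region for a given row key (through ZooKeeper and the hbase:meta table) and sends the request directly to that region server.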
