Like any database or filesystem, HBase can run into inconsistencies between what its metadata says the cluster looks like and what is actually on the filesystem, and vice versa. Before getting into debugging HBase inconsistencies, it is important to understand the layout of HBase's metadata master table, known as hbase:meta, and how HBase is laid out on HDFS. Looking at the meta table name hbase:meta, the hbase before the colon indicates the namespace the table lives in, and after the colon is the name of the table, which is meta. Namespaces are used for logical grouping of similar tables, typically in multitenant environments. Out of the box, two namespaces are used: default and hbase. default is where all tables without a namespace specified are created, and hbase is used for HBase internal tables. For right now, we are going to focus on hbase:meta. HBase's meta table stores important pieces of information about the regions in the HBase tables. Here is sample output from an HBase instance with one user table named odell:
hbase(main):002:0> describe 'hbase:meta'
DESCRIPTION
'hbase:meta', {TABLE_ATTRIBUTES => {IS_META => 'true', coprocessor$1 =>
'|org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint|536870911|'},
{NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE',
REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10',
TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false',
BLOCKSIZE => '8192', IN_MEMORY => 'true', BLOCKCACHE => 'true'}
The important pieces of information to glean from this output are as follows:

IS_META => 'true'
This means that the table you are describing is the meta table. Hopefully this comes as no surprise!

NAME => 'info'
There is only one column family in the meta table, called info. We look deeper into what is stored in this column family next.

IN_MEMORY => 'true', BLOCKCACHE => 'true'
The meta table and the HBase indexes are both stored in the block cache, so it is important never to size the block cache below the necessary amount (usually 0.1, or 10% of the heap, is sufficient for this).
Looking at the following block of data, it is clear that meta is not very fun to read natively, but reading it is a necessary evil when troubleshooting HBase inconsistencies:
hbase(main):010:0> scan 'hbase:meta', {STARTROW => 'odell,,'}
ROW                                                      COLUMN+CELL
 odell,,1412793323534.aa18c6b576bd8fe3eaf71382475bade8.
   column=info:regioninfo, timestamp=1412793324037, value={ENCODED => aa18c6b576b...
   column=info:seqnumDuringOpen, timestamp=1412793324138, value=\x00\x00\x00\x00...
   column=info:server, timestamp=1412793324138, value=odell-test-5.ent.cloudera.c...
   column=info:serverstartcode, timestamp=1412793324138, value=1410381620515
 odell,ccc,1412793397646.3eadbb7dcbfeee47e8751b356853b17e.
   column=info:regioninfo, timestamp=1412793398180, value={ENCODED => 3eadbb7dcbf...
   column=info:seqnumDuringOpen, timestamp=1412793398398, value=\x00\x00\x00\x00...
   column=info:server, timestamp=1412793398398, value=odell-test-3.ent.cloudera.c...
   column=info:serverstartcode, timestamp=1412793398398, value=1410381620376
...truncated
The first column of the output is the row key:

odell,,1412793323534.aa18c6b576bd8fe3eaf71382475bade8.

and:

odell,ccc,1412793397646.3eadbb7dcbfeee47e8751b356853b17e.

The meta row key is broken down into table name, start key, timestamp, encoded region name, and a trailing . (yes, the . is necessary). When troubleshooting, the most important aspects are the table name, the encoded region name, and the start key, because it is important that these match up as expected. The next columns are all of the key/value pairs for this row: in this case, there is one column family named info and four column qualifiers named regioninfo, seqnumDuringOpen, server, and serverstartcode. There are a few main values to make particular note of when looking at the meta table for a particular region:
info:regioninfo
Contains the encoded region name, the row key, the start key, and the stop key.

info:seqnumDuringOpen
Used for later HBase features such as shadow regions; currently not important for troubleshooting.

info:server
Contains information about the RegionServer the region is assigned to (this will become quite useful when troubleshooting unassigned regions).

info:serverstartcode
Contains the start time of the RegionServer hosting this particular region.
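The row key format just described can be split apart mechanically, which is handy when eyeballing a large meta scan. Here is a minimal sketch, with a helper name of our own invention (this is not an HBase API), that parses a meta row key into its components:

```python
def parse_meta_row_key(row_key):
    """Split an hbase:meta row key of the form
    '<table>,<start key>,<timestamp>.<encoded region name>.'
    into its components. Illustrative helper, not part of HBase."""
    # The trailing '.' is part of the key; strip it before splitting.
    if not row_key.endswith("."):
        raise ValueError("meta row keys end with a '.'")
    # The encoded region name follows the last remaining '.'.
    head, _, encoded = row_key[:-1].rpartition(".")
    # The timestamp follows the last ',' (start keys may contain commas).
    prefix, _, timestamp = head.rpartition(",")
    # The table name precedes the first ','.
    table, _, start_key = prefix.partition(",")
    return {"table": table, "start_key": start_key,
            "timestamp": int(timestamp), "encoded_name": encoded}

row = "odell,ccc,1412793397646.3eadbb7dcbfeee47e8751b356853b17e."
info = parse_meta_row_key(row)
# info["table"] is "odell", info["start_key"] is "ccc", and
# info["encoded_name"] is "3eadbb7dcbfeee47e8751b356853b17e"
```

Note that the first region of a table has an empty start key, so a key like odell,,1412793323534.aa18... parses with start_key equal to the empty string.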
Looking at the layout of meta can be a good indicator of what the HBase region structure should look like, but there are times when meta can be misleading or incorrect. In those cases, HDFS is the source of truth for HBase, so it is just as important to be able to read the HDFS file layout of HBase as it is the meta table. We are now going to explore the HDFS layout on disk:
-bash-4.1$ hadoop fs -ls /hbase
Found 9 items
drwxr-xr-x   - hbase hbase          0 2014-10-08 12:14 /hbase/.hbase-snapshot
drwxr-xr-x   - hbase hbase          0 2014-08-26 10:36 /hbase/.migration
drwxr-xr-x   - hbase hbase          0 2014-09-30 06:48 /hbase/.tmp
drwxr-xr-x   - hbase hbase          0 2014-09-10 13:40 /hbase/WALs
drwxr-xr-x   - hbase hbase          0 2014-10-10 21:33 /hbase/archive
drwxr-xr-x   - hbase hbase          0 2014-08-28 08:49 /hbase/data
-rw-r--r--   3 hbase hbase         42 2014-08-28 08:53 /hbase/hbase.id
-rw-r--r--   3 hbase hbase          7 2014-08-28 08:49 /hbase/hbase.version
drwxr-xr-x   - hbase hbase          0 2014-10-14 06:45 /hbase/oldWALs
The directories that begin with a . are internal HBase directories that do not contain any table data. The .hbase-snapshot directory is fairly self-explanatory: it contains all of the current HBase snapshots. Next is the .migration directory, which is utilized during upgrades from one HBase version to the next. The .tmp directory is where files are temporarily created; FSUtils and HBCK also take advantage of this space. For example, the hbase.version file is created here and then moved to /hbase once fully written, and HBCK uses the space when merging regions or performing any other operations that involve changing the filesystem layout. The WALs directory contains all of the currently active WALs that have not yet been rolled or that need to be split during a restart. The archive directory is tied directly to HBase snapshots and is reserved for snapshot use only: it holds the HFiles being protected by a snapshot rather than allowing them to be deleted. The hbase.id file contains the unique ID of the cluster, and the hbase.version file holds a string representation of the current version of HBase. The oldWALs directory is used in direct correlation with HBase replication: any WALs that still need to be replayed to the destination cluster are written here when rolled, rather than deleted. This backup typically happens whenever there are communication issues between the source and destination clusters. For the exercise of troubleshooting inconsistencies, we will focus on the data directory, which is aptly named: it contains the data for HBase. Let's take a deeper look at the directory structure for data:
-bash-4.1$ hadoop fs -ls /hbase/data
Found 2 items
drwxr-xr-x   - hbase hbase          0 2014-10-08 11:31 /hbase/data/default
drwxr-xr-x   - hbase hbase          0 2014-08-28 08:53 /hbase/data/hbase
The next layer is where the namespaces for HBase are contained. In the preceding example, there are only two namespaces, default and hbase:
-bash-4.1$ hadoop fs -ls /hbase/data/default
Found 1 items
drwxr-xr-x   - hbase hbase          0 2014-10-08 11:40 /hbase/data/default/odell
-bash-4.1$ hadoop fs -ls /hbase/data/hbase
Found 2 items
drwxr-xr-x   - hbase hbase          0 2014-08-28 08:53 /hbase/data/hbase/meta
drwxr-xr-x   - hbase hbase          0 2014-08-28 08:53 /hbase/.../namespace
The next level down shows the table names. As in the preceding code snippet, we have a table called odell in our default namespace, and two tables, meta and namespace, in the hbase namespace. We are going to focus on what odell looks like going forward:
-bash-4.1$ hadoop fs -ls /hbase/data/default/odell
Found 5 items
drwxr-xr-x  2014-10-08 11:31 /hbase/data/default/odell/.tabledesc
drwxr-xr-x  2014-10-08 11:31 /hbase/data/default/odell/.tmp
drwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/3eadbb7dcbfeee47e875...
drwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/7450bb77ac287b9e77ad...
drwxr-xr-x  2014-10-08 11:35 /hbase/data/default/odell/aa18c6b576bd8fe3eaf71...
The .tabledesc directory contains a file typically named .tableinfo.000000xxxx, where x is a count for the number of tables. The tableinfo file contains the same information presented when running describe from the HBase shell, including the meta flag, data block encoding type, Bloom filter used, version count, compression, and so on. It is important to preserve the tableinfo file when attempting to duplicate tables using distcp (we highly recommend using snapshots instead). The .tmp directory is used to write the tableinfo file, which is then moved into .tabledesc when it is complete. Next are the encoded region name directories; glancing back at the meta output, you will notice that these should match the ENCODED value in each region's info:regioninfo entry. Under each encoded region directory is:
-bash-4.1$ hadoop fs -ls -R /hbase/data/default/odell/3ead...
-rwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/3ead.../.regioninfo
drwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/3ead.../.tmp
drwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/3ead.../cf1
-rwxr-xr-x  2014-10-08 11:36 /hbase/data/default/odell/3ead.../cf1/5cadc83fc35d...
The .regioninfo file contains the same region information that meta stores in info:regioninfo. The .tmp directory at the individual region level is used for rewriting storefiles during major compactions. Finally, there is a directory for each column family in the table, which contains the storefiles if any data has been written to disk in that column family.
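The on-disk hierarchy just described follows a fixed shape: /hbase/data/&lt;namespace&gt;/&lt;table&gt;/&lt;encoded region&gt;/&lt;column family&gt;/&lt;storefile&gt;. The following sketch, with a function name of our own (not an HBase API), decomposes a storefile path into those layout components:

```python
def parse_hbase_data_path(path, root="/hbase"):
    """Break an HBase storefile path into its layout components:
    <root>/data/<namespace>/<table>/<encoded region>/<cf>/<storefile>.
    Illustrative helper only, not part of HBase."""
    parts = path.strip("/").split("/")
    root_parts = root.strip("/").split("/")
    n = len(root_parts)
    # Expect the configured root, then 'data', then exactly five layers.
    if parts[:n] != root_parts or len(parts) != n + 6 or parts[n] != "data":
        raise ValueError("not a storefile path under %s/data" % root)
    namespace, table, region, cf, storefile = parts[n + 1:]
    return {"namespace": namespace, "table": table,
            "encoded_region": region, "column_family": cf,
            "storefile": storefile}

p = parse_hbase_data_path(
    "/hbase/data/default/odell/3eadbb7dcbfeee47e8751b356853b17e"
    "/cf1/5cadc83fc35d")
# p["namespace"] is "default", p["table"] is "odell",
# p["column_family"] is "cf1"
```

Walking paths this way is a quick sanity check that the encoded region directories on HDFS line up with the ENCODED values you saw in meta.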
Now that we have an understanding of the HBase internals in meta and on the filesystem, let's look at what HBase looks like logically when everything is intact. In the preceding example we have one table named odell with three regions covering “–ccc, ccc–eee, eee–”. It always helps to be able to visualize that data:
Region 1 | Region 2 | Region 3 | … | Region 24 | Region 25 | Region 26
---|---|---|---|---|---|---
“” | aaa | bbb | … | xxx | yyy | zzz
aaa | bbb | ccc | … | yyy | zzz | “”
Table 18-1 is a logical diagram of a fictitious HBase table covering the alphabet. Every set of keys is assigned to an individual region, with the table starting and ending on empty quotation marks, which catch any row key that sorts before the first or after the last listed key.
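Locating the region for a given row key amounts to a sorted lookup over the region start keys. Here is a simplified model of that lookup (the real HBase client resolves regions against hbase:meta with caching; this standalone sketch only illustrates the key-range logic):

```python
import bisect

def find_region(start_keys, row_key):
    """Return the index of the region whose key range contains row_key,
    given the sorted list of region start keys. The first region always
    has the empty start key "". Simplified model of a meta lookup."""
    # bisect_right finds the first start key strictly greater than
    # row_key; the containing region is the one just before it.
    return bisect.bisect_right(start_keys, row_key) - 1

# Start keys for the three-region odell layout: ""-ccc, ccc-eee, eee-""
starts = ["", "ccc", "eee"]
# find_region(starts, "abc") lands in region 0, the ""-ccc region;
# a start key is inclusive, so "ccc" itself lands in region 1.
```

The empty start and end keys in Table 18-1 fall out of this naturally: any key sorts after "" and before the open-ended last region.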
Earlier versions of HBase were prone to inconsistencies caused by bad splits, failed merges, and incomplete region cleanup operations. Later versions of HBase are quite solid, and we rarely run into inconsistencies. But as with life, nothing in this world is guaranteed, and software can have faults; it is always best to be prepared for anything. The go-to tool for repairing inconsistencies in HBase is known as HBCK. This tool is capable of repairing most of the issues you will encounter with HBase, and can be executed by running hbase hbck from the CLI:
-bash-4.1$ sudo -u hbase hbase hbck
14/10/15 05:23:24 INFO Client environment:zookeeper.version=3.4.5-cdh5.1.2--1,...
14/10/15 05:23:24 INFO Client environment:host.name=odell-test-1.ent.cloudera.com
14/10/15 05:23:24 INFO Client environment:java.version=1.7.0_55
14/10/15 05:23:24 INFO Client environment:java.vendor=Oracle Corporation
14/10/15 05:23:24 INFO Client environment:java.home=/usr/java/jdk1.7.0_55-clou...
...truncated...
Summary:
  hbase:meta is okay.
    Number of regions: 1
    Deployed on: odell-test-5.ent.cloudera.com,60020,1410381620515
  odell is okay.
    Number of regions: 3
    Deployed on: odell-test-3.ent.cloudera.com,60020,1410381620376
  hbase:namespace is okay.
    Number of regions: 1
    Deployed on: odell-test-4.ent.cloudera.com,60020,1410381620086
0 inconsistencies detected.
Status: OK
The preceding output shows a healthy HBase instance: all of the regions are assigned, META is correct, all of the region info in HDFS is correct, and all of the regions are consistent. If everything is running as expected, there should be 0 inconsistencies detected and a status of OK. There are a few ways that HBase can become corrupt. We will take a deeper look at some of the more common scenarios:
Bad region assignments
Corrupt META
HDFS holes
HDFS orphaned regions
Region overlaps
When dealing with inconsistencies, it is very common for false positives to be present and make the situation look more dire than it really is. For example, a corrupt META can cause numerous HDFS overlaps or holes to show up when the underlying filesystem is actually fine. The heaviest-handed approach is to run HBCK with only the -repair flag, which executes all of the following repair options in one pass:
-fixAssignments -fixMeta -fixHdfsHoles -fixHdfsOrphans -fixHdfsOverlaps -fixVersionFile -sidelineBigOverlaps -fixReferenceFiles -fixTableLocks
This is great when you are working with an experimental or development instance, but might not be ideal when dealing with production or pre-production instances. One of the primary reasons to be careful with executing just the -repair flag is the -sidelineBigOverlaps flag: if there are overly large overlaps, HBase will sideline those regions outside of HBase, and they will have to be bulk loaded back into the correct region assignments. Without a full understanding of every flag's implications, it is possible to make the issue worse than it is. It is recommended to take a pragmatic approach and start with the less impactful flags.
Before you start running any HBCK commands, make sure you are either logging to an external file or your terminal is capturing all commands and their output.
The first two flags we typically prefer to run are -fixAssignments and -fixMeta. The -fixAssignments flag repairs unassigned regions, incorrectly assigned regions, and regions with multiple assignments. HBase uses HDFS as the underlying source of truth for the correct layout of META: the -fixMeta flag removes meta rows when the corresponding regions are not present in HDFS, and adds new meta rows if regions are present in HDFS but not in META. In HBase, region assignments are controlled through the Assignment Manager, which keeps the current state of HBase in memory. If the region assignments were out of sync between HBase and META, it is safe to assume they are also out of sync in the Assignment Manager. The fastest way to update the Assignment Manager with the correct values provided by HBCK is to perform a rolling restart of your HBase Master nodes. After restarting the HBase Master nodes, it is time to run HBCK again.
If after rerunning HBCK the end result is not “0 inconsistencies detected,” then it is time to use some heavier-handed commands to correct the outstanding issues. The three other major issues that could still be occurring are HDFS holes, HDFS overlaps, and HDFS orphans.
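Because this check-repair-recheck loop hinges on one summary line, it can help to pull the inconsistency count out of captured hbck output programmatically. A small sketch, assuming only the "N inconsistencies detected." summary line shown earlier (the helper and regex are ours, not part of HBCK):

```python
import re

def inconsistency_count(hbck_output):
    """Extract the count from an 'N inconsistencies detected.' summary
    line in captured `hbase hbck` output. Returns None if no summary
    line is found. Illustrative helper, not part of HBCK."""
    match = re.search(r"(\d+) inconsistencies detected", hbck_output)
    return int(match.group(1)) if match else None

healthy = "Summary:\n  odell is okay.\n0 inconsistencies detected.\nStatus: OK"
# inconsistency_count(healthy) returns 0; a nonzero value means another
# repair pass is warranted, and None means the run did not complete.
```

A wrapper script could capture each hbck run to a file (satisfying the logging advice above) and stop the cycle once the count reaches zero.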
If running the -fixMeta and -fixAssignments flags did not resolve the inconsistencies, we would recommend contacting your friendly neighborhood Hadoop vendor for more detailed instructions. If, on the other hand, you are handling this yourself, we would recommend using the -repair flag at this point. It is important to note that numerous passes may need to be run. We recommend running the -repair flag in a cycle similar to this:
-bash-4.1$ sudo -u hbase hbase hbck
-bash-4.1$ sudo -u hbase hbase hbck -repair
-bash-4.1$ sudo -u hbase hbase hbck
-bash-4.1$ sudo -u hbase hbase hbck -repair
-bash-4.1$ sudo -u hbase hbase hbck
-bash-4.1$ sudo -u hbase hbase hbck -repair
-bash-4.1$ sudo -u hbase hbase hbck
If you have run through this set of commands and are still seeing inconsistencies, you may need to start running individual commands depending on the output of the last HBCK run. Again, at this point, we cannot stress enough the importance of contacting your Hadoop vendor or the Apache mailing lists: there are experts available who can help with situations like this. In lieu of that, here is a list of other commands that can be found in HBCK:
-fixHdfsHoles
Try to fix region holes in HDFS.
-fixHdfsOrphans
Try to fix region directories with no .regioninfo file in HDFS.
-fixTableOrphans
Try to fix table directories with no .tableinfo file in HDFS (online mode only).
-fixHdfsOverlaps
Try to fix region overlaps in HDFS.
-fixVersionFile
Try to fix missing hbase.version file in HDFS.
-sidelineBigOverlaps
When fixing region overlaps, allow to sideline big overlaps.
-fixReferenceFiles
Try to offline lingering reference store files.
-fixEmptyMetaCells
Try to fix hbase:meta entries not referencing any region (empty REGIONINFO_QUALIFIER rows).

-maxMerge <n>
When fixing region overlaps, allow at most <n> regions to merge (n=5 by default).

-maxOverlapsToSideline <n>
When fixing region overlaps, allow at most <n> regions to sideline per group (n=2 by default).