Here, we will discuss the HBase administration tools that are already available. We will also take a closer look at the HBase check tool (hbck) and the HBase health check script.
The hbck command is used to check and repair HBase. It detects inconsistencies in the HBase cluster, if any exist, and reports them in a formatted output. The tool checks for region consistency and table integrity problems. It works in two modes: a read-only mode that only identifies inconsistencies, and a read-write repair mode that fixes them.
When running hbck in repair mode, it is best to start with the lower-risk repairs. Region consistency repairs are localized, single-region repairs that only modify in-memory data, incorrect ZooKeeper data, or patch holes in the META table. An inconsistency exists when some row key does not resolve to exactly one region, when a region is not assigned and deployed on exactly one RegionServer, or when a region's metadata is incorrect.
The options to repair region consistency include:
-fixAssignments: This repairs unassigned, incorrectly assigned, or multiply assigned regions. To fix these problems, we can run the following command:
hbase hbck -fixAssignments
-fixMeta: This removes META rows whose corresponding regions are not present in HDFS, and adds new META rows for regions that are present in HDFS but not in META. Use the following command to fix both assignments and META:
hbase hbck -fixAssignments -fixMeta
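The check-then-repair approach described above can be sketched as a small wrapper. This is a minimal sketch, not a definitive procedure: it assumes that the hbck report contains a "Status: OK" or "Status: INCONSISTENT" line, and the hbck_status helper and the RUN_HBCK guard variable are hypothetical names introduced only for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: run hbck read-only first, and repair only if it reports a problem.

# Extract the final status from an hbck report read on stdin.
# Assumption: the report contains a "Status: OK" or "Status: INCONSISTENT" line.
hbck_status() {
  awk '/^Status: / { status = $2 } END { print (status == "" ? "UNKNOWN" : status) }'
}

# Guarded so that nothing touches a cluster unless RUN_HBCK=1 is set explicitly.
if [ "${RUN_HBCK:-0}" = "1" ] && command -v hbase >/dev/null 2>&1; then
  status=$(hbase hbck 2>/dev/null | hbck_status)
  echo "hbck status: ${status}"
  if [ "${status}" = "INCONSISTENT" ]; then
    # Low-risk, region-level repairs only.
    hbase hbck -fixAssignments -fixMeta
  fi
fi
```

Running the read-only check first keeps the repair options from being applied to a cluster that hbck already considers consistent.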
There are a few table integrity problems that are low risk. The first two are degenerate regions (startkey == endkey) and backwards regions (startkey > endkey). These problems are handled automatically by sidelining the data to a temporary directory (/hbck/xxxx). The third low-risk class is HDFS region holes, which can be repaired with the -fixHdfsHoles option; it fabricates new empty regions on the file system. If holes are detected, we can use -fixHdfsHoles together with -fixMeta and -fixAssignments to make the new regions consistent.
The -repairHoles option is a shortcut for these three options. Have a look at the following example:
hbase hbck -repairHoles
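The hole repair can also be written out step by step with the individual options. The following is a minimal dry-run sketch: the run helper and the DRY_RUN variable are hypothetical names introduced here, and by default the commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the hole-repair sequence.
# DRY_RUN defaults to 1, so commands are only echoed unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Inspect first, without changing anything.
run hbase hbck -details
# 2. Fabricate empty regions for the holes, then fix META and assignments.
run hbase hbck -fixHdfsHoles -fixMeta -fixAssignments
# 3. Re-check to confirm that the holes are gone.
run hbase hbck
```

Setting DRY_RUN=0 executes the same sequence against the cluster, so it is worth reviewing the printed commands first.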
Now, let's see the region-related fixes.
Before attempting any repair, run the hbase hbck -details command. Its detailed report shows exactly where each problem lies, so that repair attempts can be limited to the problems that the checks actually identify.
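One way to isolate the reported problems is to save the detailed report and filter it. The sketch below assumes that hbck prefixes each reported problem with "ERROR:"; the hbck_errors helper, the RUN_HBCK guard, and the table name orders are hypothetical and used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pull the problem lines out of a saved `hbase hbck -details` report.
# Assumption: hbck prefixes each reported problem with "ERROR:".
hbck_errors() {
  grep '^ERROR:' "$1" || true
}

# Guarded so that nothing runs against a cluster unless RUN_HBCK=1 is set.
# "orders" is a hypothetical table name; table names follow the hbck options.
if [ "${RUN_HBCK:-0}" = "1" ] && command -v hbase >/dev/null 2>&1; then
  report=$(mktemp)
  hbase hbck -details orders > "${report}" 2>/dev/null
  hbck_errors "${report}"
  rm -f "${report}"
fi
```

Restricting the check to one table keeps the report short enough to read before choosing a repair option.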
Some other repair options are as follows:
-fixHdfsOrphans: This adopts a region directory that is missing its region metadata
-fixHdfsOverlaps: This fixes overlapping regions
-repair: This repairs all the region inconsistencies
-maxMerge <n>: This limits the number of overlapping regions that are merged to a maximum of <n>
-sidelineBigOverlaps: This sidelines regions when several big regions overlap, so that the remaining regions no longer overlap
-maxOverlapsToSideline <n>: This works with the sidelining of large overlapping regions, and sidelines a maximum of <n> regions
Some other cases to consider are as follows:
META is not properly assigned:
hbase hbck -fixMetaOnly -fixAssignments
The HBase version file is missing:
hbase hbck -fixVersionFile
ROOT and META are corrupt:
hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
Offline split parents are lingering:
hbase hbck -fixSplitParents
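The special cases above can be collected into a small lookup for a runbook. In this sketch the symptom labels (meta-unassigned, version-file, meta-corrupt, split-parents) are hypothetical names chosen for illustration; only the commands come from the list above, and they are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: map a symptom label to the repair command from the list above.
# The labels are hypothetical; only the commands come from the text.
special_case_cmd() {
  case "$1" in
    meta-unassigned) echo "hbase hbck -fixMetaOnly -fixAssignments" ;;
    version-file)    echo "hbase hbck -fixVersionFile" ;;
    meta-corrupt)    echo "hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair" ;;
    split-parents)   echo "hbase hbck -fixSplitParents" ;;
    *)               echo "unknown symptom: $1" >&2; return 1 ;;
  esac
}

# Print the command for one case; nothing is run against the cluster.
special_case_cmd meta-unassigned
```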
The HBase health check script is available in the example directory of HBase, the source of which we can find at http://svn.apache.org/viewvc/hbase/trunk/hbase-examples/src/main/sh/healthcheck/healthcheck.sh.
The following parameters can be configured to automate this script and set the interval of the health check:
hbase.node.health.script.location
hbase.node.health.script.timeout
hbase.node.health.script.frequency
hbase.node.health.failure.threshold
By default, the health check runs every 60 seconds, but the frequency can be set as required.
The failure threshold defaults to 3, which is the number of times the health check is tried before the check is marked as failed.
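A custom script for hbase.node.health.script.location could look like the following minimal sketch. It assumes that the health checker treats output beginning with "ERROR" as a failed check, which is the convention the bundled sample script appears to follow; the check_disk function and the DATA_DIR and DISK_LIMIT_PCT variables are hypothetical names for this illustration.

```shell
#!/usr/bin/env bash
# Sketch of a node health script.
# Assumption: output that starts with "ERROR" marks the check as failed.

# Print an ERROR line when used_pct exceeds limit_pct, otherwise print OK.
check_disk() {
  local used_pct=$1 limit_pct=$2
  if [ "${used_pct}" -gt "${limit_pct}" ]; then
    echo "ERROR disk usage ${used_pct}% exceeds ${limit_pct}%"
  else
    echo "OK disk usage ${used_pct}%"
  fi
}

# DATA_DIR and DISK_LIMIT_PCT are hypothetical knobs for this sketch.
used=$(df -P "${DATA_DIR:-/}" 2>/dev/null | awk 'NR == 2 { sub("%", "", $5); print $5 }')
# Fall back to 0 if df produced nothing numeric.
case "${used}" in
  ''|*[!0-9]*) used=0 ;;
esac
check_disk "${used}" "${DISK_LIMIT_PCT:-90}"
```

Keeping the check in a function makes it easy to add further checks that each print their own ERROR line.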