HBase administration tools

Here, we will discuss the HBase administration tools that are already available, and look more closely at HBase check (hbck) and the HBase health check script.

hbck – HBase check

The hbck command checks and repairs HBase. It finds inconsistencies in the HBase cluster, if any exist, and reports them in a formatted output. It checks for region consistency and table integrity problems, and works in two modes:

  • Read-only mode: This only displays inconsistencies if they exist
  • Read-write-repair mode: This reports inconsistencies and tries to repair them
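Because the read-only mode changes nothing, it is a natural candidate for a scheduled check. The following is a minimal sketch of a helper that classifies an hbck report; it assumes that the report ends with a line containing either Status: OK or Status: INCONSISTENT, and the function name hbck_status is ours, not part of HBase.

```shell
#!/usr/bin/env bash
# Classify an hbck report read from stdin.
# Assumption: hbck closes its report with "Status: OK" or "Status: INCONSISTENT".
hbck_status() {
  local report
  report=$(cat)   # read the whole report from stdin
  case "$report" in
    *"Status: INCONSISTENT"*) echo "INCONSISTENT" ;;
    *"Status: OK"*)           echo "OK" ;;
    *)                        echo "UNKNOWN" ;;  # for example, hbck could not reach the cluster
  esac
}

# Example (needs a running cluster):
#   hbase hbck 2>&1 | hbck_status
```

An UNKNOWN result is kept separate from OK on purpose, so that a failed run is never mistaken for a healthy cluster.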

When running hbck in repair mode, it is best to start with the repairs that carry the lowest risk. Region consistency repairs are localized to a single region: they only modify in-memory data, correct wrong ZooKeeper data, or patch holes in the META table. A region consistency problem exists when a region is not assigned and deployed on exactly one RegionServer, or when the metadata disagrees with that state; a table integrity problem exists when some possible row key does not resolve to exactly one region.

The options to repair region consistencies include:

  • -fixAssignments: This repairs regions that are unassigned, incorrectly assigned, or assigned more than once. To fix these problems, we can run the following command:
    hbase hbck -fixAssignments
    
  • -fixMeta: This removes META rows whose corresponding regions are not present in HDFS, and adds new META rows for regions that are present in HDFS but missing from META. Use the following command to fix both assignments and META:
    hbase hbck -fixAssignments -fixMeta
    

A few table integrity problems are also low risk. The first two are degenerate regions (startkey == endkey) and backwards regions (startkey > endkey). These are handled automatically by sidelining the data to a temporary directory (/hbck/xxxx). The third low-risk class is HDFS region holes, which can be repaired with the -fixHdfsHoles option; it fabricates new empty regions on the filesystem. If holes are detected, we should use -fixHdfsHoles together with -fixMeta and -fixAssignments so that the new regions become consistent.

The -repairHoles option is a shortcut for -fixAssignments -fixMeta -fixHdfsHoles. Have a look at the following example:

hbase hbck -repairHoles
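The low-risk repairs described so far can be lined up from least to most invasive. The following sketch prints that sequence instead of executing it; the helper names run and low_risk_repairs and the DRY_RUN variable are illustrative assumptions, while the hbck options are the ones discussed above.

```shell
#!/usr/bin/env bash
# Sketch: walk through the low-risk hbck repairs, least invasive first.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

low_risk_repairs() {
  run hbase hbck                                         # read-only check first
  run hbase hbck -fixAssignments                         # region assignment problems
  run hbase hbck -fixAssignments -fixMeta                # assignments plus META rows
  run hbase hbck -fixAssignments -fixMeta -fixHdfsHoles  # HDFS region holes
}
```

Calling low_risk_repairs with the default settings only lists the four commands. On a real cluster, we would normally rerun the read-only check after each step and stop as soon as the report is clean, rather than applying every step unconditionally.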

Now, let's see the region-related fixes.

We should first run the hbase hbck -details command to see exactly where each problem lies, and then restrict repair attempts to the specific problems that the checks identify.
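A detailed report can be long, so it helps to reduce it to the problem lines before choosing a repair option. The following sketch assumes that hbck prefixes each reported inconsistency with ERROR: (worth confirming against the output of your HBase version); the function name hbck_errors is hypothetical.

```shell
#!/usr/bin/env bash
# Keep only the problem lines from an hbck report read on stdin.
# Assumption: each reported inconsistency starts with "ERROR:".
hbck_errors() {
  # grep exits with status 1 when nothing matches; treat that as "no problems found".
  grep '^ERROR:' || true
}

# Example (needs a running cluster):
#   hbase hbck -details 2>&1 | hbck_errors
```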

Some other repair options are as follows:

  • -fixHdfsOrphans: This adopts a region directory in HDFS whose region metadata is missing
  • -fixHdfsOverlaps: This fixes overlapping regions
  • -repair: This is a shortcut that turns on several of the repair options at once, so it should be used with care
  • -maxMerge <n>: When fixing overlaps, this sets the maximum number of overlapping regions to merge
  • -sidelineBigOverlaps: When fixing overlaps, this allows large overlapping regions to be sidelined instead of merged
  • -maxOverlapsToSideline <n>: When sidelining large overlapping regions, this limits the number of regions sidelined to at most <n>

Some other cases that are to be considered are as follows:

  • Use the following command if META is not properly assigned:
    hbase hbck -fixMetaOnly -fixAssignments
    
  • Use the following command if the HBase version file is missing:
    hbase hbck -fixVersionFile
    
  • Use the following command if Root and META are corrupt:
    hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
    
  • Use the following command to try to bring offline split parents back online:
    hbase hbck -fixSplitParents
    

    Note

    You can check for help with the following command:

    hbase hbck -help
    

HBase health check script

The HBase health check script is available in the example directory of HBase, the source of which we can find at http://svn.apache.org/viewvc/hbase/trunk/hbase-examples/src/main/sh/healthcheck/healthcheck.sh.

The following parameters can be configured in order to automate this script and set the interval of the health check:

hbase.node.health.script.location
hbase.node.health.script.timeout
hbase.node.health.script.frequency
hbase.node.health.failure.threshold

By default, the health check runs every 60 seconds, but we can change this interval to suit our needs.

The failure threshold defaults to 3; this is the number of times the health check must fail before the node is reported as failed.
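To give an idea of what such a script can look like, here is a minimal sketch that checks disk usage on the root filesystem. It is not the bundled healthcheck.sh: the function report_disk, the MAX_DISK_PCT variable, and the 90 percent limit are our own assumptions, and so is the convention that an output line starting with ERROR marks the check as failed (compare with the bundled script before relying on it).

```shell
#!/usr/bin/env bash
# Hypothetical node health script, to be referenced by
# hbase.node.health.script.location.
# Assumption: an output line starting with "ERROR" marks the check as failed.
MAX_DISK_PCT="${MAX_DISK_PCT:-90}"

# Print one status line for a mount point, given its used percentage and a limit.
report_disk() {
  local mount="$1" used_pct="$2" limit="$3"
  if [ "$used_pct" -gt "$limit" ]; then
    echo "ERROR disk usage on $mount is ${used_pct}% (limit ${limit}%)"
  else
    echo "OK disk usage on $mount is ${used_pct}%"
  fi
}

# Read the used percentage of / from POSIX-format df output (second line, fifth column).
used=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
if [ -z "$used" ]; then
  echo "ERROR could not read disk usage for /"
else
  report_disk / "$used" "$MAX_DISK_PCT"
fi
```

Keeping the comparison in a small function makes it easy to add further mount points or other checks, each contributing its own status line.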
