When HBase receives a delete request, it does not remove the data immediately. Instead, it writes a tombstone marker that flags the data as deleted. This is a form of predicate deletion, a feature supported by LSM-trees, on which HBase is based. The approach is needed because HFiles are immutable: data cannot be removed in place from an HFile stored on HDFS. The marked data is physically discarded only during a major compaction, which writes a new HFile without it.
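The tombstone mechanism can be illustrated with a small toy model. This is only a sketch, not HBase code: `ToyStore`, `TOMBSTONE`, and all method names are hypothetical, and the model ignores timestamps, versions, and column families, which real delete markers take into account.

```python
TOMBSTONE = object()  # sentinel standing in for a delete marker


class ToyStore:
    """Toy model of one store; hypothetical, not HBase's real classes."""

    def __init__(self):
        self.memstore = {}  # mutable in-memory buffer: key -> value or TOMBSTONE
        self.hfiles = []    # immutable flushed files, oldest first

    def put(self, key, value):
        self.memstore[key] = value

    def delete(self, key):
        # Existing files are never modified; a marker is written instead.
        self.memstore[key] = TOMBSTONE

    def flush(self):
        # Freeze the buffer into a new immutable "HFile".
        if self.memstore:
            self.hfiles.append(dict(self.memstore))
            self.memstore = {}

    def get(self, key):
        # Newest data wins: the buffer first, then files from newest to oldest.
        for source in [self.memstore] + self.hfiles[::-1]:
            if key in source:
                value = source[key]
                return None if value is TOMBSTONE else value
        return None

    def major_compact(self):
        # Merge all files (oldest first, so newer entries overwrite older ones),
        # then drop the tombstones together with the data they mask.
        merged = {}
        for hfile in self.hfiles:
            merged.update(hfile)
        live = {k: v for k, v in merged.items() if v is not TOMBSTONE}
        self.hfiles = [live] if live else []
```

In this model, a deleted key still sits in an older file after `delete` and `flush`, but `get` already returns `None` for it because the newer tombstone masks it. Only `major_compact` removes both the old value and the marker.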
The following figure shows how reads and writes are handled overall in HBase:
As we can see in the preceding figure, whenever a write request is sent, it is first written to the WAL and then to the MemStore. When the MemStore reaches its flush threshold, its contents are flushed to a disk file (an HFile) for persistence. When a client needs to read some data, it queries the .META. table to find the region that holds the row and then contacts the specific RegionServer. The RegionServer looks for the data in the MemStore first and then in the HFiles, and returns it to the client from wherever it is found. The WAL is not consulted for reads; it exists so that edits that have not yet been flushed can be recovered after a failure.
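The write and read paths described here can be sketched as a toy model. Everything in it is an illustrative assumption rather than HBase's implementation: `ToyRegionServer`, `locate`, and `flush_threshold` are hypothetical names, the threshold counts cells instead of bytes, and the log is cleared immediately on flush. In this sketch, reads consult only the in-memory buffer and the flushed files, while the WAL is kept purely as a durability log.

```python
class ToyRegionServer:
    """Toy model of the write and read paths; not HBase's real classes."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical: max cells in memstore
        self.wal = []       # append-only log of edits not yet flushed
        self.memstore = {}  # in-memory buffer: row key -> value
        self.hfiles = []    # immutable flushed files, oldest first

    def put(self, key, value):
        self.wal.append((key, value))  # 1. log the edit for durability
        self.memstore[key] = value     # 2. apply it to the in-memory buffer
        if len(self.memstore) >= self.flush_threshold:
            self.flush()               # 3. persist once the buffer is full

    def flush(self):
        self.hfiles.append(dict(self.memstore))  # write a new immutable file
        self.memstore = {}
        self.wal = []  # simplification: flushed edits no longer need the log

    def get(self, key):
        if key in self.memstore:             # newest data lives in memory
            return self.memstore[key]
        for hfile in reversed(self.hfiles):  # then newest file to oldest
            if key in hfile:
                return hfile[key]
        return None


def locate(meta, key):
    """Return the server whose region covers key.

    meta is a list of (start_key, server) pairs sorted by start_key,
    a stand-in for the .META. table.
    """
    owner = None
    for start_key, server in meta:
        if key >= start_key:
            owner = server
    return owner
```

A client in this model would first call `locate(meta, key)` to pick a server and then call `put` or `get` on it, mirroring the lookup in .META. followed by the request to the RegionServer.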