Appendix B. Metadata Table

The metadata table contains a row for every tablet in Accumulo. Each tablet is uniquely identified by the ID of its table and its end row, which is the last row in the range assigned to the tablet. Table B-1 describes the columns that can appear in a tablet’s row in the metadata table, and Table B-2 shows some sample entries from a real metadata table.

In addition to tablet entries, one section of the metadata table records file deletion entries. Another section lists files that are in the process of being bulk-imported into Accumulo, so that the garbage collector does not delete these files prematurely. More about file deletion can be found in “Garbage Collector”.

Table B-1. Metadata table description
Row                        Column family  Column qualifier              Value
-------------------------  -------------  ----------------------------  ------------------------------------------
table id ; tablet end row  file           regular data file name        size in bytes , number of keys
table id ; tablet end row  future         tablet server session id      tserver IP : port
table id ; tablet end row  last           tablet server session id      tserver IP : port
table id ; tablet end row  loc            tablet server session id      tserver IP : port
table id ; tablet end row  log            server / log file name        log set | table id
table id ; tablet end row  scan           file currently being scanned
table id ; tablet end row  srv            compact                       compaction id
table id ; tablet end row  srv            dir                           tablet directory
table id ; tablet end row  srv            flush                         flush id
table id ; tablet end row  srv            lock                          zookeeper lock location
table id ; tablet end row  srv            time                          M or L followed by latest time
table id ; tablet end row  ~tab           ~pr                           0x01 followed by previous tablet’s end row

Row ID

The row ID for a tablet contains the table ID and the tablet end row separated by a semicolon. For the last tablet in a table, there is no end row. The row for that tablet is the table ID followed by <.

Rows starting with ~del are for deletion entries and rows starting with ~blip are for files that are in the process of being bulk loaded. These entries also contain the name of the file marked for deletion or bulk loading.

There are also entries for problems with loading resources. If the problem involves the metadata table, the information about the problem is written directly to ZooKeeper, but problems with other tablets are written to the metadata table. These entries have a row ID that begins with ~err and also contains the table name. The column family is FILE_READ, FILE_WRITE, or TABLET_LOAD, indicating the type of problem, and the column qualifier is the resource name, which is either a filename or a tablet key extent (prev row and end row). The value contains additional information such as the time the problem occurred, the server, and the exception if available.
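The row ID conventions described above can be sketched as a small parser. This is only an illustration, not Accumulo's own code: the names RowInfo and parse_row_id are hypothetical, and row IDs are treated as plain strings for simplicity.

```python
from typing import NamedTuple, Optional


class RowInfo(NamedTuple):
    kind: str                 # "tablet", "delete", "bulk", or "error"
    table_id: Optional[str]   # set only for tablet rows
    end_row: Optional[str]    # None for the last tablet in a table
    rest: Optional[str]       # text following a ~del, ~blip, or ~err prefix


def parse_row_id(row: str) -> RowInfo:
    """Classify a metadata row ID and split out its parts (hypothetical helper)."""
    for prefix, kind in (("~del", "delete"), ("~blip", "bulk"), ("~err", "error")):
        if row.startswith(prefix):
            return RowInfo(kind, None, None, row[len(prefix):])
    if ";" in row:
        # Table ID and end row are separated by the first semicolon.
        table_id, end_row = row.split(";", 1)
        return RowInfo("tablet", table_id, end_row, None)
    if row.endswith("<"):
        # Last tablet in the table: the table ID followed by "<", no end row.
        return RowInfo("tablet", row[:-1], None, None)
    raise ValueError(f"unrecognized metadata row ID: {row!r}")
```

For example, parse_row_id("!0;~") yields table ID !0 with end row ~, while parse_row_id("1<") yields table ID 1 with no end row.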

File Column Family

This column family contains information about a tablet’s files. The column qualifier is the name of the file, and the value holds two numbers separated by a comma: the file’s size in bytes and its number of keys. Under some conditions these values are estimates. For example, when a tablet is split, the two resulting tablets’ file entries will each be assumed to contain about half the bytes and number of keys of the original tablet’s files.
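As a rough illustration of the value format, the following sketch parses a file entry value and produces the kind of halved estimate described above. The function names are hypothetical, and the use of integer division for "about half" is an assumption rather than Accumulo's documented rounding.

```python
from typing import Tuple


def parse_file_value(value: str) -> Tuple[int, int]:
    """Split a file entry value such as "965,28" into (size in bytes, number of keys)."""
    size, keys = value.split(",")
    return int(size), int(keys)


def estimate_after_split(value: str) -> str:
    """Return an estimated value for each child tablet after a split.

    Assumption: each child is credited with half of the parent's bytes
    and keys, rounded down.
    """
    size, keys = parse_file_value(value)
    return f"{size // 2},{keys // 2}"
```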

The first letter of the filename (the actual file name, not including its path) indicates what type of operation created the file:

F  Minor compaction
C  Major compaction
A  Full major compaction
M  Merging minor compaction
I  Bulk import
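The prefix convention lends itself to a simple lookup. The sketch below uses hypothetical names (FILE_ORIGINS, file_origin) and assumes file paths use forward slashes, as in the metadata table.

```python
import posixpath

# First letter of the filename mapped to the operation that created the file.
FILE_ORIGINS = {
    "F": "minor compaction",
    "C": "major compaction",
    "A": "full major compaction",
    "M": "merging minor compaction",
    "I": "bulk import",
}


def file_origin(path: str) -> str:
    """Return the operation that created a file, judged by its name alone."""
    name = posixpath.basename(path)  # drop the directory part of the path
    return FILE_ORIGINS.get(name[:1], "unknown")
```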

Scan Column Family

This column family is used to ensure that files are not deleted while they are being scanned. The column qualifier is the name of a file currently being scanned. The garbage collector takes this information into account when determining which files are still in use and which can be safely deleted.
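The idea can be sketched as a set difference: a deletion candidate is safe to remove only if no tablet references it in either a file or a scan column. This is a simplification under stated assumptions, not the garbage collector's actual algorithm. The function name is hypothetical, and all three inputs are assumed to have been normalized to the same path form.

```python
from typing import Iterable, List


def safe_to_delete(candidates: Iterable[str],
                   file_refs: Iterable[str],
                   scan_refs: Iterable[str]) -> List[str]:
    """Return the candidates that no file or scan column still references."""
    in_use = set(file_refs) | set(scan_refs)
    return sorted(set(candidates) - in_use)
```

A file that has been compacted away but is still listed under the scan column family therefore stays out of the result until the scan entry is removed.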

future, last, and loc Column Families

These column families contain information about where a tablet has been assigned. The future column records an assignment that has been made but that the tablet server has not yet finished loading. The loc column records the current assignment once the tablet has been successfully loaded by the assigned tablet server. The last column is the last assignment, used to try to reassign a tablet to the same server to improve data locality.

The column qualifier is the tablet server session ID, and the value is the tablet server location, given as its IP address and port. Each tablet server process has a unique session ID, so if the tablet server process is restarted on a machine, Accumulo can distinguish between tablets assigned to it before and after the restart.
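One way to read these three column families together is sketched below. The state labels ("hosted", "assigned", "unassigned") and the function names are hypothetical, chosen for illustration. The sketch assumes the columns of one tablet row are available as a mapping from (family, qualifier) to value.

```python
from typing import Dict, Optional, Tuple

Columns = Dict[Tuple[str, str], str]


def parse_location(value: str) -> Tuple[str, int]:
    """Split a location value such as "127.0.0.1:9997" into (host, port)."""
    host, _, port = value.rpartition(":")
    return host, int(port)


def tablet_state(columns: Columns) -> Tuple[str, Optional[str], Optional[str], Optional[int]]:
    """Return (state, session ID, host, port) for one tablet row."""
    found = {}
    for (family, qualifier), value in columns.items():
        if family in ("future", "loc", "last"):
            found[family] = (qualifier, value)
    # loc takes priority over future; last is only a hint for reassignment.
    for family, state in (("loc", "hosted"), ("future", "assigned"), ("last", "unassigned")):
        if family in found:
            session, value = found[family]
            host, port = parse_location(value)
            return state, session, host, port
    return "unassigned", None, None, None
```

Because the session ID is part of the result, two entries with the same host and port but different session IDs can be told apart, which is the distinction the session ID exists to provide.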

log Column Family

This column family contains information about a tablet’s write-ahead logfiles. The column qualifier is the server name and the logfile name separated by a slash. The value is the log set and table ID separated by a pipe. In 1.5.0 and later, the log set is the same as the logfile name.
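The layout of a log entry can be illustrated with a short parser. The function name and the dictionary keys are hypothetical. The sketch follows the format as described in the text, with a slash in the qualifier and a pipe in the value, and it takes the last pipe as the separator on the assumption that table IDs contain no pipe characters.

```python
from typing import Dict


def parse_log_entry(qualifier: str, value: str) -> Dict[str, str]:
    """Split a log column qualifier and value into their named parts."""
    # Qualifier: server name, a slash, then the logfile name.
    server, _, logfile = qualifier.partition("/")
    # Value: log set, a pipe, then the table ID.
    log_set, _, table_id = value.rpartition("|")
    return {
        "server": server,
        "logfile": logfile,
        "log_set": log_set,
        "table_id": table_id,
    }
```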

srv Column Family

The dir column qualifier has the tablet’s main directory as its value. The tablet can use files outside of this directory, but new files will be created in the directory.

The compact column qualifier has the most recent compaction ID as its value. The flush column qualifier has the most recent flush ID as its value. These IDs are used to determine whether requested flushes or compactions have successfully completed for all relevant tablets.
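A minimal sketch of such a completion check follows. It assumes that a request is complete once every relevant tablet reports an ID at least as large as the requested one. That comparison rule is an assumption made for illustration, and the function name is hypothetical.

```python
from typing import Iterable, Optional


def operation_complete(requested_id: int,
                       tablet_ids: Iterable[Optional[int]]) -> bool:
    """Return True if every tablet has reached the requested flush or compaction ID.

    A tablet with no recorded ID (None) is treated as not yet done.
    """
    return all(tid is not None and tid >= requested_id for tid in tablet_ids)
```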

The lock column qualifier contains the ZooKeeper lock location for a tablet server that is attempting to write to the metadata table. There is a constraint on the metadata table that only accepts writes from tablet servers with currently held ZooKeeper locks.

The time column qualifier stores the timestamp of the most recently written data to a tablet. It is preceded by an M indicating that the timestamp is in milliseconds since the epoch, or an L indicating that the timestamp is in logical time (essentially a one-up counter).
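The time value can be decoded as sketched below. The function names and the "millis" and "logical" labels are hypothetical. Only millisecond values can be meaningfully converted to a wall-clock time.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_time(value: str) -> Tuple[str, int]:
    """Split a srv:time value such as "M1380306759481" or "L3523"."""
    kind, number = value[:1], int(value[1:])
    if kind == "M":
        return "millis", number   # milliseconds since the epoch
    if kind == "L":
        return "logical", number  # one-up counter, not a wall-clock time
    raise ValueError(f"unexpected time type: {kind!r}")


def millis_to_utc(millis: int) -> datetime:
    """Convert milliseconds since the epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
```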

~tab:~pr Column

This column contains the end row of the previous tablet, which helps Accumulo keep track of its metadata. The value is 0x01 followed by the previous tablet’s end row. For the first tablet in a table, there is no previous tablet, so the value is set to 0x00.
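The encoding can be sketched as a pair of helpers working on raw bytes. The function names are hypothetical, and None stands for "no previous tablet".

```python
from typing import Optional


def encode_prev_row(prev_end_row: Optional[bytes]) -> bytes:
    """Encode a ~tab:~pr value: 0x00 alone, or 0x01 followed by the end row."""
    if prev_end_row is None:
        return b"\x00"
    return b"\x01" + prev_end_row


def decode_prev_row(value: bytes) -> Optional[bytes]:
    """Decode a ~tab:~pr value back into the previous tablet's end row."""
    if value == b"\x00":
        return None  # first tablet in the table
    if value[:1] == b"\x01":
        return value[1:]
    raise ValueError(f"unexpected ~tab:~pr value: {value!r}")
```

The leading byte is what lets a reader tell "no previous tablet" apart from a previous tablet whose end row happens to be empty.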

Other Columns

There are a few additional metadata entry types that are ephemeral, such as those written in the process of a tablet split operation. These include ~tab:oldprevrow and ~tab:splitRatio for split operations; chopped:chopped for merge operations; loaded for bulk import operations; and !cloned for table clone operations.

Table B-2. A sample of metadata table contents
Row     Column family:Column qualifier    Value
------  --------------------------------  --------------------------------------------------------
!0;!0<  srv:dir                           /root_tablet
!0;!0<  ~tab:~pr                          x00
!0;~    file:/table_info/A0001c8q.rf      965,28
!0;~    last:1409c5a89030283              127.0.0.1:9997
!0;~    loc:1409c5a89030283               127.0.0.1:9997
!0;~    srv:compact                       10892
!0;~    srv:dir                           /table_info
!0;~    srv:flush                         10892
!0;~    srv:lock                          tservers/127.0.0.1:9997/zlock-0000000000$1409c5a89030283
!0;~    srv:time                          L3523
!0;~    ~tab:~pr                          x01!0<
!0<     last:1409c5a89030283              127.0.0.1:9997
!0<     loc:1409c5a89030283               127.0.0.1:9997
!0<     srv:compact                       10892
!0<     srv:dir                           /default_tablet
!0<     srv:flush                         10892
!0<     srv:lock                          tservers/127.0.0.1:9997/zlock-0000000000$1409c5a89030283
!0<     srv:time                          L4504
!0<     ~tab:~pr                          x01~
1<      file:/default_tablet/C0000dj7.rf  12617985,527400
1<      file:/default_tablet/C0000mcw.rf  18999363,790313
1<      file:/default_tablet/C00013vg.rf  25227499,1035563
1<      file:/default_tablet/C000191y.rf  7476173,305642
1<      file:/default_tablet/C0001alv.rf  2239839,91671
1<      file:/default_tablet/C0001boe.rf  1543104,63045
1<      file:/default_tablet/C0001bzk.rf  452015,18523
1<      file:/default_tablet/C0001c5h.rf  146692,5864
1<      file:/default_tablet/F0001c45.rf  95096,3852
1<      file:/default_tablet/F0001c6j.rf  43762,1750
1<      file:/default_tablet/F0001c7w.rf  55019,2206
1<      last:1409c5a89030283              127.0.0.1:9997
1<      loc:1409c5a89030283               127.0.0.1:9997
1<      log:127.0.0.1+9997/30d1970a-3db5-49fc-82d6-8adde36c9453  127.0.0.1+9997/30d1970a-3db5-49fc-82d6-8adde36c9453:4
1<      srv:compact                       0
1<      srv:dir                           /default_tablet
1<      srv:flush                         0
1<      srv:lock                          tservers/127.0.0.1:9997/zlock-0000000000$1409c5a89030283
1<      srv:time                          M1380306759481
1<      ~tab:~pr                          x00
3<      file:/default_tablet/F0000005.rf  186,1
3<      last:1409c5a89030283              127.0.0.1:9997
3<      loc:1409c5a89030283               127.0.0.1:9997
3<      srv:dir                           /default_tablet
3<      srv:flush                         1
3<      srv:lock                          tservers/127.0.0.1:9997/zlock-0000000000$1409c5a89030283
3<      srv:time                          M1377020908127
3<      ~tab:~pr                          x00
~del/!0/table_info/A0001c8l.rf
