Impala High Availability

Impala runs on DataNodes and takes advantage of any High Availability (HA) configuration available to them. Impala uses data stored in HDFS, the distributed storage layer in Hadoop, in which the NameNode manages the file system metadata and the DataNodes store the data blocks. Hadoop provides a NameNode High Availability configuration; if you would like to learn more about it, I recommend the Hadoop documentation.

To make Impala highly available, the best option is to take advantage of the HDFS HA feature. As an Impala cluster administrator, you can upgrade the Hive metastore to use HDFS HA. Because Impala depends on the Hive metastore, if the primary metastore becomes unavailable, it remains available from the other HDFS HA node without significant downtime.
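
The failover idea described above can be pictured from the client side with a small Python sketch. It is only an illustration under stated assumptions, not how Impala or Hive implements failover internally: the host names are hypothetical, port 9083 is assumed to be the metastore port in your deployment, and the helper names `tcp_probe` and `first_available` are made up for this example. The function walks an ordered list of endpoints and returns the first one that accepts a connection.

```python
import socket
from typing import Callable, Iterable, Optional, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_probe(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def first_available(
    endpoints: Iterable[Endpoint],
    probe: Callable[[Endpoint], bool] = tcp_probe,
) -> Optional[Endpoint]:
    """Return the first endpoint the probe reports as reachable, else None."""
    for endpoint in endpoints:
        if probe(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    # Hypothetical hosts; replace with the nodes in your own cluster.
    candidates = [
        ("metastore-primary.example.com", 9083),
        ("metastore-standby.example.com", 9083),
    ]
    print(first_available(candidates))
```

Passing the probe as a parameter keeps the selection logic separate from the network check, so the ordering behaviour can be exercised without a live cluster.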
