Real-time query with Impala on Hadoop

Cloudera, the developer of Impala, markets it as a product for running real-time queries on Hadoop. Impala is an open source implementation based on the previously mentioned Google Dremel technology, and it is free for anyone to use. It is available as a prebuilt package or can be compiled from source. Impala runs queries in memory, which is what makes them fast enough to be considered real time. In some cases, depending on the type of data, using the Parquet file format for the input data can speed up query processing several times over.

Real-time query subscriptions with Impala

Cloudera provides a Real-time Query (RTQ) subscription as an add-on to a Cloudera Enterprise subscription. You can still use Impala as a free, open source product; however, the RTQ subscription lets you take advantage of Cloudera's paid service to extend Impala's usability and resilience. With the RTQ subscription, you not only get access to Cloudera technical support, but can also work with the Impala development team and provide feedback that helps shape the product's design and implementation.
