In the last seven chapters, I described the various traits of Impala, and I believe that you have learned those details as well. Now it is time to finish the book by adding a few more details, which will help you understand the true potential of Impala.
The technology behind Impala is revolutionary and inspired by a Google research project named Dremel. Dremel is a scalable ad hoc query-based analysis system for read-only nested data. Dremel-based implementations can run aggregation queries over trillions of rows in seconds by combining multilevel executing trees and columnar data layout. It does not use MapReduce as the core; instead it complements MapReduce. Impala is considered to be a native Massive Parallel Processing query engine running on Apache Hadoop. Depending on the type of query and configuration, Impala excels in data processing performance over traditional database applications on Hadoop, such as Hive, and processing frameworks, such as MapReduce, due to the following key reasons:
You can learn more on Google Dremel by referring to a research paper at the following URL:
18.226.251.70