In this chapter we will cover:
When working with Apache Hive, Pig, and MapReduce, you will often find yourself performing the same tasks repeatedly. The recipes in this chapter provide solutions for several of these common routines.
You will find that these tools let you solve the same problem in many different ways, and deciding on the right implementation can be difficult. The recipes presented here were designed for coding efficiency and clarity.
Hive and Pig provide a clean abstraction layer between the data flows and queries you write and the complex MapReduce workflows they compile to. You can use the power of MapReduce for scalable queries without having to think about the underlying MapReduce semantics: both tools decompose your expressions and build them into the proper sequence of MapReduce jobs. Hive lets you build analytics and manage data using a declarative, SQL-like dialect known as HiveQL. Pig operations are written in Pig Latin and take a more imperative form.
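To make the idea of "compiling to MapReduce" more concrete, the following is a minimal sketch in plain Python of the kind of work a simple group-and-count query implies. It is not Hive, Pig, or Hadoop code, and it does not show how either tool actually plans a job. The click records and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `count_by_page`) are hypothetical and chosen only for illustration. In HiveQL the same intent could be stated declaratively, along the lines of `SELECT page, COUNT(*) FROM clicks GROUP BY page` over an assumed `clicks` table, while in Pig Latin you would spell out the grouping and counting as separate steps.

```python
from collections import defaultdict

# Hypothetical input: (user, page) click records, for illustration only.
records = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/cart"),
    ("carol", "/home"),
]


def map_phase(rows):
    """Emit one (page, 1) pair per record, as a mapper for a count would."""
    for _user, page in rows:
        yield page, 1


def shuffle_phase(pairs):
    """Group values by key, standing in for the framework's shuffle and sort."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_page(rows):
    """Chain the three phases to count records per page."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_page(records))
```

Even in this toy form, the query is a single line while the equivalent data flow takes three cooperating functions. That gap is what the abstraction layer hides from you.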