Title Page
Copyright and Credits
    Apache Hive Essentials Second Edition
Dedication
Packt Upsell
    Why subscribe?
    PacktPub.com
Contributors
    About the author
    About the reviewers
    Packt is searching for authors like you
Preface
    Who this book is for
    What this book covers
    To get the most out of this book
        Download the example code files
        Download the color images
        Conventions used
    Get in touch
        Reviews
Overview of Big Data and Hive
    A short history
    Introducing big data
    The relational and NoSQL databases versus Hadoop
    Batch, real-time, and stream processing
    Overview of the Hadoop ecosystem
    Hive overview
    Summary
Setting Up the Hive Environment
    Installing Hive from Apache
    Installing Hive from vendors
    Using Hive in the cloud
    Using the Hive command
    Using the Hive IDE
    Summary
Data Definition and Description
    Understanding data types
    Data type conversions
    Data Definition Language
    Database
    Tables
        Table creation
        Table description
        Table cleaning
        Table alteration
    Partitions
    Buckets
    Views
    Summary
Data Correlation and Scope
    Project data with SELECT
    Filtering data with conditions
    Linking data with JOIN
        INNER JOIN
        OUTER JOIN
        Special joins
    Combining data with UNION
    Summary
Data Manipulation
    Data exchanging with LOAD
    Data exchange with INSERT
    Data exchange with [EX|IM]PORT
    Data sorting
    Functions
        Function tips for collections
        Function tips for date and string
        Virtual column functions
    Transactions and locks
        Transactions
        UPDATE statement
        DELETE statement
        MERGE statement
        Locks
    Summary
Data Aggregation and Sampling
    Basic aggregation
    Enhanced aggregation
        Grouping sets
        Rollup and Cube
    Aggregation condition
    Window functions
        Window aggregate functions
        Window sort functions
        Window analytics functions
        Window expression
    Sampling
        Random sampling
        Bucket table sampling
        Block sampling
    Summary
Performance Considerations
    Performance utilities
        EXPLAIN statement
        ANALYZE statement
        Logs
    Design optimization
        Partition table design
        Bucket table design
        Index design
        Use skewed/temporary tables
    Data optimization
        File format
        Compression
        Storage optimization
    Job optimization
        Local mode
        JVM reuse
        Parallel execution
        Join optimization
            Common join
            Map join
            Bucket map join
            Sort merge bucket (SMB) join
            Sort merge bucket map (SMBM) join
            Skew join
        Job engine
        Optimizer
            Vectorization optimization
            Cost-based optimization
    Summary
Extensibility Considerations
    User-defined functions
        UDF code template
        UDAF code template
        UDTF code template
        Development and deployment
    HPL/SQL
    Streaming
    SerDe
    Summary
Security Considerations
    Authentication
        Metastore authentication
        Hiveserver2 authentication
    Authorization
        Legacy mode
        Storage-based mode
        SQL standard-based mode
    Mask and encryption
        The data-hashing function
        The data-masking function
        The data-encryption function
        Other methods
    Summary
Working with Other Tools
    The JDBC/ODBC connector
    NoSQL
    The Hue/Ambari Hive view
    HCatalog
    Oozie
    Spark
    Hivemall
    Summary
Other Books You May Enjoy
    Leave a review - let other readers know what you think