Sort merge bucket (SMB) join

An SMB join is performed on bucket tables that are bucketed and sorted on the same columns used in the join condition. It reads data from both bucket tables and performs common joins (map and reduce triggered) on the bucket tables. To use SMB, enable the following properties:

> SET hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> SET hive.auto.convert.sortmerge.join=true;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.sortmerge.join.noconditionaltask=true;
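
As a minimal sketch of the table layout these properties expect, assume two hypothetical tables, employee_smb and dept_emp_smb, both bucketed and sorted on emp_id into the same number of buckets. The table names, columns, and bucket count are illustrative assumptions and do not come from the text above. On older Hive releases, populating such tables may also require hive.enforce.bucketing and hive.enforce.sorting to be set to true.

> -- Hypothetical tables: bucketed and sorted on the join column emp_id
> CREATE TABLE employee_smb (emp_id INT, name STRING)
> CLUSTERED BY (emp_id) SORTED BY (emp_id ASC) INTO 4 BUCKETS;
> CREATE TABLE dept_emp_smb (emp_id INT, dept_name STRING)
> CLUSTERED BY (emp_id) SORTED BY (emp_id ASC) INTO 4 BUCKETS;
> -- Join on the shared bucket/sort column; EXPLAIN prints the plan only
> EXPLAIN
> SELECT e.emp_id, e.name, d.dept_name
> FROM employee_smb e
> JOIN dept_emp_smb d ON e.emp_id = d.emp_id;

If the conversion applies, the plan printed by EXPLAIN should mention a sorted merge bucket join operator rather than a common join. The exact operator wording can differ between Hive versions.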