Map join

Map join is used when one of the join tables is small enough to fit in memory, so it is fast but limited by the table size. Since Hive v0.7.0, Hive has been able to convert a join to a map join automatically with the following settings:

> SET hive.auto.convert.join=true; -- default is true since v0.11.0
> SET hive.mapjoin.smalltable.filesize=600000000; -- size in bytes; default is 25000000 (25 MB)
> SET hive.auto.convert.join.noconditionaltask=true; -- the value shown is the default, so a map join hint is not needed
> SET hive.auto.convert.join.noconditionaltask.size=10000000; -- the value shown is the default; it limits the size of the tables that must fit in memory

Once join auto-convert is enabled, Hive automatically compares the file size of the smaller table with the value specified by hive.mapjoin.smalltable.filesize. If the file size is bigger than this threshold, Hive runs the join as a common join. If the file size is smaller than the threshold, Hive tries to convert the common join into a map join. With auto-convert join enabled, there is no need to provide map join hints in the query.
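
One way to check whether the conversion described above took place is to inspect the query plan. The following is a minimal sketch, not taken from the original text: the tables sales (assumed to be large) and dim_region (assumed to be smaller than the threshold) and their columns are hypothetical names used only for illustration.

```sql
-- Hypothetical tables: sales (large fact table) and dim_region (small lookup table).
-- Enable automatic conversion; this is already the default since v0.11.0.
SET hive.auto.convert.join=true;

-- EXPLAIN prints the execution plan without running the query.
EXPLAIN
SELECT s.order_id, r.region_name
FROM sales s
JOIN dim_region r ON s.region_id = r.region_id;
```

If Hive converted the join, the plan would normally show a Map Join Operator where a common join would show a reduce-side Join Operator. The exact plan text depends on the Hive version and the execution engine, so treat this as a quick check and not as a guarantee.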
