Performance implications of join

The join type chosen directly impacts the performance of the join. This is because joins require the shuffling of data between executors to execute the tasks, hence different joins, and even the order of the joins, need to be considered when using join.

The following is a table you could use to refer to when writing Join code:

Join type Performance considerations and tips
inner Inner join requires the left and right tables to have the same column. If you have duplicate or multiple copies of the keys on either the left or right side, the join will quickly blow up into a sort of a Cartesian join, taking a lot longer to complete than if designed correctly to minimize the multiple keys.

cross

Cross Join matches every row from left with every row from right, generating a Cartesian cross product. This is to be used with caution, as this is the worst performant join, to be used in specific use cases only.
outer, full, fullouter Fullouter Join gives all rows in left and right filling in NULL if only in right or left. If used on tables with little in common, can result in very large results and thus slow performance.
leftanti Leftanti Join gives only rows in left based on non-existence on right side. Very good performance, as only one table is fully considered and the other is only checked for the join condition.
left, leftouter Leftouter Join gives all rows in left plus common rows of left and right (inner join). Fills in NULL if not in right. If used on tables with little in common, can result in very large results and thus slow performance.
leftsemi Leftsemi Join gives only rows in left based on existence on right side. Does not include right side values. Very good performance, as only one table is fully considered and other is only checked for the join condition.
right, rightouter Rightouter Join gives all rows in right plus common rows of left and right (inner join). Fills in NULL if not in left. Performance is similar to the leftouter join mentioned previously in this table.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.255.113