Batch mode adaptive joins

Logical joins are implemented through three physical join operators in SQL Server: Nested Loop, Hash Match, and Merge Join. For Merge Join, inputs must be sorted in the same manner, thus in most cases, you can see either a Nested Loop or Hash Match Join operator. The decision of which one to use is made during compilation. A significant part of the decision is estimating how many records will be processed in both join inputs. If the estimation is wrong, you can expect the wrong operator. Inappropriate join operators in the execution plan can lead to serious performance issues. Another case where the choice between these two operators could be a problem is stored procedures with parameter sniffing. In this case, the plan is created and optimized for the first execution and this operator will be used later as a cached plan, even for parameters where another join operator would be better.

Prior to SQL Server 2017, the execution plan was created based on the estimation at compile time and the parameters used by the first call in case of stored procedures. Thus, all physical operators used in execution plans are chosen at compile time. In SQL Server 2017, the decision which operators the query and is used in the query plan is a bit postponed. Before the decision is made, SQL Server looks a bit into data and then decides dynamically which join operator to use. To support dynamically switching between Nested Loop Join and Hash Match Join operators, SQL Server 2017 introduces a new operator called Adaptive Join.

Adaptive Join contains both operators and can switch between them after the first input has been scanned and the number of items is known. It starts as a Hash Match Join operator; it performs the first step in a Hash Match algorithm, scanning the first input. It also defines a threshold; if the number of input rows is less than the threshold, it changes to a Nested Loop Join operator.

To demonstrate the usage of the new operator, create a sample stored procedure:

USE WideWorldImporters;
GO
CREATE OR ALTER PROCEDURE dbo.GetSomeOrderDeatils
@UnitPrice DECIMAL(18,2)
AS
SELECT o.OrderID, o.OrderDate, ol.OrderLineID, ol.Quantity, ol.UnitPrice
FROM Sales.OrderLines ol
INNER JOIN Sales.Orders o ON ol.OrderID = o.OrderID
WHERE ol.UnitPrice = @UnitPrice;
GO

Now, invoke the stored procedure with the parameter 112:

EXEC dbo.GetSomeOrderDeatils 112;

The execution plan for this invocation is shown as follows:

Adaptive Join operator

You can see the new Adaptive Join operator in the execution plan. The branch 1 represents a columnstore index scan and it is used to provide rows for the hash join build phase (the Adaptive Join operator starts as a Hash Match Join). In the Property Window of the Adaptive Join operator, you can see the property Adaptive Threshold Rows. This property defines the number of rows in the hash build phase that are required for switching to the Nested Loop Join. In the figure Adaptive Join operator, this value is 97.3. That means if the actual number of rows is less than 97, Nested Loop Join will be used, otherwise it will continue with Hash Match Join. Therefore, you can see two additional branches—each implementing one of the two mentioned operators. How do you know which one is used? You can see it in the previously mentioned property window in the value for the property Actual Join Type. In this case, the number of rows is 1,004 and it exceeds the threshold, therefore the Hash Match Join operator is implemented. You can also see this in the branches 2 and 3. Branch 3, which corresponds to the Nested Loop Join operator for the Actual Number of Rows, has a value of 0. So, it is clear that branches 1 and 2 are executed.

Now call the same procedure with a different parameter to see how Adaptive Join switches to the Nested Loop Join operator:

EXEC dbo.GetSomeOrderDeatils 1;

The execution plan is shown in the following screenshot:

Adaptive Join operator

This time, the actual number of rows is 32 and, since this is under the threshold, branch 3 is executed and the property Actual Join Type has a NestedLoops value.

The Adaptive Threshold has been calculated based on the parameter used for the first call (112). Its value is 97.309. If you were to call it with a different parameter, the value would be different. So, this is not solution for parameter sniffing. It is still possible to have an inappropriate plan, but at least you can have two different operators in the same plan.

To see what's going on under the hood, when Adaptive Join is used, you need to enable the trace flag 9415, as in the following code example:

DBCC TRACEON (9415);
EXEC dbo.GetSomeOrderDeatils 112;

The following screenshot shows both operators, Hash Match Join and Nested Loops Join, and how they are actually implemented:

Adaptive Join operator under the hood

As you can see, the Adaptive Join operator uses the extended Concatenation and Table Spool operators.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.133.233