
21. Intelligent Query Processing

Grant Fritchey1  
(1)
Grafton, MA, USA
 
As you have seen throughout the book, query tuning isn’t easy. Microsoft has recognized this and has introduced a large number of automatic performance enhancements into SQL Server over the last three versions. These are all mechanisms to attempt to deal with common query performance issues without forcing people to rewrite their code or restructure their databases. There are many ways that Microsoft can use the internals of the engine to deal with what would have been in the past problematic code. In this chapter, I’ll be covering
  • Adaptive query processing

  • Approximate query processing

  • Table variable deferred compilation

  • Scalar UDF inlining

I’ve already covered other aspects of Intelligent Query Processing, including optimized plan forcing and Parameter Sensitive Plan Optimization, in other chapters. I also covered adaptive joins in the chapter on columnstore indexes.

As wonderful as this new technology is, it won’t fix bad code choices or poor structures. Even with all these new, better methodologies for query execution, standard tuning will still have to take place. However, even a well-tuned system will see benefits from various aspects of Intelligent Query Processing.

Adaptive Query Processing

Adaptive query processing is Microsoft’s attempt to deal with what has in the past been best handled by changes to code and structure. Some options within SQL Server aren’t always the best for query performance, such as multi-statement table-valued functions. Adaptive query processing attempts to deal with these through several processes:
  • Adaptive joins (which were already covered)

  • Interleaved execution

  • Query processing feedback

  • Optimized plan forcing (covered in Chapter 6)

Interleaved Execution

I have a standing recommendation to avoid using multi-statement table-valued functions, and nothing I’m about to tell you changes that recommendation. However, there are still going to be situations where you have these functions in place and your performance could suffer.

Starting in SQL Server 2017, however, Microsoft changed the way that multi-statement functions behave. The optimizer can identify that this type of function is being executed. SQL Server can allow those functions to execute and then, based on the rows actually being processed, adjust the row counts on the fly in order to get a better execution plan.

To see this in action, I’m going to create three different functions as shown in Listing 21-1.
CREATE OR ALTER FUNCTION dbo.SalesInfo
()
RETURNS @return_variable TABLE
(
    SalesOrderID INT,
    OrderDate DATETIME,
    SalesPersonID INT,
    PurchaseOrderNumber dbo.OrderNumber,
    AccountNumber dbo.AccountNumber,
    ShippingCity NVARCHAR(30)
)
AS
BEGIN;
    INSERT INTO @return_variable
    (
        SalesOrderID,
        OrderDate,
        SalesPersonID,
        PurchaseOrderNumber,
        AccountNumber,
        ShippingCity
    )
    SELECT soh.SalesOrderID,
           soh.OrderDate,
           soh.SalesPersonID,
           soh.PurchaseOrderNumber,
           soh.AccountNumber,
           A.City
    FROM Sales.SalesOrderHeader AS soh
        JOIN Person.ADDRESS AS A
            ON soh.ShipToAddressID = A.AddressID;
    RETURN;
END;
GO
CREATE OR ALTER FUNCTION dbo.SalesDetails
()
RETURNS @return_variable TABLE
(
    SalesOrderID INT,
    SalesOrderDetailID INT,
    OrderQty SMALLINT,
    UnitPrice MONEY
)
AS
BEGIN;
    INSERT INTO @return_variable
    (
        SalesOrderID,
        SalesOrderDetailID,
        OrderQty,
        UnitPrice
    )
    SELECT sod.SalesOrderID,
           sod.SalesOrderDetailID,
           sod.OrderQty,
           sod.UnitPrice
    FROM Sales.SalesOrderDetail AS sod;
    RETURN;
END;
GO
CREATE OR ALTER FUNCTION dbo.CombinedSalesInfo
()
RETURNS @return_variable TABLE
(
    SalesPersonID INT,
    ShippingCity NVARCHAR(30),
    OrderDate DATETIME,
    PurchaseOrderNumber dbo.OrderNumber,
    AccountNumber dbo.AccountNumber,
    OrderQty SMALLINT,
    UnitPrice MONEY
)
AS
BEGIN;
    INSERT INTO @return_variable
    (
        SalesPersonID,
        ShippingCity,
        OrderDate,
        PurchaseOrderNumber,
        AccountNumber,
        OrderQty,
        UnitPrice
    )
    SELECT si.SalesPersonID,
           si.ShippingCity,
           si.OrderDate,
           si.PurchaseOrderNumber,
           si.AccountNumber,
           sd.OrderQty,
           sd.UnitPrice
    FROM dbo.SalesInfo() AS si
        JOIN dbo.SalesDetails() AS sd
            ON si.SalesOrderID = sd.SalesOrderID;
    RETURN;
END;
GO
Listing 21-1

Multi-statement user-defined table-valued functions

This example is a very common antipattern (code smell) that can be found all over. It’s basically an attempt to make SQL Server into more of an object-oriented programming environment. Define one function to retrieve a set of data, a second to retrieve another, then a third that calls the other functions as if they were tables.

In versions of SQL Server prior to 2014, the optimizer assumed these functions returned a single row, regardless of how much data was actually returned. The newer cardinality estimation engine, introduced in SQL Server 2014, assumes 100 rows when the compatibility level is 120 or greater.

In order to see the new interleaved execution in action, I’m going to use Listing 21-2 to capture query metrics first with interleaved execution disabled and then with it enabled.
ALTER DATABASE SCOPED CONFIGURATION SET INTERLEAVED_EXECUTION_TVF = OFF;
GO
SELECT csi.OrderDate,
       csi.PurchaseOrderNumber,
       csi.AccountNumber,
       csi.OrderQty,
       csi.UnitPrice,
       sp.SalesQuota
FROM dbo.CombinedSalesInfo() AS csi
    JOIN Sales.SalesPerson AS sp
        ON csi.SalesPersonID = sp.BusinessEntityID
WHERE csi.SalesPersonID = 277
      AND csi.ShippingCity = 'Odessa';
GO
ALTER DATABASE SCOPED CONFIGURATION SET INTERLEAVED_EXECUTION_TVF = ON;
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
GO
SELECT csi.OrderDate,
       csi.PurchaseOrderNumber,
       csi.AccountNumber,
       csi.OrderQty,
       csi.UnitPrice,
       sp.SalesQuota
FROM dbo.CombinedSalesInfo() AS csi
    JOIN Sales.SalesPerson AS sp
        ON csi.SalesPersonID = sp.BusinessEntityID
WHERE csi.SalesPersonID = 277
      AND csi.ShippingCity = 'Odessa';
GO
Listing 21-2

Executing the functions both ways

Figure 21-1 shows the two execution plans for both queries in Listing 21-2.


Figure 21-1

Execution plan for a non-interleaved execution and an interleaved execution

The overall shape of the execution plans is the same. However, there are very distinct differences. For example, since these queries were run as part of a single batch, each plan is measured against the other. The top plan is estimated at only 1% of the cost of the execution, while the second plan is 99%. Why is that?

In both plans, the Sequence operator runs the subtrees attached to it in order, top to bottom. That means the first thing executed is the Table Valued Function, CombinedSalesInfo. That function is used to supply data to the Table Scan operation lower in the plan. That Scan operation shows us data movement and, for our purposes, the estimated number of rows. Figure 21-2 shows the properties of that operator from the first plan.


Figure 21-2

Properties from the non-interleaved plan

At the top, you can see the actual number of rows as 148. At the bottom, you can see that the estimates were for 100 rows total and 3.16228 for all executions. It’s that disparity in the estimated number of rows, 3.16228 vs. 148, that frequently causes performance headaches when using table-valued functions.

Figure 21-3 shows the same properties from the second plan, which executed in an interleaved fashion.


Figure 21-3

Properties from an interleaved execution

Of course, the same number of rows was returned. The interesting parts are at the bottom. Instead of 100 rows, it’s clear that we’re dealing with larger data sets at 121,317. Also, instead of 3.16228, we’re seeing 18.663. While this isn’t perfect as an estimate, it’s much better, and that’s the point.

The differences in execution times between these are trivial. The non-interleaved execution was approximately 785ms with 371,338 reads. The interleaved execution was 725ms with 349,584 reads. The improvement was small, but it is there.
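If you want to reproduce these duration and read measurements yourself, one simple approach (Extended Events, used later in this chapter, is another) is to bracket the test with SET STATISTICS:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
--Run the two queries from Listing 21-2 here; the duration and
--logical read counts appear on the Messages tab in SSMS
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;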

We can improve this considerably if we restructure the code. I’ll still use a multi-statement function, but I’ll eliminate the internal functions as shown in Listing 21-3.
CREATE OR ALTER FUNCTION dbo.AllSalesInfo
(
    @SalesPersonID INT,
    @ShippingCity VARCHAR(50)
)
RETURNS @return_variable TABLE
(
    SalesPersonID INT,
    ShippingCity NVARCHAR(30),
    OrderDate DATETIME,
    PurchaseOrderNumber dbo.OrderNumber,
    AccountNumber dbo.AccountNumber,
    OrderQty SMALLINT,
    UnitPrice MONEY
)
AS
BEGIN;
    INSERT INTO @return_variable
    (
        SalesPersonID,
        ShippingCity,
        OrderDate,
        PurchaseOrderNumber,
        AccountNumber,
        OrderQty,
        UnitPrice
    )
    SELECT soh.SalesPersonID,
           A.City,
           soh.OrderDate,
           soh.PurchaseOrderNumber,
           soh.AccountNumber,
           sod.OrderQty,
           sod.UnitPrice
    FROM Sales.SalesOrderHeader AS soh
        JOIN Person.ADDRESS AS A
            ON A.AddressID = soh.ShipToAddressID
        JOIN Sales.SalesOrderDetail AS sod
            ON sod.SalesOrderID = soh.SalesOrderID
    WHERE soh.SalesPersonID = @SalesPersonID
          AND A.City = @ShippingCity;
    RETURN;
END;
GO
Listing 21-3

An improved multi-statement function

We can now rewrite our query to execute this function as shown in Listing 21-4.
SELECT asi.OrderDate,
       asi.PurchaseOrderNumber,
       asi.AccountNumber,
       asi.OrderQty,
       asi.UnitPrice,
       sp.SalesQuota
FROM dbo.AllSalesInfo(277, 'Odessa') AS asi
    JOIN Sales.SalesPerson AS sp
        ON asi.SalesPersonID = sp.BusinessEntityID;
Listing 21-4

Using parameters instead of a WHERE clause

By changing the code so that we’re pushing the parameters into the function instead of using a WHERE clause, the interleaved execution can actually get even more accurate estimates. We can see it in the execution plan in Figure 21-4.


Figure 21-4

A new execution plan with better estimates

The estimated number of rows and the actual number of rows perfectly match. Further, the plan has changed from using a Loops Join to a Hash Match since that will be more effective for the amount of data involved.

Performance went down to 7.8ms with 1,120 reads, an enormous improvement. However, it’s worth noting that the non-interleaved performance improved as well, to 8.1ms with 1,414 reads. So we’re still only talking about a 0.3ms improvement from interleaved execution here. As before, faster is faster, and no code changes are required to get the improved performance since interleaved execution is just a part of the engine now.

There is one more thing to keep in mind with interleaved execution, especially when using parameters like this. The plan created is based on the parameters supplied and the data they return, and that plan is stored in cache. To a degree, this can act like parameter sniffing if you have a wildly varying data set. This is why you have the ability to turn off interleaved execution if you need to. You can also turn it off through a query hint.
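If you’d rather control this per query than per database, the documented USE HINT syntax looks like the following sketch, which reuses the query from Listing 21-2:
SELECT csi.OrderDate,
       csi.AccountNumber,
       sp.SalesQuota
FROM dbo.CombinedSalesInfo() AS csi
    JOIN Sales.SalesPerson AS sp
        ON csi.SalesPersonID = sp.BusinessEntityID
WHERE csi.SalesPersonID = 277
      AND csi.ShippingCity = 'Odessa'
--Disable interleaved execution for this statement only
OPTION (USE HINT ('DISABLE_INTERLEAVED_EXECUTION_TVF'));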

Query Processing Feedback

The optimizer makes a number of decisions based on calculations against the statistics and cardinality of the data in the objects involved in a query. Depending on the accuracy of those calculations, performance can be good or bad. Starting with Memory Grants in SQL Server 2017, the optimizer and the query engine can now use feedback mechanisms to adjust a number of behaviors as queries execute in order to get better performance. The following mechanisms use feedback to change query behavior dynamically:
  • Memory grants

  • Cardinality estimates

  • Degree of parallelism

  • Feedback persistence

Memory Grants

When introduced, memory grant feedback worked only for batch mode processing. In SQL Server 2019, row mode memory grant feedback was also introduced. Part of the calculations from the optimizer is the amount of memory that will need to be allocated to execute a query. If not enough memory gets allocated, then the excess data gets written out to disk through tempdb in a process called a spill. If too much memory is allocated, the query is simply using more than it should. Either way, both these scenarios lead to performance issues.

SQL Server can now adjust the memory allocation after analyzing how memory was used during the execution of the query. Adjustments can be made, up or down, to ensure better performance and better memory management.

To observe this in action, we’ll first set up an Extended Events session that lets us capture the memory grant feedback process (Listing 21-5).
CREATE EVENT SESSION MemoryGrant
ON SERVER
    ADD EVENT sqlserver.memory_grant_feedback_loop_disabled
    (WHERE (sqlserver.database_name = N'AdventureWorks')),
    ADD EVENT sqlserver.memory_grant_updated_by_feedback
    (WHERE (sqlserver.database_name = N'AdventureWorks')),
    ADD EVENT sqlserver.sql_batch_completed
    (WHERE (sqlserver.database_name = N'AdventureWorks'))
WITH
(
    TRACK_CAUSALITY = ON
);
Listing 21-5

Creating an Extended Events session for memory grant feedback

A few notes on this session: if a query is suffering from parameter sensitivity, also known as parameter sniffing, memory grant feedback could be all over the map. The engine can recognize this, and the memory_grant_feedback_loop_disabled event will fire for that query. When a query’s memory allocation changes due to feedback, the next event, memory_grant_updated_by_feedback, will return information about the feedback process. Finally, I have the sql_batch_completed event so I can see the query associated with the other events, and I have it all tied together through Causality Tracking (explained in Chapter 3).
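Note that Listing 21-5 only defines the session; it has to be started before it captures anything:
--Start collecting events; use STATE = STOP when you're done
ALTER EVENT SESSION MemoryGrant ON SERVER STATE = START;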

To see this in action, I’m going to run a query against the MakeBigAdventure tables we created earlier in the book (available here: http://dataeducation.com/thinking-big-adventure/) as shown in Listing 21-6.
CREATE OR ALTER PROCEDURE dbo.CostCheck
(@Cost MONEY)
AS
SELECT p.NAME,
       AVG(th.Quantity),
       AVG(th.ActualCost)
FROM dbo.bigTransactionHistory AS th
    JOIN dbo.bigProduct AS p
        ON p.ProductID = th.ProductID
WHERE th.ActualCost = @Cost
GROUP BY p.NAME;
Listing 21-6

Creating the CostCheck procedure

Executing this procedure with a value of 0 for @Cost results in events in Extended Events, as shown in Figure 21-5.


Figure 21-5

Results of memory grant feedback

A few points are illustrated here. First, by looking at the sequence from Causality Tracking, we can tell that the memory grant feedback occurs before the query finishes executing. We can see, right at the bottom, the amount of memory granted was 75,712, but the memory used was 11,392, a large overallocation. Anything where the allocated memory is more than twice the used memory is considered an overallocation.

However, if you execute the procedure a second time with the same value, you’ll find that no memory_grant_updated_by_feedback event occurs. This is because the memory grant has been adjusted and is now stored in cache with the plan.

If you execute the procedure with a different value, say 117.702, which returns only one row (as opposed to over 9,000), the memory grant is not readjusted. This is because the difference is small enough that it doesn’t require a readjustment.
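For reference, the sequence of executions just described looks like this, using the parameter values from the text:
--First execution: fires memory_grant_updated_by_feedback
EXEC dbo.CostCheck @Cost = 0;
GO
--Second execution with the same value: the grant is already adjusted, no event
EXEC dbo.CostCheck @Cost = 0;
GO
--A value returning far fewer rows: the difference is too small to readjust
EXEC dbo.CostCheck @Cost = 117.702;
GO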

We can also see this in the execution plan properties of the first operator, within the MemoryGrantInfo property set, as shown in Figure 21-6.


Figure 21-6

Evidence of memory grant feedback within an execution plan

You can see that the optimizer has adjusted the memory allocation based on feedback, and based on the executions, it considers this to be stable, meaning it won’t be adjusting it more. Other values can be as follows:
  • YesAdjusting: Memory grant feedback has been applied, and it may change in future executions.

  • NoFirstExecution: Memory grant feedback is not generally applied during the first compile and execution.

  • NoAccurateGrant: There has been no spill, and more than half the allocated memory is in use.

  • NoFeedbackDisabled: There has been thrash, meaning memory adjusted up and down multiple times (probably from parameter sniffing or statistics issues), so memory grant feedback has been disabled.

For this behavior to be automatic, you have to set the compatibility level to 150 or greater. You can disable, or enable, memory grant feedback using Listing 21-7.
ALTER DATABASE SCOPED CONFIGURATION SET ROW_MODE_MEMORY_GRANT_FEEDBACK = OFF; --or ON
Listing 21-7

Disabling memory grant feedback
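There are also documented USE HINT options if you would rather disable memory grant feedback for a single query than for the whole database. A sketch using the query from the CostCheck procedure:
SELECT p.Name,
       AVG(th.Quantity) AS AvgQuantity,
       AVG(th.ActualCost) AS AvgCost
FROM dbo.bigTransactionHistory AS th
    JOIN dbo.bigProduct AS p
        ON p.ProductID = th.ProductID
WHERE th.ActualCost = 0
GROUP BY p.Name
--Use DISABLE_BATCH_MODE_MEMORY_GRANT_FEEDBACK for batch mode plans
OPTION (USE HINT ('DISABLE_ROW_MODE_MEMORY_GRANT_FEEDBACK'));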

Memory grant feedback is completely cache dependent. If a plan with feedback is removed from cache, when it is recompiled, it will have to go through the feedback process again. However, it is possible to store this information in Query Store. I’ll discuss that in a separate section on feedback persistence later in the chapter.

This example was a batch mode execution, but row mode processing is exactly the same in both its behavior and how to monitor it using Extended Events.

Finally, prior to SQL Server 2022, memory grant feedback was subject to thrash: if a query needed more memory, it was granted, and then, if the next execution required less, it was cut back again. This could lead to poor performance. So in SQL Server 2022, the memory grant is adjusted as a percentile over time, not as a raw number, which arrives at a better place for performance. This behavior is on by default, and there’s nothing you have to do to enable it. If you choose to disable it, you can alter the database scoped configuration using the MEMORY_GRANT_FEEDBACK_PERCENTILE = OFF command.
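The full statement follows the same pattern as Listing 21-7; the option name here is the one given above, so verify it against the documentation for your build:
--Setting name as described above; confirm against your version's documentation
ALTER DATABASE SCOPED CONFIGURATION SET MEMORY_GRANT_FEEDBACK_PERCENTILE = OFF; --or ON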

Cardinality Estimates

Starting with SQL Server 2022, the cardinality estimates can benefit from a feedback process. The compatibility level for the database has to be 160 or greater, and Query Store must be enabled. The reason for this is that the feedback mechanism uses Query Store hints (described in Chapter 6) to adjust how cardinality is estimated.
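Before testing this, it’s worth confirming both prerequisites are in place. A minimal sketch, assuming the AdventureWorks database used throughout the book:
--Check the current compatibility level
SELECT d.name,
       d.compatibility_level
FROM sys.databases AS d
WHERE d.name = 'AdventureWorks';
--Set the prerequisites if needed
ALTER DATABASE AdventureWorks SET COMPATIBILITY_LEVEL = 160;
ALTER DATABASE AdventureWorks SET QUERY_STORE = ON;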

We can observe the cardinality estimation (CE) feedback in action through this Extended Events session (Listing 21-8).
CREATE EVENT SESSION [CardinalityFeedback]
ON SERVER
    ADD EVENT sqlserver.query_ce_feedback_telemetry(),
    ADD EVENT sqlserver.query_feedback_analysis(),
    ADD EVENT sqlserver.query_feedback_validation(),
    ADD EVENT sqlserver.sql_batch_completed();
Listing 21-8

Observing CE feedback

There are a limited number of scenarios that are likely to lead to a situation where CE feedback can help. The first of these is in dealing with data correlation. In the old cardinality estimation engine (compatibility 70), prior to SQL Server 2014, the assumption was that there is no correlation between columns within a table. The modern cardinality estimation engine (compatibility 120 or greater) assumes a partial correlation between columns. There are, of course, situations where there is direct correlation between the data in columns as well. CE feedback looks at whether the estimates are under or over the actual row counts. It then applies hints through the Query Store to get better row estimates for the plan.

This simple query from Microsoft’s documentation can illustrate the point (Listing 21-9).
SELECT AddressID,
       AddressLine1,
       AddressLine2
FROM Person.ADDRESS
WHERE StateProvinceID = 79
      AND City = N'Redmond';
Listing 21-9

Filtering on two columns in the table

I’ve seen CE feedback fire on the first execution of the query, but it usually fires around the 16th. If you are attempting this on your own, you may have to execute the query more than once; the idea is that if a query is only called a single time, there’s no need to try to adjust its behavior. Ultimately, you’ll see CE feedback events within the Extended Events session, as shown in Figure 21-7.
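The easiest way to get past that execution count is the GO batch separator with a repeat count, which is an SSMS/sqlcmd feature rather than T-SQL:
--Execute the query from Listing 21-9 twenty times in a row
SELECT AddressID,
       AddressLine1,
       AddressLine2
FROM Person.ADDRESS
WHERE StateProvinceID = 79
      AND City = N'Redmond';
GO 20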


Figure 21-7

CE feedback events caused by a query

The first event to fire is query_ce_feedback_telemetry, and the data it returns looks like the one shown in Figure 21-8.


Figure 21-8

query_ce_feedback_telemetry

This shows you the data being used as part of the CE feedback process. The columns cover all three scenarios, not just the first one. I’ve included it for informational purposes; there’s not a lot here that you can use for evaluating your own queries or system behaviors, but it does show you the data that Microsoft is using internally.

The next event is query_feedback_analysis, shown in Figure 21-9.


Figure 21-9

query_feedback_analysis

Here, you get the hint that will be used to adjust the estimates. You also get the plan_id and query_id so you can pull the information straight from the Query Store.

Looking back at Figure 21-7, you’ll next see that the query was executed again. After that, you get query_feedback_validation in Figure 21-10.


Figure 21-10

query_feedback_validation

The important information here is the two hash values. In this case, they’re the same, showing that while this query was evaluated for CE feedback, no hint was applied at this time.

We can also use the catalog view sys.query_store_plan_feedback to see what’s going on with any queries that have been evaluated. Listing 21-10 shows an example query.
SELECT qsqt.query_sql_text,
       CAST(qsp.query_plan AS XML) AS queryplan,
       qspf.feature_id,
       qspf.feature_desc,
       qspf.feedback_data,
       qspf.STATE,
       qspf.state_desc
FROM sys.query_store_plan_feedback AS qspf
    JOIN sys.query_store_plan AS qsp
        ON qsp.plan_id = qspf.plan_id
    JOIN sys.query_store_query AS qsq
        ON qsq.query_id = qsp.query_id
    JOIN sys.query_store_query_text AS qsqt
        ON qsqt.query_text_id = qsq.query_text_id;
Listing 21-10

Retrieving data from sys.query_store_plan_feedback

The results of the query are here in Figure 21-11.


Figure 21-11

Status of CE feedback on a query

The most important information here is the state_desc that lets us know that after evaluation, the plan was regressed back to its original.

The next condition applies to row goals. When a query specifies that a specific number of rows are going to be returned, such as when using TOP, the optimizer scales its estimates down toward the goal count. However, where data is not uniformly distributed, the row goal may become inefficient. In that case, CE feedback can disable the row goal optimization.

Listing 21-11 shows a query that will cause the evaluation of the CE feedback.
SELECT TOP (1500)
       bp.Name,
       bp.ProductNumber,
       bth.Quantity,
       bth.ActualCost
FROM dbo.bigProduct AS bp
    JOIN dbo.bigTransactionHistory AS bth
        ON bth.ProductID = bp.ProductID
WHERE bth.Quantity = 10
      AND bth.ActualCost > 357
ORDER BY bp.Name;
Listing 21-11

Querying the bigProduct table with a TOP operation

Executing this 17 times does get me a query_ce_feedback_telemetry event. I also get an entry in sys.query_store_plan_feedback. However, the result is NO_RECOMMENDATION. The evaluator for the CE decided that even attempting a hint wouldn’t be worth it on this query.
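Although CE feedback declined to apply anything here, the adjustment it evaluates for row goals is one you can always apply manually with a documented hint when you know a TOP query suffers from it. A sketch against the query from Listing 21-11:
SELECT TOP (1500)
       bp.Name,
       bp.ProductNumber,
       bth.Quantity,
       bth.ActualCost
FROM dbo.bigProduct AS bp
    JOIN dbo.bigTransactionHistory AS bth
        ON bth.ProductID = bp.ProductID
WHERE bth.Quantity = 10
      AND bth.ActualCost > 357
ORDER BY bp.Name
--Tells the optimizer to ignore the row goal introduced by TOP
OPTION (USE HINT ('DISABLE_OPTIMIZER_ROWGOAL'));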

The final scenario is related to join predicates. Depending on the CE engine, either 70 or 120 and greater, different assumptions are made regarding the correlation of join predicates. The old behavior, referred to as simple containment, assumed full correlation, similar to compound keys: the filter selectivity is calculated first and then the join selectivity. The modern CE assumes no correlation, known as base containment, where the join selectivity is calculated first and then the WHERE and HAVING clauses are applied.

Listing 21-12 shows this type of CE feedback.
--intentionally using SELECT *
SELECT *
FROM dbo.bigTransactionHistory AS bth
    JOIN dbo.bigProduct AS bp
        ON bp.ProductID = bth.ProductID
WHERE bth.Quantity = 10
      AND bth.ActualCost > 357;
Listing 21-12

Querying the bigProduct and bigTransactionHistory tables

Figure 21-12 shows the output for the query_feedback_analysis event.


Figure 21-12

New feedback hint being evaluated

So there is the new hint, ASSUME_FULL_INDEPENDENCE_FOR_FILTER_ESTIMATES; in short, it tested whether simple containment would work better for this query.

You can disable CE feedback using ALTER DATABASE SCOPED CONFIGURATION SET CE_FEEDBACK = OFF. You can turn it back on the same way. You can also pass a query hint, DISABLE_CE_FEEDBACK.

If you have a hint in the query, you’re using Query Store hints, or you’re forcing a plan, CE feedback won’t occur. You can, however, override the hints using database scoped configuration again, this time setting FORCE_CE_FEEDBACK = ON.
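Pulling those options together, using the setting and hint names just described:
--Turn CE feedback off, or back on, for the whole database
ALTER DATABASE SCOPED CONFIGURATION SET CE_FEEDBACK = OFF; --or ON
GO
--Disable it for a single query
SELECT AddressID
FROM Person.ADDRESS
WHERE StateProvinceID = 79
      AND City = N'Redmond'
OPTION (USE HINT ('DISABLE_CE_FEEDBACK'));
GO
--Let CE feedback override existing hints or forced plans
ALTER DATABASE SCOPED CONFIGURATION SET FORCE_CE_FEEDBACK = ON;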

Degree of Parallelism (DOP) Feedback

Parallel execution of queries can be extremely beneficial for some queries. However, there are other queries that suffer badly when they go parallel. As I discussed in Chapter 2, the best way to deal with getting the right plans to go, or not go, parallel is to use the Cost Threshold for Parallelism. Even then, some queries that exceed that threshold may still suffer from poor performance due to parallelism.

To see DOP feedback in action, Query Store must be enabled, and you must be at compatibility level 160 or higher. With these in place, you have to alter the database as shown in Listing 21-13.
ALTER DATABASE SCOPED CONFIGURATION SET DOP_FEEDBACK = ON;
Listing 21-13

Turning on DOP feedback

In order to see how the DOP feedback occurs, we can use Extended Events to set up a session as shown in Listing 21-14.
CREATE EVENT SESSION [DOPFeedback]
ON SERVER
    ADD EVENT sqlserver.dop_feedback_eligible_query(),
    ADD EVENT sqlserver.dop_feedback_provided(),
    ADD EVENT sqlserver.dop_feedback_reverted(),
    ADD EVENT sqlserver.dop_feedback_validation(),
    ADD EVENT sqlserver.sql_batch_completed();
Listing 21-14

Extended Events session for monitoring DOP feedback

Setting up a demo for DOP feedback is complicated. It requires even more data than we’ve loaded into AdventureWorks so far. Instead of listing all the required code here, making the book artificially larger, I’m going to suggest you reference this example by Bob Ward: https://github.com/microsoft/bobsql/tree/master/demos/sqlserver2022/IQP/dopfeedback.

Running these examples takes a long time, depending on the machine you’re using. Eventually, though, you’ll get the example query to run a few times, at which point our first Extended Event, dop_feedback_eligible_query, will fire with the output shown in Figure 21-13.


Figure 21-13

A query that may benefit from DOP feedback

This won’t immediately cause changes in behavior. Instead, through the use of the query_hash and plan_hash values, the engine is now tracking the behavior of this query. After a larger number of executions, the next event fires, dop_feedback_provided. The details are shown in Figure 21-14.


Figure 21-14

A suggested change to the DOP for this query

There is useful information here to help you track which query, which statement, the plan, and so on. However, the interesting piece is that the query’s DOP has been 8, and the feedback now suggests that 6 would be better.

Figure 21-15 shows the beginning of the testing process with dop_feedback_validation.


Figure 21-15

Validating that the suggested DOP change is working

You get the same basic information, allowing you to track which queries are involved. Sadly, the performance didn’t improve adequately, and several executions later, dop_feedback_reverted fires with the details in Figure 21-16.


Figure 21-16

A failed validation where the DOP reverts back

After five executions, performance didn’t improve, so the optimizer reverted to the last stable DOP, a value of 8. You can see that, on average, the query ran slower during the test: the feedback_avg_cpu_time_sec value of 235 is slower than the last_stable_avg_cpu_time_sec value of 210. The average elapsed time is also slower.

The engine will try this more than once. Once it achieves an actual improvement, it will lock the new DOP into place. The lowest it will go is a DOP value of 2.

Feedback Persistence

The ability to get feedback on memory grants, cardinality estimates, and degree of parallelism and then adjust query behavior based on that feedback is actually pretty wonderful. However, prior to SQL Server 2022, that feedback did not persist, meaning when a query went out of cache, any history of it having benefited from feedback was lost. Now, with Query Store, by default, the feedback is persisted to the Query Store. This means that when a query is removed from cache, for whatever reason, the existing feedback is applied again when the new plan gets compiled. This is a massive win in regard to performance with all the feedback processes.

Only feedback that has been validated is written out permanently. Otherwise, it’s in evaluation and won’t be persisted until it passes.

You can disable this behavior using database scoped configuration settings through MEMORY_GRANT_FEEDBACK_PERSISTENCE = OFF. When you set this to OFF, it’s off for all the feedback mechanisms. They’re all driven through this one setting.

Approximate Query Processing

Getting precise counts in aggregations is vital for some queries; even though that accuracy may be costly, you’ll have to pay for it. However, other queries may not need a perfectly accurate accounting. There are three approximate functions you can use to get faster, if less accurate, data:
  • APPROX_COUNT_DISTINCT

  • APPROX_PERCENTILE_CONT

  • APPROX_PERCENTILE_DISC

Unlike a lot of the other aspects of Intelligent Query Processing, the approximate functions require a change to code.

APPROX_COUNT_DISTINCT

This first function, APPROX_COUNT_DISTINCT, is the easiest to explain and understand. It’s simply taking an approximation of data instead of a precise count. We can best explore this with two example queries, both in Listing 21-15.
SELECT COUNT(DISTINCT bth.TransactionID)
FROM dbo.bigTransactionHistory AS bth
GROUP BY bth.TransactionDate,
         bth.ActualCost;
GO
SELECT APPROX_COUNT_DISTINCT(bth.TransactionID)
FROM dbo.bigTransactionHistory AS bth
GROUP BY bth.TransactionDate,
         bth.ActualCost;
GO
Listing 21-15

Comparing COUNT and APPROX_COUNT_DISTINCT

The performance here is slightly better for the APPROX_COUNT_DISTINCT. The goal for this function is to deal with large data sets, in the millions of rows, and to avoid spills that are common with a COUNT(DISTINCT…) query.

Documentation says that a 2% error rate is likely with a 97% probability. In short, it’s going to be close in places where close works well enough.

APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC

PERCENTILE_CONT is used to calculate a percentile over a continuous distribution. APPROX_PERCENTILE_CONT is a way to get the same percentile with better performance but less overall accuracy. PERCENTILE_DISC is used to return a specific value across a distribution based on the input, and APPROX_PERCENTILE_DISC is once more an approximation to help with performance. Listing 21-16 shows two examples that look for the median value across a data set.
SELECT DISTINCT
       bp.NAME,
       PERCENTILE_CONT(0.5)WITHIN GROUP(ORDER BY bth.ActualCost) OVER (PARTITION BY bp.NAME) AS MedianCont,
       PERCENTILE_DISC(0.5)WITHIN GROUP(ORDER BY bth.ActualCost) OVER (PARTITION BY bp.NAME) AS MedianDisc
FROM dbo.bigTransactionHistory AS bth
    JOIN dbo.bigProduct AS bp
        ON bp.ProductID = bth.ProductID
WHERE bth.Quantity > 75
ORDER BY bp.Name;
GO
SELECT bp.NAME,
       APPROX_PERCENTILE_CONT(0.5)WITHIN GROUP(ORDER BY bth.ActualCost) AS MedianCont,
       APPROX_PERCENTILE_DISC(0.5)WITHIN GROUP(ORDER BY bth.ActualCost) AS MedianDisc
FROM dbo.bigTransactionHistory AS bth
    JOIN dbo.bigProduct AS bp
        ON bp.ProductID = bth.ProductID
WHERE bth.Quantity > 75
GROUP BY bp.NAME
ORDER BY bp.Name;
GO
Listing 21-16

Comparing the functions and the approximate functions

The most important aspect of the APPROX* functions is that you are not guaranteed the same result set from one execution to another. They both use a random sampling process that means they won’t be working from the same data set every time.

The difference in execution times here is pretty severe. The first query runs in about 43 seconds; the second runs in 3.2 seconds. Reads for the first query are 303,481. The second query has a varying number of reads because it randomly samples the data; however, it was generally about half, around 150,000 reads.

Interestingly, looking at the data, the approximate functions were fairly close as you can see in Figure 21-17.


Figure 21-17

Accuracy from the approximate functions

This is consistent with Microsoft’s claim that these functions will be accurate to within a 1.33% error boundary. At least based on the values in the figure, they’re doing well.

Table Variable Deferred Compilation

The strength of table variables is the fact that they do not have statistics. This makes them a great choice in scenarios where statistics are not needed and would be painful due to maintenance overhead and recompiles. However, that lack of statistics is also their one weakness: the optimizer simply assumes how many rows will be returned, namely one (1) row. With deferred compilation, the plan involving a table variable isn’t completed until actual row counts are available to provide better choices to the optimizer, similar to how temporary tables work, but without the added overhead of recompiles.

To take advantage of deferred compilation, the compatibility level has to be 150 or greater. You can also disable deferred compilation through database scoped configuration with DEFERRED_COMPILATION_TV = OFF; that setting must be ON for the behavior to occur.

To see this in action, I’m going to run the same query twice in Listing 21-17.
--Disable deferred compilation to see the old behavior
ALTER DATABASE SCOPED CONFIGURATION SET DEFERRED_COMPILATION_TV = OFF;
GO
DECLARE @HeaderInfo TABLE
(
    SalesOrderID INT,
    SalesOrderNumber NVARCHAR(25)
);
INSERT @HeaderInfo
(
    SalesOrderID,
    SalesOrderNumber
)
SELECT soh.SalesOrderID,
       soh.SalesOrderNumber
FROM Sales.SalesOrderHeader AS soh
WHERE soh.DueDate > '6/1/2014';
SELECT hi.SalesOrderNumber,
       sod.LineTotal
FROM @HeaderInfo AS hi
    JOIN Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = hi.SalesOrderID;
GO
--Enabled deferred compilation
ALTER DATABASE SCOPED CONFIGURATION SET DEFERRED_COMPILATION_TV = ON;
GO
DECLARE @HeaderInfo TABLE
(
    SalesOrderID INT,
    SalesOrderNumber NVARCHAR(25)
);
INSERT @HeaderInfo
(
    SalesOrderID,
    SalesOrderNumber
)
SELECT soh.SalesOrderID,
       soh.SalesOrderNumber
FROM Sales.SalesOrderHeader AS soh
WHERE soh.DueDate > '6/1/2014';
SELECT hi.SalesOrderNumber,
       sod.LineTotal
FROM @HeaderInfo AS hi
    JOIN Sales.SalesOrderDetail AS sod
        ON sod.SalesOrderID = hi.SalesOrderID;
Listing 21-17

Executing without and with deferred compilation

The SELECT statement from the first query results in the execution plan in Figure 21-18.


Figure 21-18

Row counts reflecting the estimate of one row

If you look at the Table Scan, you can see how there were 1,711 rows returned after an estimate of one. Figure 21-19 shows the plan from the second query.


Figure 21-19

Deferred compilation results in more accurate row counts

While the plan shape in this example didn’t change, you can see that the plan is now based on accurate row counts. The Table Scan now shows an estimated value of 1,711 as well as the actual identical count.

Deferred compilation doesn’t increase recompile frequency. You may not always see improvements in performance. In fact, if you have a wildly varying row count for your table variables, you may see no benefit from deferred compilation at all.

Scalar User-Defined Function Inlining

User-defined functions can often be a source of problems. I talked earlier about multi-statement table-valued user-defined functions (UDFs) and their inherent problems (partially addressed through interleaved execution). While inline table-valued functions can perform just fine, they can also be abused. Scalar UDFs are likewise something that can be a fine tool or can be abused.

The issue with scalar functions is that, by their nature, they have to be applied to each row in the query. Some scalar functions, such as simple arithmetic, formatting, and the like, will perform perfectly fine. Others, such as ones that independently access data, can perform quite poorly.

The poor performance comes from multiple sources. As already mentioned, each row will get an execution of the function in an iterative fashion. Since scalar functions aren’t relational, they are not properly costed by the optimizer, which can lead to poor choices. Also, each UDF statement is compiled independently of the rest of the query, potentially missing out on optimizations. Finally, UDFs under the old system are executed serially, with no parallelism, further degrading performance potential.

Starting in SQL Server 2019 (which means a compatibility level of 150 or greater), scalar UDFs are automatically transformed into scalar subqueries. Because they are subqueries, they can be optimized with the rest of the query. This means most of the standard issues with scalar functions are eliminated.

Not all scalar functions can be transformed to inline. In fact, there is a lengthy set of exceptions outlined in the Microsoft documentation here: https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/scalar-udf-inlining?view=sql-server-ver16.
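Separate from those automatic exclusions, you can also opt an individual function out of (or into) inlining with the INLINE clause on the function definition rather than changing the database-wide setting. A minimal sketch using a hypothetical function, dbo.ufnCostWithTax:
--A hypothetical, trivial function purely to show the syntax;
--WITH INLINE = OFF opts this one function out of inlining
CREATE OR ALTER FUNCTION dbo.ufnCostWithTax
(
    @Cost MONEY
)
RETURNS MONEY
WITH INLINE = OFF
AS
BEGIN
    RETURN @Cost * 1.08;
END;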

Let’s take an example from AdventureWorks shown in Listing 21-18.
CREATE OR ALTER FUNCTION dbo.ufnGetProductStandardCost
(
    @ProductID int,
    @OrderDate datetime
)
RETURNS money
AS
-- Returns the standard cost for the product on a specific date.
BEGIN
    DECLARE @StandardCost money;
    SELECT @StandardCost = pch.StandardCost
    FROM Production.Product p
        INNER JOIN Production.ProductCostHistory pch
            ON p.ProductID = pch.ProductID
               AND p.ProductID = @ProductID
               AND @OrderDate
               BETWEEN pch.StartDate AND COALESCE(pch.EndDate, CONVERT(datetime, '99991231', 112)); -- Make sure we get all the prices!
    RETURN @StandardCost;
END;
Listing 21-18

Scalar UDF dbo.ufnGetProductStandardCost

Since this is already created as a part of the AdventureWorks database, we can validate whether or not the function can be executed inline by running this query in Listing 21-19.
SELECT sm.is_inlineable
FROM sys.sql_modules AS sm
    JOIN sys.objects AS o
        ON o.OBJECT_ID = sm.OBJECT_ID
WHERE o.NAME = 'ufnGetProductStandardCost';
Listing 21-19

Validating whether a function can be executed inline

When a value of 1 is returned from sys.sql_modules, it means the scalar function can be executed inline.

To see it in action, we’ll run the same query twice as shown in Listing 21-20.
--Disable scalar inline
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = OFF;
GO
--not inline
SELECT sod.LineTotal,
       dbo.ufnGetProductStandardCost(sod.ProductID, soh.OrderDate)
FROM Sales.SalesOrderDetail AS sod
    JOIN Production.Product AS p
        ON p.ProductID = sod.ProductID
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.SalesOrderID = sod.SalesOrderID
WHERE sod.LineTotal > 1000;
GO
--Enable scalar inline
ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON;
GO
--inline
SELECT sod.LineTotal,
       dbo.ufnGetProductStandardCost(sod.ProductID, soh.OrderDate)
FROM Sales.SalesOrderDetail AS sod
    JOIN Production.Product AS p
        ON p.ProductID = sod.ProductID
    JOIN Sales.SalesOrderHeader AS soh
        ON soh.SalesOrderID = sod.SalesOrderID
WHERE sod.LineTotal > 1000;
GO
Listing 21-20

Executing the scalar UDF within a query

Capturing the performance metrics results in the following:
Not inline: 897ms duration, 130,821 reads
Inline: 145ms duration, 130,943 reads
While there are slightly more reads with the inline scalar function, you can clearly see that performance was radically improved. It’s easy to see why when we take a look at the execution plans (Figure 21-20).


Figure 21-20

Inline scalar function vs. not inline

Don’t worry about trying to read the details of the plan. There are a few simple facts that are immediately apparent. First, the plan for the UDF executed without being inline has a missing index suggestion, which is absent from the other plan. This implies that the second plan’s optimization is probably better since all the indexes it needed were available to it.

Next, you can see that the second execution went parallel. This probably also helped to enhance performance.

Finally, and you’d have to break the plan down in detail to see it, the point where the data from the UDF queries is produced has moved. In the original plan, the queries necessary to satisfy the UDF run at the very end of execution, whereas they run earlier throughout the second plan. Also, there is no Compute Scalar operator for the UDF in the second plan.

Obviously, since the plan has changed, things like hints, the query and plan hash, and dynamic data masking may all work differently. You may also see warnings that were previously masked inside the plan for the scalar function (by the way, that plan can be seen by looking at the Estimated Plan for a query using a scalar function).

While some queries may not benefit from this functionality, clearly, others will.

Summary

Intelligent Query Processing covers a lot of ground, from automated feedback, to changes in the query itself, to code changes that can better support performance. Don’t forget that as excellent as this functionality is, and it is excellent, it won’t fix bad code or incorrect structures. Standard query tuning is still going to be a part of the process. However, where you have common issues, there may be some solutions that simply occur automatically, thanks to Intelligent Query Processing.

Speaking of automated fixes, the next chapter will cover automated tuning in SQL Server and Azure SQL Database.
