© Tracy Boggiano and Grant Fritchey 2019
T. Boggiano, G. FritcheyQuery Store for SQL Server 2019https://doi.org/10.1007/978-1-4842-5004-4_9

9. Troubleshooting Issues with Query Store

Tracy Boggiano1  and Grant Fritchey2
(1)
Cary, NC, USA
(2)
Grafton, MA, USA
 
Like any other aspect of SQL Server, most of the time, you can simply turn Query Store on and then not worry about it. However, like any other aspect of SQL Server, things can go wrong. There are two tools that you can use to understand what is happening with Query Store on your database:
  • Query Store specific wait statistics

  • Extended Events for Query Store

This chapter will look at some common problems that can arise with Query Store and how you can use the tools provided in order to ensure that Query Store is working well on your systems.

Query Store Waits

By design, as we described in the first couple of chapters, Query Store is designed to be as unobtrusive as possible. In physics, the observer effect says that observation of a process or object can affect the behavior of that process or object. Similarly in SQL Server, however lightweight the data collection, the act of collecting information is adding some additional load to your SQL Server instance. The number of possible factors that can lead to issues with the Query Store is huge. It is the same list as any other process within SQL Server: the amount of memory and CPU you have available; the number, size, and volume of the transactions in your system; the number and speed of your disks and disk controllers, and, frankly, the code running in your T-SQL statements. Any or all of these and more can change the behavior of Query Store.

The first way we can understand Query Store behavior is to use the wait statistics. Generally speaking, wait statistics are always the best way to gain a general understanding of the behavior of any given system. If you know what the system is waiting on, you know where your bottlenecks are. There are two ways to look at waits in the Query Store, through the traditional dynamic management view, sy.dm_os_wait_stats, or, on Azure SQL Database, sys.dm_db_wait_stats. You can also use Extended Events wait_completed to see waits, but that’s a very granular approach and won’t generally be needed. We’ll focus only on the DMV.

When talking about querying the wait statistics within SQL Server, I recommend you start with Paul Randal’s scripts. The import of these scripts is that they eliminate the noise, the wait statistics that never mean anything. You can access those scripts here: https://bit.ly/2wsQHQE .

Query Store wait statistics have a common naming convention. They always start with “qds_”. To see just the wait statistics coming from Query Store, you would run this query in Listing 9-1:
SELECT dows.wait_type,
       dows.waiting_tasks_count,
       dows.wait_time_ms,
       dows.max_wait_time_ms,
       dows.signal_wait_time_ms
FROM sys.dm_os_wait_stats AS dows
WHERE dows.wait_type LIKE 'qds_%';
Listing 9-1

Wait Statistics coming from Query Store

The results would look something like Figure 9-1:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig1_HTML.jpg
Figure 9-1

All the Query Store wait statistics

Of course, your system will have different values in each of the columns, but you should see a similar list of waits. You would not normally look at the Query Store wait statistics in isolation. Instead, you would query the wait statistics and look for the top ones on your system. When those top waits are Query Store, you’re likely experiencing some sort of issue.

However, there are just a few Query Store waits that you can safely ignore. Paul Randal also maintains a library of wait statistics (accessible here: https://bit.ly/2ePzYO2 ). Based on his documentation, there are three waits, currently, that you can safely ignore when it comes to Query Store:
qds_async_queue
qds_cleanup_stale_queries_task_main_loop_sleep
qds_shutdown_queue

You would not simply query for the Query Store waits. Instead, as part of your normal monitoring of wait statistics on your system, you would look for the waits above because they would indicate Query Store itself, its operations, is leading to issues on your system. If you see any of the “qds_∗” waits in your top 10, you would want to understand what that wait statistics is and then drill down on why it’s causing problems for you in your system. The way we drill down when monitoring Query Store is to use Extended Events.

Extended Events and Query Store

Extended Events are the best method to do a detailed analysis of the behavior of Query Store in your systems. Extended Events are a fairly dense topic that we’re not going to try to explain in detail in this book. To get started with Extended Events, I suggest first the Microsoft documentation here: https://bit.ly/2LfWMoj . Once you’re comfortable with how Extended Events works, you can more easily explore the capabilities of monitoring Query Store using Extended Events.

At this writing, there are currently 92 events within the qds, Query Data Store, package defined within SQL Server 2019. You can easily list them if you want to run a query as follows in Listing 9-2:
SELECT dxo.name,
       dxo.description
FROM sys.dm_xe_packages AS dxp
    JOIN sys.dm_xe_objects AS dxo
        ON dxp.guid = dxo.package_guid
WHERE dxp.name = 'qds'
AND dxo.object_type = 'event';
Listing 9-2

List Query Data Store Extended Events

Another way to access the information is through the Extended Events GUI within SQL Server Management Studio as shown in Figure 9-2:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig2_HTML.jpg
Figure 9-2

Selecting only the Query Store events in Extended Events

The only issue here is that not all these events are active. Many of them, maybe even most of them, are only accessible through special flags set by Microsoft. So, while you can see them, even add them to a session, no activity will ever be recorded by them.

As is normal with every other event in Extended Events, the Query Store events will have a name, a description, and a defined set of columns associated with the event. So, if we were to monitor the behavior of Query Store using Extended Events, one event we might be interested in is the query_store_size_retention_cleanup_finished event. Selecting that event, we can then see the information that Query Store itself is gathering as you can see in Figure 9-3:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig3_HTML.jpg
Figure 9-3

The information captured by query_store_execution_runtime_info

The information in this event represents the data that the Query Store is capturing after a query completes its cleanup process. So, with this event, you could observe and monitor that part of the behavior of the Query Store. We’re going to look at a couple of different ways you can monitor Query Store behavior and some very particular events that aid in that regard.

Tracking Query Store Behaviors

A number of processes that make up the Query Store can be observed by capturing Extended Events. We’ve already see an event that fires when Query Store is cleansing itself for size. You can also watch a matching event, query_store_size_retention_cleanup_started to see when this event begins. For general observation of Query Store behavior, Listing 9-3 contains a set of events that, as of this writing, I know will fire to produce information about the standard behavior of Query Store:
CREATE EVENT SESSION QueryStoreBehavior
ON SERVER
    ADD EVENT qds.query_store_background_task_persist_finished,
    ADD EVENT qds.query_store_background_task_persist_started,
    ADD EVENT qds.query_store_capture_policy_evaluate,
    ADD EVENT qds.query_store_capture_policy_start_capture,
    ADD EVENT qds.query_store_database_out_of_disk_space,
    ADD EVENT qds.query_store_db_cleared,
    ADD EVENT qds.query_store_db_diagnostics,
    ADD EVENT qds.query_store_db_settings_changed,
    ADD EVENT qds.query_store_plan_removal,
    ADD EVENT qds.query_store_size_retention_cleanup_finished,
    ADD EVENT qds.query_store_size_retention_cleanup_started
    ADD TARGET package0.event_file
    (SET filename = N'C:ExEventsQueryStorePlanForcing.xel', max_rollover_files = (3))
WITH (TRACK_CAUSALITY = ON);
Listing 9-3

Extended Event Session producing information on the standard behavior of Query Store

These events capture some of the behaviors of the Query Store. If you create this session, start it, and watch the Live Data window in SSMS, we can actually see some of the events fire if we manipulate queries and the Query Store itself. Let’s fire a few commands to see this in action. To start with, we’ll clear the Query Store with the code in Listing 9-4:
ALTER DATABASE AdventureWorks SET QUERY_STORE CLEAR;
Listing 9-4

Clear Query Store

This will fire the event, query_store_db_cleared, as we see in Figure 9-4:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig4_HTML.jpg
Figure 9-4

query_store_db_cleared event captured for a given database

The first four columns are for managing the Extended Events causality tracking, showing the distinct grouping of events and the order in which they occur. We can ignore that in this instance. What we see are the clear_all setting, which is defined in the QUERY_STORE CLEAR command, which can be set to clear ALL.

To see the next event, we’ll run the following stored procedure in Listing 9-5 from the previous chapter:
EXEC dbo.AddressByCity @City = N'London';
Listing 9-5

Call stored procedure

We’ll get the following two events, related and in the order in which they are firing as shown in Figure 9-5:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig5_HTML.jpg
Figure 9-5

Query Store evaluates a query and then captures it

I’m showing the details of the query_store_capture_policy_evaluate event. You can see that the capture_policy_result is showing the result of the evaluation, in this case, CAPTURE. You’ll also frequently see UNDECIDED for queries that have been evaluated but are not going to be captured.

You may see an intermittent event showing how Query Store self-evaluates on a regular basis as shown in Figure 9-6:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig6_HTML.jpg
Figure 9-6

Settings and diagnostics of Query Store

You can see the current settings for Query Store, such as the plans per query limit, currently set at 200. You also get to see the plans and plan use being managed by Query Store. These numbers will obviously change over time, showing the behavior of Query Store over time.

Let’s take a look at one more of the events in action. Run the following code in Listing 9-6 to remove a plan from the plan cache:
DECLARE @PlanID INT;
SELECT TOP 1
        @PlanID = qsp.plan_id
FROM    sys.query_store_query AS qsq
JOIN    sys.query_store_plan AS qsp
        ON qsp.query_id = qsq.query_id
WHERE   qsq.object_id = OBJECT_ID('dbo.AddressByCity');
EXEC sys.sp_query_store_remove_plan @plan_id = @PlanID;
Listing 9-6

Remove a plan from cache

We can then see this event as shown in Figure 9-7:
../images/473933_1_En_9_Chapter/473933_1_En_9_Fig7_HTML.jpg
Figure 9-7

Plan removed from Query Store

You can see the database, the fact that plan was successfully removed, and the plan being removed.

All these events give you a good idea of what’s happening within Query Store, allowing you to see how Query Store behaves.

Conclusion

While you really should just see normal behavior from Query Store the vast majority of the time, it is possible for things to go wrong. Knowing how to monitor Query Store itself ensures that you have a higher degree of confidence that Query Store is supporting and helping you, not hurting you. Wait statistics are going to be the principal mechanism of assuring good Query Store behavior. You can drill down on the details of that behavior using Extended Events to see exactly how the system is working. These processes should enable you to protect your systems and better understand the behavior of Query Store.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.149.240.185