© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
G. FritcheySQL Server 2022 Query Performance Tuninghttps://doi.org/10.1007/978-1-4842-8891-7_1

1. Query Performance Tuning

Grant Fritchey1  
(1)
Grafton, MA, USA
 

Query tuning is not easy.

Yes, the goal of this book is to give you as many tools and as much knowledge as possible to make query tuning easier. However, it still won’t be easy. In fact, the one easy way to make things run faster is to simply spend more money. It’s a lot easier to buy bigger hardware and more of it. It’s much easier to move to a higher service tier with your cloud provider. These solutions are the easy way to get performance to increase. Eventually though, there’s a limit to how much you’re going to be able to spend in order to fix your performance problems. Then, the query performance tuning work starts.

You may be working in an environment that is purely run traditionally on large servers maintained in your building. On the other hand, you could be running your databases as a service through Google Cloud SQL, Amazon RDS, or Azure SQL Database. It’s also possible that you are running through a virtualized environment on containers managed by Kubernetes or on more traditional virtual machines (VM). You might even be running SQL Server on the Edge, inside a bunch of little Raspberry Pi machines. In any of these environments where the SQL Server engine is running, when it comes time to deal with poor performance, the information provided in this book is going to help. In fact, when it comes to making many of these environments run faster, your only hope may be to know how to pick the best indexes, write code that performs well, and generally understand the tools that are going to help you wring more performance out of a limited resource.

What everyone wants right here is a simple formula: If you see “A,” do “Z.” Sadly, I’m not going to give you a simple formula. Instead, we’re going to focus on two primary things throughout this book: understanding of how things work within the SQL Server engine and a methodology for identifying what exactly has gone wrong. Here are a few of the topics we are going to cover to help you get to that understanding and methodology:
  • Use of Extended Events to collect information about query behavior, above and beyond anything you’ve been able to see before

  • Mechanisms for identifying and dealing with blocking within your queries

  • Understand what an execution plan is and how to go about pulling information from it to help you tune queries

  • Learn how statistics work and how you can actually use them to assist performance

  • Use modern indexing techniques such as the Columnstore to make queries fly

  • Understand the causes of recompilation and how to deal with it when it becomes a problem

  • Take advantage of the Query Store to both identify poorly performing queries and fix those queries

  • Develop knowledge of how the optimizer can automatically help you with some query performance

  • Learn how to implement a query tuning methodology that helps you make your queries run faster

While I would love to dive straight into any one of these topics, we’re first going to talk about the fundamentals behind query performance tuning. While you can turn to any chapter in the book or hit the index to look up a topic that you think may help immediately, trying to tune queries without core concepts on how to go through the process and where to focus your efforts, you may not find the book as useful. Instead, we’re going to spend this chapter talking through a few key points to set the general tone for the rest of the chapters:
  • The query performance tuning process

  • Understanding what defines “good enough”

  • Establishing comparison points

  • What are the most likely performance issues?

With the understanding of the generally accepted approach defined here in this chapter, we can then use that approach through all the rest of the chapters of the book as a framework around which we build both our understanding and our abilities in performance tuning.

There have been previous editions of this book. This edition represents a complete teardown and rebuilding of all that has gone before. Many of the topics will be the same because most of the fundamentals haven’t changed. Some of the examples may even be the same because they illustrate the point well. However, anyone looking at this from a prior edition is probably already aware that we’re not just looking at a few edits and one or two inserts. This is a new book.

At this point in previous editions, I spent a little time talking about what we won’t be covering. Instead, I just want to focus on what is included in the book. For topics other than query tuning, look to other books.

Some of the examples in this book will be written from scratch, using generated tables, provided with those examples. However, many of the examples will be using the AdventureWorks database. I know that example database is actually quite old at this point in time. However, all the errors in data types and bad data distribution make it an excellent resource for showing how to identify and fix poorly performing queries, which is why I use it throughout the book. The AdventureWorks database does change a little over time, so you can’t guarantee that your behavior will match mine. Go here to download a copy: https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=ssms.

Let’s move on and discuss the query performance tuning process.

The Query Performance Tuning Process

To begin tuning queries, you first have to identify which particular query needs your attention. You could rely on people telling you that they’re experiencing pain while using your database, but a better approach is to be proactive. This means you need to first have a method for monitoring query performance. From there, you can identify the queries that are causing the most problems. With the query identified, you’ll have to troubleshoot the potential causes of the slow performance. You can apply various mechanisms to resolve these issues. Then you have to quantify the performance after your changes. That’s the complete process. What you’ll find is that you repeat this process over and over.

Since no one thing will always fix performance problems, you need to be ready to experiment with various changes and solutions, measure their effects, and then decide which of them is helping the most. You will, over time, identify common issues with your code and structures, making things easier to address. You are also going to run into completely unique problems.

Performance Issues

Performance tuning is not just about queries and indexes. Your hardware, operating system, cloud provider, network, and other applications can all negatively impact your servers. We’re going to be focusing on query tuning and the query tuning process throughout this book, but you do need to know about some of the other issues that could impact your performance. Possibilities could include things like the following:
  • An inadequate performance tier on your cloud provider.

  • Applications other than SQL Server consuming resources on your server.

  • Configuration of the SQL Server service.

  • The capacity of the hardware, or the virtual machine, on which your server is running.

  • Network hardware or configuration issues.

  • Application code or reports running against the server may themselves be misconfigured or not optimized.

  • The database design itself may not be optimized.

  • You may be facing a situation where the relational data structure is the wrong choice.

Any one of these issues can lead to poor performance that focusing on the queries just won’t help. While we’re not going to be drilling down into these types of issues in the rest of the book, we can have a short discussion about how to deal with these issues here.

While the cloud is an excellent place to host databases and performance can be simply excellent, you have to make choices when configuring your systems, and those choices can be wrong. In order to deal with this issue, you will have to take advantage of the tools offered by the cloud provider. Whether we’re talking about AWS or Azure, there are tools for measuring performance that are unique to each provider. You’ll need to identify these tools and use them to understand if you’re simply running out of resources on the service tier on which your database is operating. Conversely, you’ll also want to use those tools to ensure that you’re not using too high a service tier. It’s not only possible to hurt performance, but you can hurt your company’s bottom line by paying too much and using too little. Take some to learn how your cloud service provider supplies mechanisms for monitoring performance so that you can achieve a “just right” solution.

In this age of virtualized operating systems and containers, you may be in a situation where the only thing running within your VM or container is SQL Server. However, you may find that there are other virtual machines or other containers using up the resources. You may also find that another application is installed on your server, virtual or not, and it’s using up all the resources. When SQL Server is waiting on access to resources, so are all the queries running on your system. This is why identifying what else may be running with your SQL Server instance is so important.

We will talk about a few server settings that are very directly related to query performance throughout the book. However, there are a number of configurations you could mess up on the SQL Server instance that can indirectly hurt performance. For example, misconfiguring memory can easily starve queries of the resources that they need. Learn how to use the system stored procedure sp_configure to take direct control of the system configurations. You can also query the configuration settings through the system table sys.configurations in the master database. SQL Server Management Studio has mechanisms for controlling SQL Server configurations. For the most part, in the majority of situations, the default settings within SQL Server should be adequate for good performance of the system. I will make a few suggestions on changes you may want to introduce as we go through the book. Following Microsoft best practices on most settings is the right way to go.

Ensuring that your servers are not overwhelmed while managing your data is very important. While query tuning is an important aspect to performance tuning in general, you first need to understand how your servers are working. Whether we’re talking about a traditional install of SQL Server, a virtual machine (VM), or a Docker container, you’ll want to learn how to capture performance metrics on each of these. You’ll also need to plan for capturing vendor-specific metrics. For example, while both VMware and Hyper-V are hypervisors letting you virtualize your SQL Server instances, the underlying metrics defining behavior for each are very different. The same thing goes with VMs in the cloud. Focus on collecting metrics on waits and queues, especially around disk I/O, memory, and CPU as these are the resources most likely to be stressed as your data loads grow.

In addition to managing your hardware and virtual machines, you’ll need to worry about your network. This is just about your Internet download or upload speeds. It’s also about the health of the routers, cables, and Wi-Fi repeaters you may have in use in your organization.

After talking about hardware, we have to mention that software can also be a problem, specifically when custom-built applications are not optimized in how they deal with the database. You could experience almost any issue from incorrect drivers causing connection timeouts to code that doesn’t properly commit transactions that it opens, holding resources on your servers and negatively impacting performance all over the place. In addition to understanding the queries running on your system, don’t forget to develop an understanding of the applications that are calling those queries. Also, how those applications are developed matters. Object Relational Mapping (ORM) software can be a huge boon to the speed of development. However, improperly configured, an ORM tool can introduce nightmare queries to your system. Work with your development teams to ensure they follow best practices in coding the applications.

It’s also not uncommon for databases to simply be built poorly for the needs of the data and the application. Improper or incomplete data normalization can cause poor performance. Again, ORM tools can build databases that look like objects, which perform quite poorly, even as they make development easier. Take the time to evaluate your data structures to ensure you’re designing the databases well, that data types are correct, and that your structures support the workload you’re placing on them.

Finally, while SQL Server is an absolutely wonderful tool, it’s not applicable to every situation. You may find that you’re collecting data through massive Internet of Things (IoT) systems where unstructured or semi-structured information is being collected faster than SQL Server can process it. Sometimes, the right choice isn’t to attempt to tune a query, a database, or a server. Instead, it’s to change to a more appropriate data storage mechanism. Understand the business and technical needs of the data you’re attempting to build a system to manage. You may be better served with an ID/Value database, a search engine, or some other type of nonrelational data store.

If all this sounds like a lot of work, it is. The best place for this work to occur is as you design and build your applications and databases. It’s much harder to make the structural and code modifications after the system is already in production. While that may be the ideal, you’re regularly going to be dealing with existing systems and attempting to fix them. Just remember that the best thing you can do is understand how your systems are behaving in order to identify the true bottlenecks. With the bottleneck identified, you can begin the process of addressing it.

A Repetitive Process

Performance tuning is something that you will repeat on any given system. On some systems, you’re likely to repeat the process quite a lot as the data changes, your workload grows, and new applications are introduced. The core of the process is simple enough:
  • Identify the bottleneck.

  • Fix it.

  • Validate the fix.

  • Measure the impact and current performance.

  • Start again with the next bottleneck.

One of the best habits you can develop is to change only one thing at a time. Any change to structure or code can have impact beyond the immediate code you’re working with currently. If you can limit your changes to one at a time, your ability to measure and understand the impact of those changes is enhanced.

You may decide that adding an index will radically improve performance on a query, but it could cause other queries to run slower. The slowdown of other queries could be because they’re now using that index and it’s just not as good for them, or because they are data modification queries and now another index must be modified as data gets added, changed, or removed from the system. Chapters 7 and 8 will help you deal with indexes. These types of issues are why it’s very important to have a test or development system on which you can perform your code changes without negatively affecting the production system. Chapter 22 covers information about setting up automated database performance testing.

Tuning a single query will seldom fix all the performance issues on a system. Initially, you’re likely to identify common problems across a large set of your queries. Addressing all these issues will help to clear a large percentage of your performance problems. At which point, before proceeding with simply trying to tune every query, you should establish best practices for query performance. One system I worked on in the past had a requirement that every query had to meet a three-second minimum operation. Happily, most of the queries were well below this. A few exceptions were allowed above it. Having set the minimum allowed us to focus our work on the worst offenders. This example is not a rule you should necessarily follow. It’s just meant to convey how you can establish a minimum that’s easy enough to clear. Once every query is running this fast, your system may need more help, and a new minimum can be established.

All of this is just meant to emphasize the repetitive nature of query tuning. As the number of people using your system grows, you may see new needs for performance tuning. When the data stored within the database expands, additional query degradation can occur. As the data simply changes over time, you can see similar changes to the behavior of the queries. All of this will lead you to need to do additional query performance tuning. Figure 1-1 shows the general nature of the process over time using decision points based on your performance metrics and the monitoring of your system.

A flowchart begins with setting a performance target for the application and analysis of it, identifying bottlenecks, ensuring proper configuration of various components, identifying the costliest query associated with the bottleneck, and optimizing it.

Figure 1-1

The core performance tuning process

The good news is for a lot of systems, all you really need to do is ensure you have a good system configuration and appropriate database settings. Then add some mechanism for capturing performance over time (covered in Chapter 3). Then, you’ll need to go through the actual query tuning process as outlined in Figure 1-2.

A flowchart begins with baseline performance and resource use of the costliest query, set performance target for the query, analysis, and optimization of factors, query for common problems, and opportunity for query tuning and index changes.

Figure 1-2

The core query tuning process

While this process may seem complex, with practice, you’ll find you can move through the steps and decision points rather quickly. The key is to take the time to understand the code, examine the execution plans if necessary, and then carefully assess your opportunities for improving performance. Testing the changes that they made to ensure actual improvements from your changes is also vital. Adopting a structured approach to your query tuning will ensure a higher degree of positive results instead of just poking at things.

Understanding What Defines “Good Enough”

The query optimizer uses a “good enough” method of optimization (we’re going to cover the optimizer in Chapter 2). Instead of trying to get a perfect level of optimization for every query, it instead attempts to get enough optimization, without too much optimization. This is going to be the best approach for your query tuning. You simply won’t be able to tune every query to the last possible millisecond of performance. There are simply too many databases with too many queries for any of us to do that. Instead, as I said in the last section, you’re going to need to define what looks like “good enough” for your systems. There is no single right answer to this question. A query that runs in ten milliseconds is great, until that query gets called thousands of times in a minute. Then, shaving a millisecond or two from the query could be a lifesaver for your system. Yet another query runs in about 30 milliseconds, and it supports the system nicely. If it’s only called a few times an hour, shaving a millisecond of performance off that query is absolutely not worth your time.

The key is time. Query tuning is a time-intensive undertaking. After you set up mechanisms for query monitoring (covered in Chapter 3 in detail, as well as several other chapters), you need to have some time to consume the data collected. Then, after you identify the query you intend to tune, you’re going to spend quite a bit of time reading through the query in order to understand what it’s doing, how, and why. Then, further time will be spent looking at the execution plan for the query (covered in Chapter 6) to understand if there are tuning opportunities. Finally, you’re going to have to spend time testing solutions before you implement them in your production server.

Here is why you’re not going to waste time getting an extra millisecond of performance out of every query. Only the ones that truly matter to your system are where you’re going to spend time.

Establishing Comparison Points

In order to achieve the goals you set for your system and arrive at your “good enough” query tuning, you have to establish comparison points. Sometimes, people talk about establishing a baseline of behavior of your queries. However, you really don’t have to have a perfect set of measures and metrics in order to identify poorly performing queries. All you need is the ability to compare before and after you tune your query, using any method you can.

The comparison points you need can include the following:
  • A thorough measure of the queries on your system, possibly down to the individual statements on every database. You can then have detailed analysis to see if your query tuning has paid off.

  • A measure of the performance of the application or reports that are calling your database. Their round-trip time frequently is all anyone cares about, so it’s a good measure for query performance.

  • An immediate look at the recent behavior of queries in the system using information stored in plan cache, a memory space in your SQL Server instance. While not a detailed measure of performance, it does give you a comparison point.

  • An immediate run of the query to see what performance looks like. Not recommended for systems under already-serious stress and certainly not for data modification queries, but if you can measure the query behavior, it gives you a comparison point.

  • Running the query on a nonproduction system. While the full range of behaviors of a production environment may not be possible in other environments, a slow running query is still a slow running query.

Once you have a single point of measurement, your goal is then to perform the changes that you think will improve performance and then measure again so that you can compare the two behaviors. This comparison is the most important part of validation of your work. Without validation of the work you’ve done, you can’t be sure you’re making a real positive impact on your production systems.

Since you only have to really have two points for comparison purposes, this makes things much easier. You don’t have to have a flawless data set to begin working on a query. The simplest initial measurement will enable you to understand the behavior of the system.

Also, when we talk about comparison points, I want to start emphasizing early on that you need to measure more than just the time a query takes to process. Certainly, the timing of the query is probably the most important aspect of query tuning. However, our systems can bottleneck on CPU use, or disk I/O or memory. So we need to measure these metrics as well. When we’re tuning, we are going to want to see if the CPU use was reduced as part of our tuning. Same goes for all the other metrics I just mentioned. In fact, in some cases, you will find if you’re experiencing bottlenecks in a resource, making a query use less of that resource may be more important than making it run faster. You may even find letting a query run slower, but reducing the amount of I/O or memory it uses can be a win on your systems. It all depends on where the problems are.

Throughout the book, I’ll be showing more than one measurement and pointing out how any given tuning technique affects not just query time but CPU and all the rest.

Most Likely Performance Issues

As I said at the start of the chapter, there is not a simple query tuning formula. I truly wish there was and that I could be the one to give it to you. However, there are a number of very common problems. Most of those problems also have a well-defined set of solutions.

The good news here is there really are a few areas that are more important for you to spend time on. Figure 1-3 shows the results of a survey conducted several years ago by Paul Randal. It’s still absolutely applicable today.

A table has 4 columns and 10 rows, where the entries on the first column are several root causes of poor S Q L server performance. The row entries under the third column are numerical values in percentage from lowest to highest.

Figure 1-3

Root causes of poor performance

The top two most common problems are the T-SQL code itself and your indexing. Further, you can see that the design of the data structure and out-of-date statistics are also in the top six problems. That means that fully 64% of all performance problems are directly related to the topics discussed in this book.

As I also said at the beginning of this chapter, you can spend your way out of performance problems, buying bigger, faster, and more hardware or using a higher service tier. You’ll still be dealing with the fact that 64% of your problems lie in your code, indexes, and statistics. In short, you can buy some of the wonderful hardware imaginable and still get bad performance if your code isn’t good enough.

Common Performance Issues

After you’ve spent the money to try to improve performance, you’re then going to turn to this list and focus your efforts on the most common performance problems. This is as close as we’ll get to a simple formula for your query tuning efforts. The most common problems are as follows:
  • Insufficient or poor indexes

  • Inaccurate or missing statistics

  • Bad T-SQL

  • Problematic execution plans

  • Excessive blocking

  • Deadlocks

  • Non-set-based operations

  • Incorrect database design

  • Poor execution plan reuse

  • Frequent recompilation of queries

These are not the only problems you’re likely to run into, they’re just the most common. We’ll address these in detail throughout the rest of the book, but let’s run through them now to discuss how these problems come about and how to solve them.

Insufficient or Poor Indexes

While the joke goes that all you ever have to do is put an index on a table to fix performance problems, the fact is a missing index truly is one of the biggest problems for performance in SQL Server. You’re also going to find that sometimes, while there is an index, it’s not the right index, or it’s an index with poor choices for keys. When there’s no index on a given table, SQL Server has no choice but to read through the entire table any time it’s attempting to filter information. This directly impacts disk performance and memory use. Further, you’re going to see a lot of resource contention because of these scans leading to blocking. Chapters 79 are focused on indexes and index strategies.

Since indexes are vital for even the possibility of well-performing code, you need to work across teams to ensure appropriate indexes on your tables. You want the code to have indexes to use to make it run faster, and you want to ensure that the indexes you have will be used by the code. Because of this, you can’t really divorce T-SQL coding from index creation. Instead, you need to ensure that the code and the indexes are created together.

While indexes are vital, you can actually create too many indexes. Every time you need to modify data through an INSERT, DELETE, or UPDATE operation, any index that has keys or INCLUDE columns from that data modification will also need to be updated. You can see massive performance hits on your data modification queries because they have to wait as indexes are updated. So while you do want to use and create indexes, you also must exercise restraint. This is where testing is going to come into play, and we’ll discuss that in Chapter 22.

Inaccurate or Missing Statistics

The query optimizer within SQL Server does a lot of calculations on how exactly to satisfy a given query based on the number of rows that it expects to see affected by the query. Those row estimates come directly from the statistics that exist on indexes and columns in your database. If those statistics are missing or badly out of date, the choices the optimizer makes can be very bad indeed. You can see scans of entire tables where a seek against an index could have helped performance. Conversely, you may see seeks when a scan would be more efficient. Without good row estimates, the optimizer just can’t make good choices.

Statistics are mostly created and maintained completely automatically by the system. This makes sense considering how important they are to the efficient operation of the system. However, depending on how the data in your system is changing, the statistics can be out of date. There are also ways to disable the automatic creation and update of statistics, leading to the statistics being missing or incorrect. There are also potentially other issues that can crop up around statistics. We’ll cover them in depth in Chapter 12.

Bad T-SQL

While indexes are going to help your queries run faster, and statistics will give the optimizer better row count estimates, you can completely negate all that with poor coding choices. Any number of problems can come from the code. You may be moving too much data, planning to filter it at the application. You may have written code in such a way that indexes can’t be used, leading to scans and excessive I/O. You can also overcomplicate your code in such a way that the optimizer has a hard time picking it apart and providing a good execution plan. Finally, you can incorrectly use the different object types within SQL Server, resulting in very poor performance indeed. We’ll discuss a bunch of these topics throughout the entire book and focus on mechanisms to ensure good coding practices in Chapter 23.

Problematic Execution Plans

There is no such thing as a bad, or incorrect, execution plan. Every execution plan generated by the optimizer will work. However, some execution plans are going to work better than others. While the optimizer is an amazing piece of software, it’s not perfect. Execution plans that are not optimal for a given query can cause really bad performance. Most of the time, execution plans are fixed through changes in code, statistics updates, or adding an index. Sometimes though, the execution plan is wrong for a given query due to a process known as parameter sniffing. Most of the time, parameter sniffing helps performance. However, parameter sniffing can go wrong. We’ll spend the entire Chapter 13 addressing bad parameter sniffing.

Note

It is possible for the optimizer to have an error on plan generation, but this is extremely rare and usually caused by a bug within SQL Server itself.

Excessive Blocking

SQL Server guarantees that the data stored within it will be consistent and accurate through the application of the ACID mechanism: Atomicity, Consistency, Isolation, and Durability. In a nutshell, changes being made to the database won’t be affected by other changes until they are complete. A query can only see data either before or after that modification is done. While there are ways to change this default behavior, they all still follow some basic rules.

This isolation of data modification allows multiple queries to access information in a shared fashion, without blocking. However, when more than one query attempts to access data that is being modified, or held for modification, you’re going to see one of them waiting on the others. This is blocking in action. Blocking is exacerbated a little because SQL Server stores information on 8KB pages on the disk. Two processes that aren’t even updating the same row can block each other.

Also, a lack of resources, not enough memory, CPU, or fast enough disks can lead to additional blocking. As processes wait for the resources to be freed up, they’re holding locks on rows or pages needed by other processes, all making blocking worse. We’ll discuss blocking in Chapter 17.

Deadlocks

Deadlocks are related to blocking; in fact, they’re caused by blocking. However, blocking and deadlocks are two, very different, topics and should be kept separate in your head. The core issue that causes a deadlock is when two processes each have an exclusive lock on a page that the other needs in order to complete a transaction. This is often referred to as a deadly embrace. One process must be given the go ahead to complete. The other process is chosen as a deadlock victim, and its pending changes are rolled back.

The rollback is the real performance headache. Not only did a query not complete, but now SQL Server has to do work to remove the partially complete changes from the system. That can lead to more resource contention, more slowdowns, and additional blocking.

At the root of the problem, deadlocks themselves are a performance-related problem. If all your queries complete fast enough, the chances of a deadlock are very slim. We’ll discuss deadlocks in Chapter 18.

Non-Set-Based Operations

SQL in general and T-SQL in particular are designed to work with data in sets. Unlike many programming languages, you shouldn’t be approaching your queries thinking about processing information a single row at a time. People use cursors and other types of loop operations to force a row-by-row style processing on SQL Server. It frankly destroys performance. I address this in detail in Chapter 19.

Incorrect Database Design

SQL Server is a relational data storage engine. Of the many things this means, one of them is that SQL Server is designed to work with tables that use foreign key constraints as a mechanism to enforce the relationship between the tables. The methodology for creating these types of databases is called normalization. Ensuring that your database is properly normalized enhances performance. You reduce the number of columns by eliminating duplicate values. You reduce the number of rows by storing information in multiple tables. All this enhances performance when done correctly.

You can also use different modeling techniques with relational databases like SQL Server. An example might be a star schema for a data warehouse. These also suffer from incorrect design that can directly hurt performance.

A fundamental part of database design is also indexing. Clustered indexes and clustered Columnstore indexes define data storage, which also defines data retrieval. The correct index and the right keys on that index (if it’s a rowstore index) are as much a part of the database design as the table definitions.

Finally, storing data correctly, meaning a datetime goes into a datetime column and a string goes into a char, varchar, or nvarchar column, positively affects performance. Putting the wrong data types into the database can actually hurt performance since you’ll have to work around the fact that since you don’t have dates in your date column, but strings, you’re giving up the possibility of date math functionality within SQL Server. This can also negatively impact index design and use.

We’re going to discuss aspects of database design throughout the book, but this is a huge topic. I strongly recommend getting a copy of Louis Davidson’s book: Pro SQL Server Relational Database Design and Implementation: Best Practices for Scalability and Performance (Apress, 2020).

Poor Execution Plan Reuse

The compilation of an execution plan by the query optimizer is an expensive operation. Because of this, SQL Server stores execution plans in a memory space referred to as the plan cache. The idea is simple. Many queries, maybe even most queries, are repetitious. Because of this, plans can be reused, tremendously reducing the overhead required to generate them. Parameterization of stored procedures and prepared statements makes plan reuse simpler. So does a process called simple parameterization where the optimizer adds parameters to simple queries. All of this is an attempt to reuse execution plans.

It is possible though that queries are structured such that they don’t get to reuse plans, even though they are the same as queries run earlier. Generating dynamic T-SQL can lead to this problem. Badly configured ORM tools can generate inappropriate parameters, preventing plan reuse. All of that adds up to additional overhead on the system and slower performance. Chapter 6 covers this information.

Frequent Recompilation of Queries

Just as a lack of plan reuse causes excessive overhead on the system, a lot of recompilation of execution plans leads to a very similar problem. Generally speaking, recompiling of execution plans is a desirable process. The recompile is generally caused by changes in data, which leads to statistics getting updated. With new information, the optimizer might make better choices for your queries.

However, as with many other good things, too much is a problem. You may have highly volatile data sets that lead to a lot of recompiles. You can also get recompiles from poor coding practices. We’ll discuss how to avoid recompiles in Chapter 14.

Summary

This chapter introduced us to the concepts that we’ll be discussing throughout the rest of the book. The iterative nature of query tuning assumes that you’ll be doing it quite a bit more than once over the lifetime of a given application and database. The core concepts though are always the same. First, measure performance on your system so you have something to compare against. Use those measurements to determine which queries could use help. Investigate and test possible solutions, validating any possible performance improvements against your metrics. Then, do the whole thing over again as the system changes and data changes.

The next chapter is going to cover how the query optimizer works. We’ll also introduce execution plans, our window into the work done by the query optimizer. Since all of query performance is based on what the optimizer is doing, it makes a great place to get started understanding the mechanisms you can apply to make your systems run faster.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.158.32