Processing of rows in SQL Server is traditionally managed one row at a time. For transactional workloads, this is a sensible convention, as row counts for read and write operations are typically small.
Analytic queries that routinely operate on thousands or millions of rows do not benefit from reading rows in this fashion. Batch mode execution is a SQL Server execution mode that allows groups of rows to be read and passed between execution plan operators, ultimately improving performance.
As a query is executed and each operator is processed, rows are passed between those operators in quantities determined by the execution mode. In row mode, each row is passed from operator to operator sequentially. In batch mode, groups of rows are passed between operators. The result of this convention is that in batch mode, control is passed between operators less often as rows can be handed off in fewer batches.
Row Mode Execution
Queries with a low estimated cardinality will typically use row mode execution. In row mode execution, rows are read one by one as they are passed through each operator in an execution plan. This sounds inefficient, but it is important to note that processing rows in batches incurs overhead to do so. Therefore, SQL Server will make decisions at runtime to determine if a query would be more efficiently processed in row mode or batch mode.
Transactional Query Returning a Single Row
Note the execution modes provided in the query operator details. Row mode is indicated for both the estimated and actual execution modes. Given that the underlying table is a rowstore table and that the query only returns a single row, execution via row mode is expected for this operation.
While columnstore indexes are optimized for analytic queries that operate against large row counts and will usually take advantage of batch mode processing, row mode can be chosen as the execution mode if the optimizer believes that is the most efficient option.
Query That Uses Row Mode Execution Against a Columnstore Index
While the execution plan operator indicates it is using columnstore storage, the optimizer chooses row mode as the execution mode. This should not be seen as unusual or suboptimal. Because the query only returns a single row, the use of row mode is optimized for that expected outcome, even if the table is stored in a clustered columnstore index.
Batch Mode Execution
Queries that have a high cardinality can benefit from batch mode execution. In this execution mode, rows are divided into batches and processed in those batches for each operator in an execution plan where batch mode is chosen. In columnstore indexes, this will be the expected execution mode. For a large analytic query to not use batch mode would be considered an outlier. If that outlier was identified as a poorly performing query, then further investigation as to why batch mode was not used would be worthwhile.
Query That Uses Batch Mode Execution Against a Columnstore Index
Whether batch mode is chosen by the query optimizer for a given plan operator depends on the number of rows processed by that operator and not on the number of rows returned by a query.
Execution modes are not all-or-nothing choices for a query. The query optimizer can choose batch mode for some operators and row mode for others and will do so based on whatever mode it determines is most efficient for each.
Query That Joins a Large Analytic Table with a Small Lookup Table
Query to Alter the Database Compatibility Level to SQL Server 2019
With SQL Server 2019 features available, batch mode is chosen as the execution mode for the rowstore clustered index scan. For a query scanning many rows, this makes perfect sense.
Starting in SQL Server 2019, batch mode can be used by SQL Server against rowstore tables, but will be limited to scenarios when row counts are higher and using batch mode is expected to improve performance. Prior to SQL Server 2019, batch mode was only available for columnstore indexes. This is a significant improvement as it can greatly improve analytic query performance against rowstore tables.
This leads to an immediate question: If batch mode on rowstore tables can improve analytic performance, are columnstore indexes still useful? If batch mode were the only feature that made columnstore indexes highly performant, then that would be a valid question. Batch mode on rowstore tables does not provide the other benefits of columnstore indexes, such as columnstore compression, segment elimination, or actionable metadata. For tables that service primarily analytic workloads, clustered columnstore indexes will be the correct solution. For tables that service mixed OLAP and OLTP workloads, batch mode on rowstore will improve performance and potentially remove the need to create nonclustered columnstore indexes, assuming OLAP queries are isolated and do not scan too many rows at once. Ultimately, testing is required to determine the optimal indexing solution for a given application, but batch mode on rowstore is a helpful tool that will aid in this decision-making process. For more information on mixing rowstore and columnstore indexes, see Chapters 12 and 13.
How Does Batch Mode Work?
The goal of batch mode processing is to improve the throughput of data from its storage location to a SQL Server’s CPUs. To accomplish this, SQL Server fundamentally changes how data flows through execution plans from start to finish.
In the execution plan, a clustered index scan is retrieving 276 rows that are passed into an aggregation that results in a single row that is in turn returned by the query. In row mode, each of the 276 rows is individually transferred from the clustered index scan operator to the stream aggregate operator.
A row batch looks somewhat similar to the structure of a columnstore index and provides similar benefits in memory as a query is executed. The qualifying rows vector performs a similar function to the columnstore index’s delete bitmap in that it flags rows that are no longer logically needed by the batch. For example, if a filter is applied to a data set, it can be processed solely by updating the qualifying rows vector for the rows that are to be filtered out.
Batch mode processing reduces CPU consumption by decreasing the number of times that data is transferred between execution plan operators. For example, 5000 rows processed via row mode processing will require at least 5000 CPU operations to move it between plan operators. The same data set processed via batch mode might assign 500 rows each to ten batches. In this scenario, data can be passed between execution plan operators using 5000/500 = 10 operations. These are approximations, but are illustrative for how batch mode and row mode processing affect CPU consumption.
Another boon to batch mode processing is the ease in which it can take advantage of parallelism. Batches can be processed in parallel, allowing a multicore CPU architecture with available CPU capacity to process query plan operators more efficiently. Parallelism is a process that requires some CPU overhead to utilize. Therefore, the amount of rows involved in parallelism needs to be significant enough that the benefits of breaking a workload into smaller chunks outweigh the cost to do so. Row mode does not as easily benefit from parallelism in this regard as the effort to split single row operations into separate parallel operations and then combine them back together is far greater than the trivial savings afforded by that process.
While batch mode may not always be chosen for rowstore tables, it will be the default choice for columnstore indexes. Since a rowgroup will contain up to 220 rows, the volume of data that needs to be processed by a columnstore index scan will be large enough to ensure that SQL Server chooses batch mode as the likely candidate to provide the best performance for queries against that data.
Batch Mode vs. Row Mode Performance
While columnstore indexes will most often default to using batch mode processing, rowstore tables will have a choice as to which mode to utilize. That decision will be based on the cardinality of the query, the amount of data that needs to be read, and the version of SQL Server involved.
There is value in directly comparing the performance of a query when it uses row mode vs. using batch mode and measuring the performance gains provided by batch mode.
Analytic Query Executed in Two Compatibility Levels
Batch mode was used successfully to process 228265 rows in the second plan.
CPU consumption using batch mode is nearly ten times less than using row mode.
Actual Number of Locally Aggregated Rows documents aggregate pushdown.
Making use of batch mode allows SQL Server to also take advantage of aggregate pushdown. The combination of these features was what allowed for CPU to be so greatly reduced.
Adaptive joins (SQL Server 2017)
Memory grant feedback (SQL Server 2017)
Aggregate pushdown (SQL Server 2016)
Generally speaking, when SQL Server chooses to use batch mode, performance will be improved with its usage. Similarly, when additional intelligent query processing features are used (such as adaptive joins or aggregate pushdown), they will also have a positive impact on performance.
Query That Disables Optimization Features (for Testing Purposes Only)
Note that these features should only be disabled for testing purposes only and should not be turned off in a production environment unless there are exceptional and well-documented reasons to do so.
Compatibility levels may also be adjusted for testing and demonstration purposes. This can help model the impact of an upgrade or to allow a SQL Server upgrade to functionally occur at a slower pace over time. By incrementally adjusting the compatibility level up by one level at a time from the original SQL Server version to the new version, risk can be isolated and mitigated in steps, rather than all at once. This process also provides a rollback mechanism, if needed, since compatibility modes can be lowered, in the event of an emergency.