Batch processing

With columnar storage, the CPU can become a bottleneck. SQL Server solves these problems with batch mode processing. In batch mode processing, SQL Server processes data in batches rather than processing one row at a time. A batch represents roughly 900 rows of data. Each column within a batch is stored as a vector in a separate memory area, meaning that batch mode processing is vector-based. Batch mode processing interrupts a processor with metadata only once per batch rather than once per row, as in row mode processing, which lowers the CPU burden substantially. This means that batch mode spreads the metadata access costs over all of the 900 rows in a batch.

Batch mode processing is orthogonal to columnar storage. This means SQL Server can use it with many different operators, no matter whether the data is stored in row or column storage. However, batch mode processing gives the best results when combined with columnar storage. DML operations, such as insert, update, or delete, work in row mode. Of course, SQL Server can mix batch and row mode operators in a single query.

SQL Server introduced batch mode also in version 2012. The batch mode operators in this version include the following:

  • Filter
  • Project
  • Scan
  • Local hash (partial) aggregation
  • Hash inner join
  • Batch hash table build, but only in-memory, no spilling
  • Bitmap filters limited to a single column, data types represented with a 64-bit integer

In SQL Server 2014, the following batch mode operators were added:

  • All join types
  • Union all
  • Scalar aggregation
  • Spilling support
  • Complex bitmap filters, all data types supported

SQL Server 2016 added the following improvements to batch mode processing:

  • Single-threaded queries
  • Sort operator
  • Multiple distinct count operations
  • Left anti-semi join operators
  • Window aggregate functions
  • Window analytical functions
  • String predicate and aggregate pushdown to the storage engine
  • Row-level locking on index seeks against a nonclustered index and rowgroup-level locking on full table scans against the columnstore

Finally, SQL Server 2017 added two more improvements to batch mode processing:

  • Batch mode adaptive joins
  • Batch mode memory grant feedback

The following table summarizes the most important features and limitations of columnar storage and batch mode processing in SQL Server versions 2012 to 2106:

Columnstore Index/Batch Mode Feature

SQL 2012

SQL 2014

SQL 2016

SQL 2017

Batch execution for multi-threaded queries

yes

yes

yes

yes

Batch execution for single-threaded queries

yes

yes

Batch mode adaptive joins

yes

Batch mode memory grant feedback

yes

Archival compression

yes

yes

yes

Snapshot isolation and read-committed snapshot isolation

yes

yes

Specify CI when creating a table

yes

yes

AlwaysOn supports CIs

yes

yes

yes

yes

AlwaysOn readable secondary supports read-only NCCI

yes

yes

yes

yes

AlwaysOn readable secondary supports updateable CIs

yes

yes

Read-only NCCI on heap or B-tree

yes

yes

yes

yes

Updateable NCCI on heap or B-tree

yes

yes

NCCI online build and rebuild

yes

B-tree indexes allowed on a table that has a NCCI

yes

yes

yes

yes

Updateable CCI

yes

yes

yes

CCI LOB columns support

yes

B-tree index on a CCI

yes

yes

CCI on a memory-optimized table

yes

yes

Filtered NCCI

yes

yes

 

You can check whether SQL Server uses row or batch mode for an operator by analyzing the properties of the operator in the execution plan. Before checking the batch mode, the following code adds nine time as many rows to the test fact table:

DECLARE @i AS INT = 1; 
WHILE @i < 10 
BEGIN 
SET @i += 1; 
INSERT INTO dbo.FactTest 
(SaleKey, CustomerKey,  
 Customer, CityKey, City, 
 DateKey, StockItemKey, 
 Product, Quantity, 
 TotalAmount, Profit) 
SELECT @i * 1000000 + f.[Sale Key] AS SaleKey, 
  cu.[Customer Key] AS CustomerKey, cu.Customer, 
  ci.[City Key] AS CityKey, ci.City,  
  f.[Delivery Date Key] AS DateKey, 
  s.[Stock Item Key] AS StockItemKey, s.[Stock Item] AS Product,  
  f.Quantity, f.[Total Excluding Tax] AS TotalAmount, f.Profit 
FROM Fact.Sale AS f 
  INNER JOIN Dimension.Customer AS cu 
    ON f.[Customer Key] = cu.[Customer Key] 
  INNER JOIN Dimension.City AS ci 
    ON f.[City Key] = ci.[City Key] 
  INNER JOIN Dimension.[Stock Item] AS s 
    ON f.[Stock Item Key] = s.[Stock Item Key] 
  INNER JOIN Dimension.Date AS d 
    ON f.[Delivery Date Key] = d.Date; 
END; 

Let's check how much space this table uses now:

EXEC sys.sp_spaceused N'dbo.FactTest', @updateusage = N'TRUE'; 
GO 

The result is as follows:

Name          rows     reserved   data       index_size  unused 
------------  -------  ---------  ---------  ----------  -------
dbo.FactTest  2279810  502152 KB  498528 KB  2072 KB     1552 KB

Now let's start querying this table. Before executing the following query, make sure you turn the actual execution plan on:

SELECT f.StockItemKey, 
 SUM(f.TotalAmount) AS Sales 
FROM dbo.FactTest AS f 
WHERE f.StockItemKey < 30 
GROUP BY f.StockItemKey 
ORDER BY f.StockItemKey; 

You can hover the mouse over any of the operators. For example, the following screenshot shows the details of the Hash Match (Partial Aggregate) operator:

Row mode operators

You can see that SQL Server used row mode processing. As mentioned, batch mode is not strictly bound to columnar storage; however, it is much more likely that SQL Server would use it as you use the columnar storage. The following code creates a filtered nonclustered columnstore index. It is actually empty:

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactTest 
ON dbo.FactTest 
(SaleKey, CustomerKey,  
 Customer, CityKey, City, 
 DateKey, StockItemKey, 
 Product, Quantity, 
 TotalAmount, Profit) 
WHERE SaleKey = 0; 
GO 

Now, execute the same query again. As you can see from the following screenshot, this time  batch mode is used for the  Hash Match (Partial Aggregate) operator:

Batch mode operators
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.182.66