Development of columnar storage in SQL Server

SQL Server introduced columnar storage in version 2012. Only NCCIs were supported. This means that you still need to have the original row storage, either organized as a heap or as a clustered index (CI). There are many other limitations, including the following:

  • Nonclustered columnstore index only
  • One per table
  • Must be partition-aligned
  • Table becomes read-only (partition switching allowed)
  • Unsupported types
    • Decimals greater than 18 digits
    • Binary, Image, CLR (including Spatial, HierarchyId)
    • (N)varchar(max), XML, Text, Ntext
    • Uniqueidentifier, Rowversion, SQL_Variant
    • Date/time types greater than 8 bytes

SQL Server 2014 introduced clustered columnstore indexes (CCI). This means that the original row storage does not exist anymore; CCI is the only storage you need. Just like in a regular clustered index, SQL Server needs to identify each row in a clustered columnstore index as well. Note that SQL Server 2014 does not support constraints on the columnar storage. Therefore, SQL Server 2014 adds a bookmark, which is a unique tuple ID inside a rowgroup, stored as a simple sequence number. SQL Server 2014 has many data types unsupported for columnar storage, including the following:

  • Varbinary(MAX), Image, CLR (including Spatial, HierarchyId)
  • (N)Varchar(max), XML, Text, Ntext
  • Rowversion, SQL_Variant

SQL Server 2014 also optimizes the columnstore index build. For example, SQL Server 2012 used a fixed number of threads to build the index. This number was estimated in advance. If, for some reason, the operating system took some threads away from SQL Server while SQL Server was building a columnstore index, the build might have failed. In SQL Server 2014, the degree of parallelism or the number of threads can be adjusted dynamically while SQL Server builds the columnstore index.

The CCI in SQL 2014 is updateable. However, its columnar storage is immutable. The following figure explains the update process:

How SQL Server updates columnar storage

Data modification is implemented as follows:

  • Insert: The new rows are inserted into a delta store
  • Delete: If the row to be deleted is in a column store row group, a record containing its row ID is inserted into the B-tree storing the delete bitmap; if it is in a delta store, the row is simply deleted
  • Update: Split into a delete and an insert
  • Merge: Split into a delete, an insert, and an update

A delta store can be either open or closed. When a new delta store is created, it is open. After you insert the maximum number of rows for a rowgroup in an open delta store, SQL Server changes the status of this delta store to closed. If you remember, this means a bit more than one million rows. Then a background process called tuple-mover converts the closed delta stores to column segments. This process starts by default every five minutes. You can run it manually with the ALTER INDEX ... REORGANIZE or ALTER INDEX ... REBUILD commands.

Non-bulk (trickle) inserts go to an open delta store. Bulk inserts up to 102,400 rows; smaller ones go to an open delta store, and larger ones go directly to column segments. More delta stores mean less compression. Therefore, when using bulk insert, you should try to optimize the batches to contain close to 1,000,000 rows. You can also rebuild the index occasionally.

SQL Server 2016 brings many additional features to the columnstore indexes. The most important features in version 2016 include the following:

  • CCI supports additional NCI (B-tree) indexes
  • CCI supports through NCI primary and foreign key constraints
  • CCI supports snapshot and read-committed snapshot isolation levels
  • NCCI on a heap or B-tree updateable and filtered
  • Columnstore indices on in-memory tables
  • Defined when you create the table in the CREATE TABLE statement
  • Must include all columns and all rows (not filtered)
  • NCI indexes can be filtered

SQL Server 2017 adds only small improvements to columnar storage, compared to version 2016:

  • You can build and rebuild NCCIs online
  • CCIs now support LOB columns: VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX)
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.22.145