Recreating rows from columnar storage

Of course, there is still the question of how to recreate rows from columnar storage. In 1999, Stephen Tarin patented the Tarin Transform Method, which uses a row reconstruction table to regenerate the rows. Mr. Tarin called columnar storage the Trans-Relational Model. This does not mean the model is beyond relational; this was more a marketing term, short for Transform Relational Model.

SQL Server documentation does not publish the row recreation algorithm it uses. I am presenting the Tarin's method here. It should still be good enough to give you a better understanding of the amount of work SQL Server has to do when it recreates rows from columns.

The following figure explains the Tarin Transform Method:

Tarin Transform Method

In the figure, the top table is the row reconstruction table. You start reconstructing the rows in the top left corner. For the first column of the first row value, take the top left value in the columnar storage table at the bottom, in this case the value Bolt. In the first cell of the row reconstruction table, there is a pointer to the row number of the second column in this table. In addition, it is an index for the value of the second column in the columnar storage table. In the figure, this value is 3. This means that you need to find the value in the second column of the columnar storage table with index 3, which is Green. In the row reconstruction table, you read the value 5 in the second column, third row. You use this value as an index for the value of the third column in the columnar storage table, which, in the example, is Paris. In the row reconstruction table, you read the value 1. Because this is the last column in the table, this value is used for a cyclic redundancy check, checking whether you can correctly get to the starting point of the row reconstruction process. The row Bolt, Green, and Paris, the second row from the original table from the previous figure, was successfully reconstructed.

As mentioned, how SQL Server reconstructs the rows has not been published. Nevertheless, you can appreciate that this is quite an intensive process. You can also imagine changes in the original data. Just a small update to the original values might cause the complete recreation of the row reconstruction table. This would simply be too expensive. This is why columnar storage, once it is created, is actually read-only. In SQL Server 2014 and 2016, columnstore indexes are updateable; however, SQL Server does not update the columnar storage online. SQL Server uses additional row storage for the updates. You will learn further details later in this chapter.

Table of Contents for Recreating rows from columnar storage

Create new playlist

Sign In

Sign Up

Table of Contents for
Recreating rows from columnar storage