CHAPTER 5

image

Nonclustered Indexes

This chapter discusses nonclustered indexes, which is the second type of indexes supported by the In-Memory OLTP Engine. It shows how to define nonclustered indexes, talks about their SARGability rules, and explains their internal structure.

Working with Nonclustered Indexes

Nonclustered indexes are another type of indexes supported by the In-Memory OLTP Engine. In contrast to hash indexes, which are optimized to support point-lookup equality searches, nonclustered indexes help you search data based on a range of values. They have a somewhat similar structure to regular indexes on on-disk tables. They are not exactly the same, however, and I will discuss their internal implementation in depth later in this chapter.

TERMINOLOGY ISSUE

Nonclustered indexes were introduced in SQL Server 2014 CTP 2, and the documentation and whitepapers for that version used the term “range indexes” to reference them. However, in the production release of SQL Server 2014, Microsoft changed the terminology to “nonclustered indexes.”

That terminology can be confusing because hash indexes are also not clustered. In fact, the concepts of heaps and clustered indexes cannot be applied to In-Memory OLTP. Data rows are not stored in any particular order nor are they grouped together on the data pages in memory.

It is also worth mentioning that the minimal index_id value of In-Memory OLTP indexes is 2, which corresponds to nonclustered indexes in on-disk tables.

Creating Nonclustered Indexes

Nonclustered indexes are created inline as part of the CREATE TABLE statement. The syntax is similar to hash index creation; however, you should omit the keyword HASH and you do not need to specify the number of buckets in the index properties. Collation requirement still exists; you cannot index text data unless a column uses BIN2 binary collation.

The code in Listing 5-1 creates a memory-optimized table with two nonclustered indexes, one composite and another on the single column.

Using Nonclustered Indexes

Similar to B-Tree indexes in on-disk tables, the data in nonclustered indexes is sorted accordingly to the value of index key columns. As result, nonclustered indexes are beneficial in a large number of use cases. They can lead to an Index Seek operation in scenarios when query predicates allow SQL Server to locate and isolate a subset of the index keys for processing. With very few exceptions, the SARGability rules for nonclustered indexes match the rules for indexes defined on on-disk tables.

Listing 5-2 shows several queries against the dbo.Customers table. SQL Server is able to use Index Seek with all of them.

Similar to B-Tree indexes, Index Seek is impossible when query predicates do not allow isolating a subset of the index keys for processing. Listing 5-3 shows several examples of such queries.

As the opposite of B-Tree indexes on on-disk tables, nonclustered indexes are unidirectional, and SQL Server is unable to scan index keys in the order opposite of how they were sorted. You should keep this behavior in mind when you define an index and choose the sorting order for the columns.

Let’s illustrate that with an example; we’ll create an on-disk table with the same structure as dbo.Customers, and populate both tables with the same data. Listing 5-4 shows the code to do so.

Let’s run the queries that select several rows in ascending order, which match the index sorting order. The queries are shown in Listing 5-5.

Figure 5-1 shows the execution plans for the queries. SQL Server scans the indexes starting with the lowest key and stops after it read three rows. The execution plans are similar for both queries with the exception of required Key Lookup with on-disk data. SQL Server uses it to obtain the values of the FirstName and LastName columns from the clustered index of the table.

Key Lookup is not required with memory-optimized tables where the index pointers are part of the actual data rows and the indexes are covering the queries.

9781484211373_Fig05-01.jpg

Figure 5-1. Execution plans when the order by results match the index sorting order

The situation changes if you need to sort the output in the descending order, as shown in Listing 5-6.

As you can see in Figure 5-2, SQL Server is able to scan the on-disk table index in the order opposite of how it was defined. However, this is not the case for memory-optimized tables where indexes are unidirectional. SQL Server decides to scan the primary key and sort the data afterwards.

9781484211373_Fig05-02.jpg

Figure 5-2. Execution plans when the order by results are the opposite of the index sorting order

Finally, index statistics limitations, which were discussed in Chapter 4, still apply to the nonclustered indexes. SQL Server creates statistics at the time of index creation; however, they are never updated automatically. You should update statistics manually on a regular basis.

Nonclustered Indexes Internals

Nonclustered indexes use a lock- and latch-free variation of B-Tree, called Bw-Tree, which was designed by Microsoft Research in 2011. Let’s look at the Bw-Tree structure in detail.

Bw-Tree Overview

Similar to B-Trees, index pages in a Bw-Tree contain a set of ordered index key values. However, Bw-Tree pages do not have a fixed size and they are unchangeable after they are built. The maximum page size, however, is 8KB.

Rows from a leaf level of the nonclustered index contain the pointers to the data row chains with the same index key values. This works in a similar manner to hash indexes, when multiple rows and/or versions of a row are linked together. Each index in the table adds a pointer to the index pointer array in the row, regardless of its type: hash or nonclustered.

Root and intermediate levels in nonclustered indexes are called internal pages. Similar to B-Tree indexes, internal pages point to the next level in the index. However, instead of pointing to the actual data page, internal pages use a logical page id (PID), which is a position (offset) in a separate array-like structure called a mapping table. In turn, each element in the mapping table contains a pointer to the actual index page.

Figure 5-3 shows an example of a nonclustered index and a mapping table. Each index row from the internal page stores the highest key value on the next-level page and PID. This is different from a B-Tree index, where intermediate- and root-level index rows store the lowest key value of the next-level page instead. Another difference is that the pages in a Bw-Tree are not linked in a double-linked list. Each page knows the PID of the next page on the same level and does not know PID of the previous page. Even though it appears as a pointer (arrow) in Figure 5-3, that link is done through the mapping table, similar to links to pages on the next level.

9781484211373_Fig05-03.jpg

Figure 5-3. Nonclustered index structure

Even though a Bw-Tree looks very similar to a B-Tree, there is one conceptual difference: the leaf level of an on-disk B-Tree index consists of separate index rows for each data row in the index. If multiple data rows have the same key value, the index would have multiple leaf level rows with the same index key stored.

Alternatively, in-memory nonclustered indexes store one index row (pointer) to the row chain that includes all of the data rows that have the same key value. Only one index row (pointer) per key value is stored in the index. You can see this in Figure 5-3, where the leaf level of the index has single rows for the key values of Ann and Nancy, even though the row chain includes more than one data row for each value.

Image Tip  You can compare the structure of B-Tree and Bw-Tree indexes by looking at Figures 3-1 and 3-2 from Chapter 3, which show clustered and nonclustered B-Tree indexes on on-disk tables.

Index Pages and Delta Records

As mentioned, pages in nonclustered indexes are unchangeable once they are built. SQL Server builds a new version of the page when it needs to be updated and replaces the page pointer in the mapping table, which avoids changing internal pages that reference an old (obsolete) page.

Every time SQL Server needs to change a leaf-level index page it creates one or two delta records that represent the changes. INSERT and DELETE operations generate a single insert or delete delta record, while an UPDATE operation generates two delta records, deleting old and inserting new values. Delta records create a chain of memory pointers with the last pointer to the actual index page. SQL Server also replaces a pointer in the mapping table with the address of the first delta record in the chain.

Figure 5-4 shows an example of a leaf-level page and delta records if the following actions occurred in the sequence: R1 index row is updated, R2 row is deleted, and R3 row is inserted.

9781484211373_Fig05-04.jpg

Figure 5-4. Delta records and nonclustered index leaf page

Image Note  The internal implementation of the In-Memory OLTP Engine guarantees that multiple sessions cannot simultaneously update memory pointers in the various In-Memory OLTP objects, thereby overwriting each other’s changes. This process is covered in detail in Appendix A.

The internal and leaf pages of nonclustered indexes consist of two areas: a header and data. The header area includes information about the page such as the following:

  • PID: The position (offset) in the mapping table
  • Page type: The type of the page, such as leaf, internal, delta, or special
  • Right page PID: The position (offset) of the next page in the mapping table
  • Height: The number of levels from the current page to the leaf level of the index
  • The number of key values (index rows) stored on the page
  • Delta records statistics: Includes the number of delta records and space used by the delta key values
  • The max value of a key on the page

The data area of the page includes either two or three arrays depending on the index keys data types. The arrays are

  • Values: An array of 8-byte pointers. Internal pages in the index store the PID of next-level pages. Leaf-level pages store pointers to the first row in the chain of rows with the corresponding key value. It is worth noting that even though PID requires 4 bytes to store a value, SQL Server uses 8-byte elements to preserve the same page structure between internal and leaf pages.
  • Keys: An array of key values stored on the page.
  • Offsets: An array of two-byte offsets where individual key values in keys array start. Offsets are stored only if keys have variable-length data.

Delta records, in a nutshell, are one-record index data pages. The structure of delta data pages is similar to the structure of internal and leaf pages. However, instead of arrays of values and keys, delta data pages store operation code (insert or delete) and a single key value and pointer to the first data row in a row chain.

Figure 5-5 shows an example of a leaf-level index page with an insert delta record.

9781484211373_Fig05-05.jpg

Figure 5-5. A leaf-level index page with an insert delta record

SQL Server needs to traverse and analyze all delta records when accessing an index page. As you can guess, a long chain of delta records affects performance. When this is the case, SQL Server consolidates delta records and rebuilds an index page, creating a new one. The newly created page has the same PID and replaces the old page, which is marked for garbage collection. Replacement of the page is accomplished by changing a pointer in the mapping table. SQL Server does not need to change internal pages because they use the mapping table to reference leaf-level pages.

The process of rebuilding is triggered at the moment a new delta record is created for pages that already have 16 delta records in a chain. The action described by the delta record, which triggers the rebuild, is incorporated into the newly created page.

Two other processes can create new or delete existing index pages, in addition to delta records consolidation. The first process, page splitting, occurs when a page does not have enough free space to accommodate a new data row. Another process, page merging, occurs when a delete operation leaves an index page less than 10% from the maximum page size, which is 8KB now, or when an index page contains just a single row.

Image Note  The page splitting and page merging processes are covered in depth in Appendix B.

Obtaining Information About Nonclustered Indexes

In addition to the sys.dm_db_xtp_hash_index_stats view, which was discussed in Chapter 4, SQL Server provides two other views to obtain information about indexes on memory-optimized tables. Those views provide the data collected since the time when memory-optimized tables were loaded into memory, which occurs at database startup.

You can obtain information about index access methods and ghost rows in both hash and nonclustered indexes with the sys.dm_db_xtp_index_stats view. The notable columns in the view are the following:

  • scans_started shows the number of times that row chains in the index were scanned. Due to the nature of the index, every operation, such as SELECT, INSERT, UPDATE, and DELETE, requires SQL Server to scan a row chain and increment this column.
  • rows_returned represents the cumulative number of rows returned to the next operator in the execution plan. It does not necessarily match the number of rows returned to a client because further operators in the execution plan can change it.
  • rows_touched represents the cumulative number of rows accessed in the index.
  • rows_expired shows the number of detected stale rows. I will discuss this in greater detail when I talk about the garbage collection process in Chapter 9.
  • rows_expired_removed returns the number of stale rows that have been unlinked from the index row chains. I will also discuss this in more detail when I talk about garbage collection.

Listing 5-7 shows the query that returns the information about indexes defined on the dbo.Customers table.

Figure 5-6 illustrates the output of the query. Large number of Rows Per Scan can indicates heavy index scans, which can be the sign of a suboptimal indexing strategy and/or poorly written queries.

9781484211373_Fig05-06.jpg

Figure 5-6. Output from the sys.dm_db_xtp_index_stats view

Image Note  You can read more about the sys.dm_db_xtp_index_stats view at http://msdn.microsoft.com/en-us/library/dn133081.aspx.

The sys.dm_db_xtp_nonclustered_index_stats view returns information about nonclustered indexes. It includes information about the total number of pages in the index plus page splits, merges, and consolidation-related statistics.

Listing 5-8 shows information about nonclustered indexes defined on the dbo.Customers table. Figure 5-7 shows the output of the query.

9781484211373_Fig05-07.jpg

Figure 5-7. Output from the sys.dm_db_xtp_nonclustered_index_stats view

Image Note  You can read more about the sys.dm_db_xtp_nonclustered_index_stats view at https://msdn.microsoft.com/en-us/library/dn645468.aspx.

Hash Indexes vs. Nonclustered Indexes

As you already know, hash indexes are useful only for point-lookup searches in cases when queries use equality predicates on all index columns. Nonclustered indexes, on the other hand, can be used in a much wider scope, which often makes the choice obvious. You should use nonclustered indexes when your queries benefit from scenarios other than point-lookups.

The situation is less obvious in the case of point-lookups. With the hash indexes, SQL Server can locate the hash bucket, which is the entry point to the data row chain, in a single step by calling the hash function and calculating the hash value. With nonclustered indexes, SQL Server has to traverse Bw-Tree to find a leaf page, and the number of steps depends on the height of the index and number of delta records there.

Even though nonclustered indexes require more steps to find an entry point to the data row chain, the chain can be smaller compared to hash indexes. Row chains in nonclustered indexes are built based on unique index key values. In hash indexes, row chains are built based on a non-unique hash key and can be larger due to hash collisions, especially when the bucket_count is insufficient.

Let’s compare hash and nonclustered index performance in a point-lookup scenario. Listing 5-9 creates four tables of the same structure. Three of them have hash indexes defined on the Value column using a different bucket_count. The fourth table has a nonclustered index defined on the same column instead. Finally, the code populates all tables with the same data.

Different numbers of buckets led to the different index row chain sizes in the indexes. In this case, the Hash_131072, Hash_16384, and Hash_1024 tables have on average 1, 4, and 73 rows per chain, respectively.

Image Tip  You can analyze hash index properties using the sys.dm_db_xtp_hash_index_stats view and the code from Listing 4-2 from Chapter 4.

As the next step, let’s compare point-lookup performance using the code from Listing 5-10. This code triggers 65,536 point lookup selects against each table.

Table 5-1 shows the execution time of the queries in my environment. As you can see, the nonclustered index point-lookup select is slightly slower compared to hash indexes with relatively short row chains; however, it is faster in the case of the long row chains and incorrect bucket count estimations.

Table 5-1. Execution Time of Queries

Table5-1

Memory requirements are another factor to consider. With the hash indexes, memory usage depends on the number of buckets. The amount of memory required for the nonclustered indexes depends on the size of the index key and index cardinality (uniqueness of index key values). For example, if a table has a varchar column with 1,000,000 unique values of 100 bytes each, the nonclustered index on that column would require about 800MB to support a Bw-Tree structure. Alternatively, a hash index with 2,097,152 buckets will use just 16MB of memory.

With all that being said, hash indexes are a good choice only in cases where the workload and data are relatively static and, therefore, you can correctly estimate bucket_count and you do not expect anything other than point-lookup queries in the future. In all other cases, nonclustered indexes are the safer choice.

Summary

Nonclustered indexes are the second type of indexes supported by the In-Memory OLTP Engine. They have similar SARGability rules with the B-Tree indexes defined on on-disk tables with the exception of the scans in the order opposite of the index key sorting order.

Internally, nonclustered indexes use a lock- and latch-free variation of B-Tree, called Bw-Tree, which consists of internal and leaf data pages referencing each other through the mapping table. Leaf data pages store one row per each individual key value with the pointer to the chain of the data rows with the same key.

SQL Server never updates index pages. Any changes are referenced through the delta records that correspond to individual INSERT and DELETE operations on the page. SQL Server consolidates the large chains of delta records and performs splitting and merging of the data pages when needed. All of those processes create the new data pages, marking the old ones for garbage collection.

Nonclustered indexes are a good choice in scenarios when point-lookup is not an option and/or when it is hard to estimate the number of buckets in the hash index.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.59.219