CHAPTER 9

image

Garbage Collection

This chapter covers the garbage collection process used in the In-Memory OLTP Engine. It provides an overview of the various components involved in garbage collection and demonstrates how they interact with each other.

Garbage Collection Process Overview

In-memory OLTP is a row-versioning system. UPDATE operations generate new versions of rows rather than updating row data. DELETE operations do not remove the rows but rather update the EndTs row timestamp. Rows created by aborted transactions are not deallocated immediately and they stay as part of the index row chains even after rollback.

As you know, every row has two timestamps (BeginTs and EndTs) that indicate row lifetime: when the row was created and when it was deleted. Transactions can only see the versions of rows that were valid at the time when the transaction started. In practice, this means that a row is visible for the transaction only if the Global Transaction Timestamp value at the start of transaction is between the BeginTs and EndTs timestamps of the row.

At some point, when the EndTs timestamp of a row is older than the Global Transaction Timestamp of the Oldest Active Transaction in the system, the row expires. Expired rows are invisible for active transactions and eventually they need to be deallocated to reclaim system memory and speed up index chain navigation. This process is called garbage collection.

The garbage collection process in In-Memory OLTP has been designed with the following goals:

  • Non-blocking: The garbage collection process should not block user threads and should produce minimal performance impact on the system.
  • Responsive: The garbage collection process should react to memory pressure.
  • Cooperative and Scalable: The garbage collection process should not rely on a single system thread to perform memory deallocation and should use regular worker threads during the process.

The cooperative nature of garbage collection makes it quite different from the typical SQL Server background processes. Even though there is a dedicated system garbage collection thread called the idle worker thread, the major part of the work is done by the regular user worker threads. This allows the process to scale and keep up with the workload in the system.

User threads participate in the garbage collection process in two different ways. They unlink old, expired rows from the row chains and perform actual deallocation. These actions are separate from each other, as you will see shortly.

Let’s look at the process in detail. Figure 9-1 illustrates the logical structure of a table with two hash indexes on the Name and City columns. You saw this figure in previous chapters; however, in this chapter I added another element called idxLinkCount, which indicates in how many index chains the rows are participating. It is displayed with the underline font in the figure; note that all rows have a value of two, which corresponds to the number of indexes in the table.

9781484211373_Fig09-01.jpg

Figure 9-1. Initial state of the data

Assume that you have a session that runs two queries, as shown in Listing 9-1, at time when the Oldest Active Transaction Timestamp is 110 and the Global Transaction Timestamp is 125.

The first SELECT scanned the Name index row chain for the bucket with value A and detected the Ann row with an EndTs of 100. The Oldest Active Transaction Timestamp is 110, so this row is expired and invisible for the active transactions in the system. As result, the user thread unlinked the row from the Name index row chain and decreased the idxLinkCnt value.

The second SELECT detects the deleted Carl row. However, the EndTs of this row is greater than the Oldest Active Transaction Timestamp, so this row can still be visible for some of the active transactions. Therefore, this row cannot be unlinked from the index chain. Figure 9-2 illustrates the state of the data after the execution of the queries.

9781484211373_Fig09-02.jpg

Figure 9-2. State of the data after the first two queries

Now, let’s assume that some of the active transactions were completed and you ran the second batch of the queries from Listing 9-2 at the time when the Oldest Active Transaction Timestamp was 120 and the Global Transaction Timestamp was 130.

The first SELECT found the expired Ann row in the City index chain and removed it from there. At this point, the row is not participating in any row chains and, therefore, can be deallocated. However, the row is not deallocated immediately; this is done at a later stage.

The Carl row now is also expired and invisible for the active transactions. The second SELECT removed it from the City index chain; however, it is still present in the Name index chain and cannot be deallocated. Figure 9-3 shows the state of the data at this moment.

9781484211373_Fig09-03.jpg

Figure 9-3. State of the data after the second two queries

Image Important  You should remember that the Oldest Active Transaction Timestamp controls when expired rows can be removed from the index chains and deallocated. Long-running and abandon transactions can defer garbage collection and lead to a situation when the system runs out of memory due to an excessive number of expired rows.

When the transaction is complete, In-Memory OLTP places the information about it in the queue used by the idle worker thread, which is responsible for garbage collection management. The idle worker thread wakes up every minute or, in case of a heavy load, when the number of completed transactions exceeds the predefined threshold. It analyzes the list of completed transactions and the Oldest Active Transaction Timestamp in the system, and separates completed transactions to 16 different queues called generations, sorting them based on their Global Transaction Timestamp values.

  • Generation 0 contains the list of transactions that were completed earlier than the current Oldest Active Transaction Timestamp. Rows generated by those transactions are immediately available for the garbage collection.
  • Generations 1-14 stores the list of transactions that were completed after the current Oldest Active Transaction Timestamp. Each generation can hold information about up to 16 transactions. As you can guess, a system can hold up to 224 transactions in generations 1-14 queues.
  • Generation 15 stores the information about the remaining transactions completed after the current Oldest Active Transaction Timestamp. There is no limit on the number of transactions that can be stored there.

Every transaction in the queue exposes its write set to the idle worker thread, which builds the set of the 16-row work items for deallocation. Those work items are distributed across another set of worker queues-one queue per scheduler-and then they are picked up and processed by the user threads. The user threads pick up the items and perform deallocation after they complete their work on the other user transactions.

Figure 9-4 illustrates an example of the garbage collection workflow in a system that has an Oldest Active Transaction Timestamp of 10,000.

9781484211373_Fig09-04.jpg

Figure 9-4. Garbage Collection Workflow

The user thread usually picks up the work items from the queue that belong to the same scheduler on which it is running. However, if the queue is empty, the thread checks the queues from the other CPUs that belong to the same NUMA node. Finally, in case of a heavy load in the system, the thread can pick up a work item from any queue, regardless of the NUMA node to which it belongs.

With the hot data and actively used indexes, user threads detect expired rows relatively quickly. However, with rarely used indexes and/or rarely accessed data, there is the possibility that expired rows may not be detected in a timely manner.

This is addressed by the idle worker thread, which periodically scans the indexes and detects expired rows there. The idle worker thread can either deallocate those rows immediately or add them to the work items after those rows have been unlinked from all index chains. This process is called a dusty corners scan.

As you can see, the garbage collection process in In-Memory OLTP is done asynchronously. Deleted rows and rows from aborted transactions continue to use system memory until they are deallocated. You need to remember this and reserve enough memory in the system to accommodate those rows.

Garbage Collection-Related Data Management Views

SQL Server exposes several data management views that can be used to monitor and analyze the garbage collection process.

  • sys.dm_xtp_gc_stats provides statistics about the garbage collection process. It includes information about the number of rows examined by the garbage collection subsystem, the number of rows processed by user and idle worker threads, and quite a few other attributes. You can read more about this view at https://msdn.microsoft.com/en-us/library/dn133196.aspx.
  • sys.dm_xtp_gc_queue_stats provides information about garbage collector worker queues. It provides information about total number of work items that were enqueued and dequeued, current queue length, last time the queue was accessed, and maximum depth the queue has seen. You can monitor the current queue length, making sure that the garbage collector is keeping up. More information is available at https://msdn.microsoft.com/en-us/library/dn268336.aspx.
  • sys.dm_db_xtp_gc_cycle_stats provides information about the last (up to 1,024) garbage collection execution cycles including the time and duration of the cycle, and distribution of transactions between generations. You can use this view to find spikes in the garbage collection activity and during long-running transactions troubleshooting. You can read more about this view at https://msdn.microsoft.com/en-us/library/dn268337.aspx.
  • Finally, sys.dm_db_xtp_index_stats includes several garbage collection-related metrics. The rows_expired column indicates how many rows have expired. Rows_expired_removed indicates the number of rows unlinked from the index chain. Phantom row columns provide information about rows inserted by aborted transactions. You can read more about this view at https://msdn.microsoft.com/en-us/library/dn133081.aspx.

Exploring the Garbage Collection Process

Let’s examine the garbage collection process and its asynchronous nature. As the first step, create a memory-optimized table and populate it with 65,536 rows, as shown in Listing 9-3.

Let’s look at amount of memory used in the table, index statistics, and garbage collection worker queues statistics using the code from Listing 9-4.

Figure 9-5 illustrates the output of the queries. As you can see, the table has about 586MB allocated and 512MB of used space. None of the rows have been deleted or touched (scanned). I also restarted my test server right before the test, so the garbage collection worker queues are empty.

9781484211373_Fig09-05.jpg

Figure 9-5. Memory and garbage collection statistics after table creation

Let’s run a few queries, analyzing the statistics after each run. In this book, I discuss results after each step; however, when you run this in your test environment, it is better to run all queries at once, persisting results in the temporary tables as is done in the script included with the companion materials of this book. This will help you to avoid the situation when idle worker threads start unexpectedly in the middle of execution.

As the first step, run the script that deletes 1,500 rows in the individual transactions (see Listing 9-5).

Now run the code from Listing 9-4 again and look at the output. As you can see in Figure 9-6, index statistics indicate that the deletion statement touched 1,500 rows; however, none of them were marked as expired even though deletion statements ran in the individual autocommitted transactions.

9781484211373_Fig09-06.jpg

Figure 9-6. Memory and garbage collection statistics after deletion

As the next step, run a SELECT query that scans the entire index, as shown in Listing 9-6.

Figure 9-7 illustrates the statistics after the scan. As you can see, In-Memory OLTP correctly identified rows as expired and unlinked them from the index row chains. However, none of the work items were enqueued in garbage collector worker items queues because the idle worker thread has not started yet.

9781484211373_Fig09-07.jpg

Figure 9-7. Memory and garbage collection statistics after scan

If you look at the statistics again after the idle worker thread execution, you will see the output shown in Figure 9-8. As you can see, the idle worker thread put items into the garbage collection worker queues where items are waiting for the user threads to deallocate them.

9781484211373_Fig09-08.jpg

Figure 9-8. Memory and garbage collection statistics after the idle worker thread cycle

If you scan the table with the query from Listing 9-6 again, you will see the statistics shown in Figure 9-9. A user thread processed and deallocated multiple items from the worker queues, releasing about 3MB of memory.

9781484211373_Fig09-09.jpg

Figure 9-9. Memory and garbage collection statistics after the second scan

The sys.dm_db_xtp_gc_cycle_stats view shows that the garbage collection idle worker thread performed just a handful of cycles (remember, I restarted SQL Server in my test environment before the test) and processed all completed transactions at once. You can see the partial output from the view in Figure 9-10.

9781484211373_Fig09-10.jpg

Figure 9-10. Sys.dm_db_xtp_gc_cycle_stats view after the test

The situation will change if you repeat entire test, deleting more rows from the table. The garbage collection process will be triggered based on the number of completed transactions in the queue rather than based on the timer.

Figure 9-11 shows the summary statistics from my environment when I repeated the test, deleting 32,768 rows in the individual transactions. Note that the garbage collection process was started at the middle of deletions rather than based on a timer.

9781484211373_Fig09-11.jpg

Figure 9-11. Memory and garbage collection statistics during the second set of tests

You can also confirm it by looking at the sys.dm_db_xtp_gc_cycle_stats view output in Figure 9-12. It shows a much higher number of cycles with very short delays in between them.

9781484211373_Fig09-12.jpg

Figure 9-12. Sys.dm_db_xtp_gc_cycle_stats view after the second test

Summary

The garbage collection process in In-Memory OLTP is designed to be non-blocking, cooperative, and scalable. Even though it is managed by a dedicated system thread (the idle worker thread) most of the work is done by the user threads. The idle worker thread wakes up every minute or when the number of completed transactions exceeds an internal threshold.

Deleted rows can be deallocated only after they are expired and their EndTs timestamp is older the than Oldest Active Transaction Timestamp in the system. Moreover, they need to be removed from all index row chains before deallocation. When user thread encounters an expired row, the thread unlinks it from the row chain. The idle worker thread periodically scans rarely accessed parts of the indexes during its dusty corners scan and processes expired rows that were missed by the user threads.

User threads provide information about completed transactions to the idle worker thread, which builds the list of work items that consist of 16-row batches to deallocate. The work items are distributed between garbage collector worker queues-one queue per scheduler in the system. In turn, user threads pickup one or several items from the worker queues and deallocate them.

Long-running and uncommitted transactions prevent rows from expiring by freezing the Oldest Active Transaction Timestamp in the system. This defers the garbage collection process and can lead to a situation where deleted rows use a large amount of memory.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.37.126