Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

18 Transactional Memory

18.1 Introduction

We now turn our attention from devising data structures and algorithms to critiquing the tools we use to solve these problems. These tools are the synchronization primitives provided by today’s architectures, encompassing various kinds of locking, both spinning and blocking, and atomic operations such as compareAndSet() and its relatives. They have mostly served us well. We, the community of multiprocessor programmers, have been able to construct many useful and elegant data structures. Nevertheless, everyone knows that the tools are flawed. In this chapter, we review and analyze the strengths and weaknesses of the standard synchronization primitives, and describe some emerging alternatives that are likely to extend, and perhaps even to displace many of today’s standard primitives.

18.1.1 What is Wrong with Locking?

Locking, as a synchronization discipline, has many pitfalls for the inexperienced programmer. Priority inversion occurs when a lower-priority thread is preempted while holding a lock needed by higher-priority threads. Convoying occurs when a thread holding a lock is descheduled, perhaps by exhausting its scheduling quantum by a page fault, or by some other kind of interrupt. While the thread holding the lock is inactive, other threads that require that lock will queue up, unable to progress. Even after the lock is released, it may take some time to drain the queue, in much the same way that an accident can slow traffic even after the debris has been cleared away. Deadlock can occur if threads attempt to lock the same objects in different orders. Deadlock avoidance can be awkward if threads must lock many objects, particularly if the set of objects is not known in advance. In the past, when highly scalable applications were rare and valuable, these hazards were avoided by deploying teams of dedicated expert programmers. Today, when highly scalable applications are becoming commonplace, the conventional approach is just too expensive.

The heart of the problem is that no one really knows how to organize and maintain large systems that rely on locking. The association between locks and data is established mostly by convention. Ultimately, it exists only in the mind of the programmer, and may be documented only in comments. Fig. 18.1 shows a typical comment from a Linux header file¹ describing the conventions governing the use of a particular kind of buffer. Over time, interpreting and observing many such conventions spelled out in this way may complicate code maintenance.

Figure 18.1 Synchronization by convention: a typical comment from the Linux kernel.

18.1.2 What is Wrong with compareAndSet()?

One way to bypass the problems of locking is to rely on atomic primitives like compareAndSet(). Algorithms that use compareAndSet() and its relatives are often hard to devise, and sometimes, though not always, have a high overhead. The principal difficulty is that nearly all synchronization primitives, whether reading, writing, or applying an atomic compareAndSet(), operate only on a single word. This restriction often forces a complex and unnatural structure on algorithms.

Let us review the lock-free queue of Chapter 10 (reproduced in Fig. 18.2), this time with an eye toward the underlying synchronization primitives.

Figure 18.2 The LockFreeQueue class: the enq() method.

A complication arises between Lines 12 and 13. The enq() method calls compareAndSet() to change both the tail node’s next field and the tail field itself to the new node. Ideally, we would like to atomically combine both compareAndSet() calls, but because these calls occur one-at-a-time both enq() and deq() must be prepared to encounter a half-finished enq() (Line 12). One way to address this problem is to introduce a multiCompareAndSet() primitive, as shown in Fig. 18.3. This method takes as arguments an array of AtomicReference<T> objects, an array of expected T values, and an array of T-values used for updates. It performs a simultaneous compareAndSet() on all array elements, and if any one fails, they all do. In more detail: if, for all i, the value of target[i] is expected[i], then set target[i]’s value to update[i] and return true. Otherwise leave target[i] unchanged, and return false.

Figure 18.3 Pseudocode for multiCompareAndSet(). This code is executed atomically.

Note that there is no obvious way to implement multiCompareAndSet() on conventional architectures. If there were, comparing the LockFreeQueue implementations in Figs. 18.2 and 18.4 illustrates how multiCompareAndSet() simplifies concurrent data structures. The complex logic of Lines 11–12 is replaced by a call to a single multiCompareAndSet() call.

Figure 18.4 The LockFreeQueue class: simplified enq() method with multiCompareAndSet().

While multi-word extensions such as multiCompareAndSet() might be useful, they do not help with another serious flaw, discussed in Section 18.1.3.

18.1.3 What is Wrong with Compositionality?

All the synchronization mechanisms we have considered so far, with or without locks, have a major drawback: they cannot easily be composed. Let us imagine that we want to dequeue an item x from queue q0 and enqueue it at another, q1. The transfer must be atomic: no concurrent thread should observe either that x has vanished, or that it is present in both queues. In Queue implementations based on monitors, each method acquires the lock internally, so it is essentially impossible to combine two method calls in this way.

Failure to compose is not restricted to mutual exclusion. Let us consider a bounded queue class whose deq() method blocks as long as the queue is empty (using either wait/notify or explicit condition objects). We imagine that we have two such queues, and we want to dequeue an item from either queue. If both queues are empty, then we want to block until an item shows up in either one. In Queue implementations based on monitors, each method provides its own conditional waiting, so it is essentially impossible to wait on two conditions in this way.

Naturally, there are always ad hoc solutions. For the atomic transfer, we could introduce a lock to be acquired by any thread attempting an atomic modification to both q0 and q1. But such a lock would be a concurrency bottleneck (no concurrent transfers) and it requires knowing in advance the identities of the two queues. Or, the queues themselves might export their synchronization state, (say, via lock() and unlock() methods), and rely on the caller to manage multi-object synchronization. Exposing synchronization state in this way would have a devastating effect on modularity, complicating interfaces, and relying on callers to follow complicated conventions. Also, this approach simply would not work for nonblocking queue implementations.

18.1.4 What can We Do about It?

We can summarize the problems with conventional synchronization primitives as follows.

Locks are hard to manage effectively, especially in large systems.

Atomic primitives such as compareAndSet() operate on only one word at a time, resulting in complex algorithms.

It is difficult to compose multiple calls to multiple objects into atomic units.

In Section 18.2, we introduce transactional memory, an emerging programming model that proposes a solution to each of these problems.

18.2 Transactions and Atomicity

A transaction is a sequence of steps executed by a single thread. Transactions must be serializable, meaning that they appear to execute sequentially, in a one-at-a-time order. Serializability is a kind of coarse-grained version of linearizability. Linearizability defined atomicity of individual objects by requiring that each method call of a given object appear to take effect instantaneously between its invocation and response, Serializability, on the other hand, defines atomicity for entire transactions, that is, blocks of code that may include calls to multiple objects. It ensures that a transaction appears to take effect between the invocation of its first call and the response to its last call.² Properly implemented, transactions do not deadlock or livelock.

We now describe some simple programming language extensions to Java that support a transactional model of synchronization. These extensions are not currently part of Java, but they illustrate the model. The features described here are a kind of average of features provided by contemporary transactional memory systems. Not all systems provide all these features: some provide weaker guarantees, some stronger. Nevertheless, understanding these features will go a long way toward understanding modern transactional memory models.

The atomic keyword delimits a transaction in much the same way the synchronized keyword delimits a critical section. While synchronized blocks acquire a specific lock, and are atomic only with respect to other synchronized blocks that acquire the same lock, an atomic block is atomic with respect to all other atomic blocks. Nested synchronized blocks can deadlock if they acquire locks in opposite orders, while nested atomic blocks cannot.

Because transactions allow atomic updates to multiple locations, they eliminate the need for multiCompareAndSet(). Fig. 18.5 shows the enq() method for a transactional queue. Let us compare this code with the lock-free code of Fig. 18.2: there is no need for AtomicReference fields, compareAndSet() calls, or retry loops. Here, the code is essentially sequential code bracketed by atomic blocks.

Figure 18.5 An unbounded transactional queue: the enq() method.

To explain how transactions are used to write concurrent programs, it is convenient to say something about how they are implemented. Transactions are executed speculatively: as a transaction executes, it makes tentative changes to objects. If it completes without encountering a synchronization conflict, then it commits (the tentative changes become permanent) or it aborts (the tentative changes are discarded).

Transactions can be nested. Transactions must be nested for simple modularity: one method should be able to start a transaction and then call another method without caring whether the nested call starts a transaction. Nested transactions are especially useful if a nested transaction can abort without aborting its parent. This property will be important when we discuss conditional synchronization later on.

Recall that atomically transferring an item from one queue to another was essentially impossible with objects that use internal monitor locks. With transactions, composing such atomic method calls is almost trivial. Fig. 18.7 shows how to compose a deq() call that dequeues an item x from a queue q0 and an enq(x) call that enqueues that item to another queue q1.

Figure 18.6 A bounded transactional queue: the enq() method with retry.

Figure 18.7 Composing atomic method calls.

What about conditional synchronization? Fig. 18.6 shows the enq() method for a bounded buffer. The method enters an atomic block (Line 2), and tests whether the buffer is full (Line 3). If so, it calls retry (Line 4), which rolls back the enclosing transaction, pauses it, and restarts it when the object state has changed. Conditional synchronization is one reason it may be convenient to roll back a nested transaction without rolling back the parent. Unlike the wait() method or explicit condition variables, retry does not easily lend itself to lost wake-up bugs.

Recall that waiting for one of several conditions to become true was impossible using objects with internal monitor condition variables. A novel aspect of retry is that such composition becomes easy. Fig. 18.8 shows a code snippet illustrating the orElse statement, which joins two or more code blocks. Here, the thread executes the first block (Line 2). If that block calls retry, then that subtransaction is rolled back, and the thread executes the second block (Line 4). If that block also calls retry, then the orElse as a whole pauses, and later reruns each of the blocks (when something changes) until one completes.

Figure 18.8 The orElse statement: waiting on multiple conditions.

In the rest of this chapter, we examine techniques for implementing transactional memory. Transactional synchronization can be implemented in hardware (HTM), in software (STM), or both. In the following sections, we examine STM implementations.

18.3 Software Transactional Memory

Unfortunately, the language support sketched in Section 18.2 is not currently available. Instead, this section describes how to support transactional synchronization using a software library. We present TinyTM, a simple Software Transactional Memory package that could be the target of the language extensions described in Section 18.2. For brevity, we ignore such important issues as nested transactions, retry, and orElse. There are two elements to a software transactional memory construction: the threads that run the transactions, and the objects that they access.

We illustrate these concepts by walking though part of a concurrent SkipList implementation, like those found in Chapter 14. This class uses a skiplist to implement a set providing the usual methods: add(x) adds x to the set, remove(x) removes x from the set, and contains(x) returns true if, and only if x is in the set.

Recall that a skiplist is a collection of linked lists. Each node in the list contains an item field (an element of the set), a key field (the item’s hash code), and a next field, which is an array of references to successor nodes in the list. Array slot zero refers to the very next node in the list, and higher-numbered slots refer to successively later successors. To find a given key, search first though higher levels, moving to a lower level each time a search overshoots. In this way, one can find an item in time logarithmic in the length of the list. (Refer to Chapter 14 for a more complete description of skiplists.)

Using TinyTM, threads communicate via shared atomic objects, which provide synchronization, ensuring that transactions cannot see one another’s uncommitted effects, and recovery, undoing the effects of aborted transactions. Fields of atomic objects are not directly accessible. Instead, they are accessed indirectly through getter and setter methods. For example, the getter method for the key field has the form

int getKey();

while the matching setter method has the form

void setKey(int value);

Accessing fields through getters and setters provides the ability to interpose transactional synchronization and recovery on each field access. Fig. 18.9 shows the complete SkipNode interface.

Figure 18.9 The SkipNode interface.

In a similar vein, the transactional SkipList implementation cannot use a standard array, because TinyTM cannot intercept access to the array. Instead, TinyTM provides an AtomicArray<T> class that serves the same functionality as a regular array.

Fig. 18.10 shows the fields and constructor for the SkipListSet class, and Fig. 18.11 shows the code for the add() method. Except for the syntactic clutter caused by the getters and setters, this code is almost identical to that of a sequential implementation. (The getter and setter calls could be generated by a compiler or preprocessor, but here we will make the calls explicit.) In Line 21 we create a new TSkipNode (transactional skip node) implementing the SkipNode interface. We will examine this class later on.

Figure 18.10 The SkipListSet class: fields and constructor.

Figure 18.11 The SkipListSet class: the add() method.

A transaction that returns a value of type T is implemented by a Callable<T> object (see Chapter 16) where the code to be executed is encapsulated in the object’s call() method.

A transactional thread (class TThread) is a thread capable of running transactions. A TThread (Fig. 18.12) runs a transaction by calling the doIt() method with the Callable<T> object as an argument. Fig. 18.13 shows a code snippet in which a transactional thread inserts a sequence of values into a skiplist tree. The list variable is a skiplist shared by multiple transactional threads. Here, the argument to doIt() is an anonymous inner class, a Java construct that allows short-lived classes to be declared in-line. The result variable is a Boolean indicating whether the value was already present in the list.

Figure 18.12 The TThread class.

Figure 18.13 Adding items to an integer list.

We now describe the TinyTM implementation in detail. In Section 18.3.1, we describe the transactional thread implementation, and then we describe how to implement transactional atomic objects.

18.3.1 Transactions and Transactional Threads

A transaction’s status is encapsulated in a thread-local Transaction object (Fig. 18.14) which can assume one of three states: ACTIVE, ABORTED, or COMMITTED (Line 2). When a transaction is created, its default status is ACTIVE (Line 11). It is convenient to define a constant Transaction. COMMITTED transaction object for threads that are not currently executing within a transaction (Line 3). The Transaction class also keeps track of each thread’s current transaction through a thread-local field local (Lines 5–8).

Figure 18.14 The Transaction class.

The commit() method tries to change the transaction state from ACTIVE to COMMITTED (Line 19), and the abort() method from ACTIVE to ABORTED(Line 22). A thread can test its current transaction state by calling getStatus() (Line 16). If a thread discovers that its current transaction has been aborted, it throws AbortedException. A thread can get and set its current transaction by calling static getLocal() or setLocal() methods.

The TThread (transactional thread) class is a subclass of the standard Java Thread class. Each transactional thread has several associated handlers. The onCommit and onAbort handlers are called when a transaction commits or aborts, and the validation handler is called when a transaction is about to commit. It returns a Boolean indicating whether the thread’s current transaction should try to commit. These handlers can be defined at run-time. Later, we will see how these handlers can be used to implement different techniques for transaction synchronization and recovery.

The doIt() method (Line 5 of Fig. 18.12) takes a Callable<T> object and executes its call() method as a transaction. It creates a new ACTIVE transaction (Line 8), and calls the transaction’s call() method. If that method throws AbortedException (Line 12), then the doIt() method simply retries the loop. Any other exception means the application has made an error (Line 13), and (for simplicity) the method throws PanicException, which prints an error message and shuts down everything. If the transaction returns, then doIt() calls the validation handler to test whether to commit (Line 16), and if the validation succeeds, then it tries to commit the transaction (Line 17). If the commit succeeds, then it runs the commit hander and returns (Line 18). Otherwise, if validation fails, it explicitly aborts the transaction. If commit fails for any reason, it runs the abort handler before retrying (Line 22).

18.3.2 Zombies and Consistency

Synchronization conflicts cause transactions to abort, but it is not always possible to halt a transaction’s thread immediately after the conflict occurs. Instead, such zombie³ transactions may continue to run even after it has become impossible for them to commit. This prospect raises another important design issue: how to prevent zombie transactions from seeing inconsistent states.

Here is how an inconsistent state could arise. An object has two fields x and y, initially 1 and 2. Each transaction preserves the invariant that y is always equal to 2x. Transaction Z reads y, and sees value 2. Transaction A changes x and y to 2 and 4, respectively, and commits. Z is now a zombie, since it keeps running, but will never commit. Z later reads x, and sees the value 2, which is inconsistent with the value it read for y.

One approach is to deny that inconsistent states are a problem. Since zombie transactions must eventually abort, their updates will be discarded, so why should we care what they observe? Unfortunately, a zombie can cause problems, even if its updates never take effect. In the scenario described earlier, where y = 2x in every consistent state, but Z has read the inconsistent value 2 for both x and y, if Z evaluates the expression

1/(x-y)

it will throw an “impossible” divide-by-zero exception, halting the thread, and possibly crashing the application. For the same reason, if Z now executes the loop

int i = x + 1; // i is 3

while (i++ != y) { // y is actually 2, should be 4

…

}

it would never terminate.

There is no practical way to avoid “impossible” exceptions and infinite loops in a programming model where invariants cannot be relied on. As a result, TinyTM guarantees that all transactions, even zombies, see consistent states.

18.3.3 Atomic Objects

As mentioned earlier, concurrent transactions communicate through shared atomic objects. As we have seen (Fig. 18.9), access to an atomic object is provided by a stylized interface that provides a set of matching getter and setter methods. The AtomicObject interface appears in Fig. 18.15.

Figure 18.15 The AtomicObject<T> abstract class.

We will need to construct two classes that implement this interface: a sequential implementation that provides no synchronization or recovery, and a transactional implementation that does. Here too, these classes could easily be generated by a compiler, but we will do these constructions by hand.

The sequential implementation is straightforward. For each matching getter–setter pair, for example:

T getItem();

void setItem(T value);

the sequential implementation defines a private field item of type T. We also require the sequential implementation to satisfy a simple Copyable<T> interface that provides a copyTo() method that copies the fields of one object to another (Fig. 18.16). As a technical matter, we also require the type to provide a no-argument constructor. For brevity, we use the term version to refer to an instance of a sequential, Copyable<T> implementation of an atomic object interface.

Figure 18.16 The Copyable<T> interface.

Fig. 18.17 shows the SSkipNode class, a sequential implementation of the SkipNode interface. This class has three parts. The class must provide a no-argument constructor for use by atomic object implementations (as described later), and may provide any other constructors convenient to the class. Second, it provides the getters and setters defined by the interface, where each getter or setter simply reads or writes its associated field. Finally, the class also implements the Copyable interface, which provides a copyTo() method that initializes one object’s fields from another’s. This method is needed to make back-up copies of the sequential object.

Figure 18.17 The SSkipNode class: a sequential SkipNode implementation.

18.3.4 Dependent or Independent Progress?

One goal of transactional memory is to free the programmer from worrying about starvation, deadlock, and “the thousand natural shocks” that locking is heir to. Nevertheless, those who implement STMs must decide which progress condition to meet.

We recall from Chapter 3 that implementations that meet strong independent progress conditions such as wait-freedom or lock-freedom guarantee that a thread always makes progress. While it is possible to design wait-free or lock-free STM systems, no one knows how to make them efficient enough to be practical.

Instead, research on nonblocking STMs has focused on weaker dependent progress conditions. There are two approaches that promise good performance: nonblocking STMs that are obstruction-free, and blocking, lock-based STMs that are deadlock-free. Like the other nonblocking conditions, obstruction-freedom ensures that not all threads can be blocked by delays or failures of other threads. This property is weaker than lock-free synchronization, because it does not guarantee progress when two or more conflicting threads are executing concurrently.

The deadlock-free property does not guarantee progress if threads halt in critical sections. Fortunately, as with many of the lock-based data structures we saw earlier, scheduling in modern operating systems can minimize the possibility of threads getting swapped out in the middle of a transaction. Like obstruction-freedom, deadlock-freedom does not guarantee progress when two or more conflicting threads are executing concurrently.

For both the nonblocking obstruction-free and blocking deadlock-free STMs, progress for conflicting transactions is guaranteed by a contention manager, a mechanism that decides when to delay contending threads, through spinning or yielding, so that some thread can always make progress.

18.3.5 Contention Managers

In TinyTM, as in many other STMs, a transaction can detect when it is about to cause a synchronization conflict. The requester transaction then consults a contention manager. The contention manager serves as an oracle,⁴ advising the transaction whether to abort the other transaction immediately, or stall to allow the other a chance to complete. Naturally, no transaction should stall forever waiting for another.

Fig. 18.18 shows a simplified base class for contention managers. It provides a single method, resolve() (Line 12), that takes two transactions, the requester’s and the other’s, and either pauses the requester, or aborts the other. It also keeps track of each thread’s local contention manager (Line 2), accessible by getLocal() and setLocal() methods (Lines 16 and 13).

Figure 18.18 Contention manager base class.

The ContentionManager class is abstract because it does not implement any conflict resolution policy. Here are some possible contention manager policies. Suppose transaction A is about to conflict with transaction B.

Backoff: A repeatedly backs off for a random duration, doubling the expected time up to some limit. When that limit is reached, it aborts B.

Priority: Each transaction takes a timestamp when it starts. If A has an older timestamp than B, it aborts B, and otherwise it waits. A transaction that restarts after an abort keeps its old timestamp, ensuring that every transaction eventually completes.

Greedy: Each transaction takes a timestamp when it starts. A aborts B if either A has an older timestamp than B, or B is waiting for another transaction. This strategy eliminates chains of waiting transactions. As in the priority policy, every transaction eventually completes.

Karma: Each transaction keeps track of how much work it has accomplished, and the transaction that has accomplished more has priority.

Fig. 18.19 shows a contention manager implementation that uses the back-off policy. The manager imposes minimum and maximum delays (Lines 2–3). The resolve() method checks whether this is the first time it has encountered the other thread (Line 8). If so, it resets its delay to the minimum, and otherwise it uses its current delay. If the current delay is less than the maximum, the thread sleeps for a random duration bounded by the delay (Line 13), and doubles the next delay. If the current delay exceeds the maximum, the caller aborts the other transaction (Line 16).

Figure 18.19 A simple contention manager implementation.

18.3.6 Implementing Atomic Objects

Linearizability requires that individual method calls appear to take place atomically. We now consider how to guarantee serializability: that multiple atomic calls have the same property.

A transactional implementation of an atomic object must provide getter and setter methods that invoke transactional synchronization and recovery. We review two alternative approaches to synchronization and recovery: the FreeObject class is obstruction-free, while the LockObject class uses locking for synchronization. These alternatives are implementations of the abstract AtomicObject class, shown in Fig. 18.15. The init() method takes the atomic object’s class as argument and records it for later use. The openRead() method returns a version suitable for reading (that is, one can call its getter methods only), while the openWrite() method returns a version that may be written (that is, one can call both getters and setters).

The validate() method returns true if and only if the value to be returned is guaranteed to be consistent. It is necessary to call validate() before returning any information extracted from an atomic object. The openRead(), openWrite(), and validate() methods are all abstract.

Fig. 18.20 shows the TSkipNode class a transactional SkipNode implementation. This class uses the LockObject atomic object implementation for synchronization and recovery (Line 8).

Figure 18.20 The TSkipNode class: a transactional SkipNode implementation.

This class has a single AtomicObject<SSkipNode> field. The constructor takes as argument values to initialize the AtomicObject<SSkipNode> field. Each getter performs the following sequence of steps.

1. It calls openRead() to extract a version.

2. It calls the version’s getter to extract the field value, which it stores in a local variable.

3. It calls validate() to ensure the value read is consistent.

The last step is needed to ensure that the object did not change between the first and second step, and that the value recorded in the second step is consistent with all the other values observed by that transaction.

Setters are implemented in a symmetric way, calling the setter in the second step.

We now have two alternative atomic object implementations. The implementations will be relatively unoptimized to simplify the presentation.

18.3.7 An Obstruction-Free Atomic Object

Recall that an algorithm is obstruction-free if any thread that runs by itself for long enough makes progress. In practice, this condition means that a thread makes progress if it runs for long enough without a synchronization conflict from a concurrent thread. Here, we describe an obstruction-free implementation of AtomicObject.

Bird’s-Eye View

Each object has three logical fields: an owner field, an old version, and a new version. (We call them logical fields because they may not be implemented as fields.) The owner is the last transaction to access the object. The old version is the object state before that transaction arrived, and the new version reflects that transaction’s updates, if any. If owner is COMMITTED, then the new version is the current object state, while if it is ABORTED, then the old version is current. If the owner is ACTIVE, there is no current version, and the future current version depends on whether the owner commits or aborts.

When a transaction starts, it creates a Transaction object to hold the transaction’s status, initially ACTIVE. If that transaction commits, it sets the status to COMMITTED, and if it is aborted by another transaction, the other transaction sets the status to ABORTED.

Each time transaction A accesses an object it first opens that object, possibly resetting the owner, old value, and new value fields. Let B be the object’s prior owner.

1. If B was COMMITTED, then the new version is current. A installs itself as the object’s current owner, sets the old version to the prior new version, and the new version to a copy of the prior new version (if the call is a setter), or to the new version itself (if the call is a getter).

2. Symmetrically, if B was ABORTED, then the old version is current. A installs itself as the object’s current owner, sets the old version to the prior old version, and the new version to a copy of the prior old version (if the call is a setter), or to the old version itself (if the call is a getter).

3. If B is still ACTIVE, then A and B conflict, so A consults the contention manager for advice whether to abort B, or to pause, giving B a chance to finish. One transaction aborts another by successfully calling compareAndSet() to change the victim’s status to ABORTED.

We leave it to the readers to extend this algorithm to allow concurrent readers.

After opening the object, the getter reads the version’s field into a local variable. Before returning that value, it calls validate() to check that the calling transaction has not been aborted. If all is well, it returns the field value to the caller. (Setters work symmetrically).

When it is time for A to commit, it calls compareAndSet() to change its status to COMMITTED. If it succeeds, the commit is complete. The next transaction to access an object owned by A will observe that A has committed, and will treat the object’s new version (the one installed by A) as current. If it fails, it has been aborted by another transaction. The next transaction to access an object updated by A will observe that A has been aborted, and will treat the object’s old version (the one prior to A) as current. Fig. 18.21 shows an example execution.

Figure 18.21 The FreeObject class: an obstruction-free atomic object implementation. Thread A has completed the writing of one object and is in the process of switching a copy of the second object that was last written by thread B. It prepares a new locator with a fresh new copy of the object and an old object field that refers to the new field of thread B’s locator. It then uses a compareAndSet() to switch the object to refer to the newly created locator.

Why It Works

Here is why every transaction observes a consistent state. When a transaction A calls a getter method to read an object field, it opens the object, installing itself as the object’s owner. If the object already had an active owner, B, then A aborts B. A then reads the field value into a local variable. Before the getter returns that value to the application, however, it calls validate() to check that the value is consistent. If another transaction C displaced A as owner of any object, then C aborted A, and A’s validation fails. It follows that if a setter returns a value, that value is consistent.

Here is why transactions are serializable. If a transaction A successfully changes its status from ACTIVE to COMMITTED, then it must still be owner of all the objects it accessed, because any transaction that usurps A’s ownership must abort A first. It follows that none of the objects it read or wrote have changed since A accessed them, so A is effectively updating a snapshot of the objects it accessed.

In Detail

Opening an object requires changing multiple fields atomically, modifying the owner, old version, and new version fields. Without locks, the only way to accomplish this atomic multi-field update is to introduce a level of indirection. As shown in Fig. 18.22, the FreeObject class has a single field, start, that is an AtomicReference to a Locator object (Line 3), which holds the object’s current transaction, old version, and new version (Lines 5–7).

Figure 18.22 The FreeObject class: the inner Locator class.

Recall that an object field is modified in the following steps: (1) call openWrite() to acquire an object version, (2) tentatively modify that version, and (3) call validate() to ensure the version is still good. Fig. 18.23 shows the FreeObject class’s openWrite() method. First, the thread tests its own transactional state (Line 14). If that state is committed, then the thread is not running in a transaction, and updates the object directly (Line 15). If that state is aborted, then the thread immediately throws an AbortedException exception (Line 16). Finally, if the transaction is active, then the thread reads the current locator and checks whether it has already opened this object for writing, returning immediately if so (Line 19). Otherwise, it enters a loop (Line 22) where it repeatedly initializes and tries to install a new locator. To determine the object’s current value, the thread checks the status of the last transaction to write the object (Line 25), using the new version if the owner is committed (Line 27) and the old version if it is ABORTED (Line 30). If the owner is still active (Line 30), there there is a synchronization conflict, and the thread calls the contention manager module to resolve the conflict. In the absence of conflict, the thread creates and initializes a new version (Lines 37–39). Finally, the thread calls compareAndSet() to replace the old locator with the new, returning if it succeeds, and retrying if it fails.

Figure 18.23 The FreeObject class: the openWrite() method.

The openRead() method (not shown) works in the same way, except that it does not need to make a copy of the old version.

The FreeObject class’s validate() method (not shown) simply checks that current thread’s transaction status is ACTIVE.

18.3.8 A Lock-Based Atomic Object

The obstruction-free implementation is somewhat inefficient because writes continually allocate locators and versions and reads must go through two levels of indirection (two references) to reach the actual data to be read. In this section, we present a more efficient atomic object implementation that uses short critical sections to eliminate the need for a locator and to remove a level of indirection.

A lock-based STM could lock each object as it is read or written. Many applications, however, follow the 80/20 rule: roughly 80% of accesses are reads and roughly 20% are writes. Locking an object is expensive, since it requires a compareAndSet() call, which seems excessive when read/write conflicts are expected to be infrequent. Is it really necessary to lock objects for reading? The answer is no.