Atomic Operations

Since the race condition in the previous example was caused by the compiler generating a sequence of three instructions to increment and decrement the instance variable retainCount, one solution is to replace the compiler output from the sequence load-modify-store with a single instruction that performs an equivalent operation. In this way, there is no chance for the operation to be interrupted when execution is preempted by another thread. In reality, however, it may not always be possible to replace an operation with a single instruction. Instead, we use a sequence of instructions that behave as if they were a single instruction. This is referred to as an “atomic” operation because the result of the operation is the same as if the instruction sequence had executed as a single, indivisible group.

The implementation of an atomic operation requires support from the CPU. For example, the Intel CPU used in Macintosh computers provides an instruction to atomically add one value to another value in memory. However, this alone is not enough to make the operation atomic in a multiprocessor environment. So the Intel instruction set provides a LOCK prefix that prevents any other CPU in the system from accessing memory while the instruction is executing. Since the implementation of atomic operations relies on support that is specific to the CPU architecture, iOS devices, which use the ARM instruction set, require a different implementation for each atomic operation.

To make it easy to access atomic operations in driver code, the I/O Kit includes a number of functions that provide an atomic implementation of basic operations, such as integer addition, incrementing and decrementing a value, and bitwise operations. These functions are listed in Table 7-1, which are defined in the header file <libkern/OSAtomic.h>.

image

image

With these functions at our disposal, we are now in a position to provide an implementation of retain() and release() that avoids the race condition that was present in the previous example. This is shown in Listing 7-2, which assumes that the instance variable retainCount is a 32-bit integer.

Listing 7-2. An Implementation of Object Reference Counting in a Multithreaded Environment

void    Object::retain ()
{
        OSIncrementAtomic(&retainCount);
}

void    Object::release ()
{
        uint32_t                originalValue;
        
        originalValue = OSDecrementAtomic(&retainCount);
        if (originalValue == 1)
                this->free();
}

If we go back and examine the original implementation in Listing 7-1 and the corresponding compiler output for the release() method, we can see that the code actually contained two race conditions. The conditional call to free() occurs when the value of retainCount has been decremented to 0. However, since the compiled code reloads the value of retainCount from memory before testing its value against 0, it's possible that two calls to release() both read the value 0 and the free() method is called twice for the object, which will likely result in a crash. To illustrate how this could occur, assume that one thread executing release() has decremented the retainCount from 2 to 1 and has written the decremented value back to memory. Also assume that, before it can reload the value of retainCount from memory and test whether its value is 0, the thread is preempted. Another thread now has a chance to run and, if it were to execute release(), the retainCount would be decremented from 1 to 0 and the object would be destroyed. When execution returns to the original thread, it will reload the value of retainCount, find that it is 0, and destroy the object a second time.

This race condition is avoided in Listing 7-2 by using the value returned by OSDecrementAtomic() to determine when the final reference count has been released. The function OSDecrementAtomic() returns the original value of its parameter before it was decremented. We know that if the original value was 1, the value of retainCount has now been decremented to 0 and the object can safely be destroyed.

One group of atomic operations that deserves special mention is the compare-and-swap family of functions. The compare-and-swap operation writes a value to a memory address but, importantly, the write will only take place if the value that is being overwritten is equal to some expected value that is provided by the caller. The result of the operation is a Boolean value that indicates whether the write succeeded. Importantly, for the purposes of synchronization, the entire operation is performed atomically.

The compare and swap function can be used to build more complex atomic operations. For example, suppose we wish to implement a function to perform a bitwise AND followed by a bitwise OR, with the overall operation being atomic. Clearly, we cannot simply call OSBitAndAtomic() followed by OSBitOrAtomic() because there is nothing to prevent the execution from being preempted between the two functions. With the OSCompareAndSwap() function at our disposal, we can implement a function that atomically performs a bitwise AND followed by a bitwise OR as follows:

uint32_t        AtomicBitAndOr (uint32_t andMask, uint32_t orMask, volatile uint32_t* address)
{
        uint32_t        oldValue;
        uint32_t        newValue;
        
        do {
                oldValue = *address;
                newValue = oldValue & andMask;
                newValue = newValue | orMask;
        } while (OSCompareAndSwap(oldValue, newValue, address) == false);
        
        return oldValue;
}

You will note that we have no synchronization at all while we perform the two bitwise operations. The reason that this implementation works and is atomic is because it uses the OSCompareAndSwap() function to ensure that the value at address hasn't changed from the original value on which we based our calculation of the new value to be written. If another thread had modified the value at address while this function was executing, then the OSCompareAndSwap() function would return false and would not perform the write. As a result, we would have to go back to the beginning of the loop and repeat the entire bitwise operation after re-reading the value at address. On this next attempt, we hope to have better luck in performing the operation without another thread modifying the value at address underneath us, although we will continue retrying until we successfully write the result to memory.

images Note All atomic operations, such as OSAddAtomic(), OSIncrementAtomic(), and OSBitOrAtomic() can be implemented using only OSCompareAndSwap(). In fact, a number of atomic functions provided by the libkern library are implemented this way, including all bitwise atomic operations and the 8-bit and 16-bit variations of each operation, which perform a compare and swap on the full 32-bit word containing the value being modified.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.119.118.232