i
i
i
i
i
i
i
i
10.3. BASIC DIRECTORY CACHE COHERENCE PROTOCOL 341
to all current sharers. The data reply message sent to the requestor also includes the number of
invalidation acknowledgments the requestor can expect to receive. The directory state transitions to
EM. If it is an upgrade request, however, the directory sends invalidations to sharers, and a reply
without data to the requestor.
Finally, when the directory state is EM, if a read request is received, then the new state will be
shared, and an intervention request is sent to the owner of the block. The intervention request also
contains the ID of the requestor so that the owner knows where to send the block. If a read exclusive
request is received, then a new processor wants to write to the block, so the directory needs to send
an invalidation to the current owner. The invalidation request also contains the ID of the requestor
so that the owner knows where to send the block. Note that the block may be in either exclusive
or modified state at the owner; the directory does not know which is the case. In either case, upon
receiving intervention or invalidation, the owner must respond appropriately.
To illustrate the operation of the protocol, consider an example of a stream of memory accesses
to a single memory block shown in Table 10.1, assuming a three-node multiprocessor in which each
node has a processor, a cache, and a memory with a directory of all local memory blocks. We
assume the directory format consists of the directory state and a full-bit sharing vector. The table
shows requests made by various processors in the following format: Rx or Wx, where R stands for
a read request, W stands for a write request, and x stands for the ID of the processor making the
request. Messages generated as a result of the processor request are also shown, in the following
format: Msgtype(Src Dest).
Note that in this illustration, we assume that all messages that correspond to each request occur
atomically, i.e., messages from different requests are not overlapped. Overlapping of messages from
different requests can introduce potential correctness problems and must be handled correctly. They
will be discussed in later sections. For now, we assume that requests are processed one at a time.
The example does not make any assumption on where the home of the block resides. Each time
a message is sent from a processor to another processor or the home node, it is counted as one
hop and is shown in one line. A message that cannot be sent before a prior message is received
indicates message dependence, and they must be sent in two different hops. Messages that do not
have dependence can be sent in parallel, denoted with the symbol “//” in the table.
Initially, the block is uncached, and hence the directory content indicates the uncached (U) state,
and the full bit sharing vector has 0 in all bits. When processor P1 wants to read the block, it sends
a read request to the home node (first hop). The home node checks the directory and finds that
the block is uncached, hence it has the latest value of the block. It then sends a ReplyD message
containing the data value of the block to P1. It also changes the state of the block to EM, with the
sharing vector bit for P1 set to “1” (the sharing full-bit vector becomes “100”).
Next, P1 wants to write to the block that it has just brought into its cache. It checks the state of
the block, and since the block is in the exclusive state, it changes the state to modified and writes to
the block silently, i.e., without sending any messages. The directory is unaware of this action, but it
does not need to, because the directory state of the block is already EM.
When P3 wants to read from the block, it suffers a cache miss, and this causes a Read message
to be sent to the home node. The home node checks its directory and finds that the state is EM and
the block is exclusively cached by P1. Since it does not know whether the block is clean or dirty
in P1, it assumes that the block is dirty and sends a intervention message to it. In the message, it
includes the ID of P3 so that later P1 can send the data directly to P3. Home then includes P3 as a
i
i
i
i
i
i
i
i
342 CHAPTER 10. ADVANCED CACHE COHERENCE ISSUES
Table 10.1: Illustration of directory-based coherence protocol operations.
Request P1 P2 P3 Directory Messages #Hops
Initially U, 000
R1 E EM, 100 Read(P1H) 2
ReplyD(HP1)
W1 M EM, 100 0
R3 S S S, 101 Read(P3H) 3
Int(HP1)
Flush(P1H // P1P3)
W3 I M EM, 001 Upgr(P3H) 3
Reply(HP3) // Inv(HP1)
InvAck(P1P3)
R1 S S S, 101 Read(P1H) 3
Int(HP3)
Flush(P3P1 // P3H)
R3 S S S, 101 0
R2 S S S S, 111 Read(P2H) 2
ReplyD(HP2)
sharer by updating its sharing vector to “101”. When P1 receives the intervention request, it sends
its data block to P3 using the flush message. It downgrades its block state to shared. In MESI, the
shared state implies a clean block, thus it also needs to send a separate flush message to the home
node containing the data value of the block. The home node uses the flushed block to update the
block in the local memory. This transaction proceeds in three hops: from P3 to home, home to P1,
and P1 to home and P3.
After reading the block, P3 wants to write to it. It finds the block in the shared state in its cache,
so it does not have permission to write to the block. Hence, it sends an upgrade request using the
Upgr message to the home node. The home node finds that currently P1 also caches the block and
must be invalidated. Thus, it sends an Inv message to P1. In parallel, it sends a reply to P3 to notify
it to expect an invalidation acknowledgment from P1. Home also changes the directory state to
EM, and removes P1 from its sharing vector, so the new sharing bit vector value is “001”. When
P1 receives the invalidation message from home, it invalidates its block and sends an invalidation
acknowledgment message (InvAck) to P3. When P3 receives the InvAck from P1, it knows that it
can now change its block state to modified and write to the block.
Later, P1 wants to read from the block that was just invalidated in the previous event. P1 suffers
a cache miss as it finds the block in state invalid in its cache, so it sends out a Read message to
home. Home finds the block is in state EM and the sharing vector indicates that P3 has the block.
So it sends an intervention message to P3, and P3 responds by downgrading its cache state to shared
and flushes the block to both P1 and home. Home changes the directory state to shared, updates the
sharing vector to “101”, and uses the flushed block to update the block value in the local memory.
When P3 wants to read from the block, it finds it in its cache in a shared state. This results in a
cache hit, and no messages need to be sent.
i
i
i
i
i
i
i
i
10.4. IMPLEMENTATION CORRECTNESS AND PERFORMANCE 343
Finally, P2 wants to read the block. It sends a read request to home. The directory state at home
is shared, indicating that the block is clean. Hence, the home sends a ReplyD message containing
the block that it fetches from its local memory.
Let us now compare the basic directory coherence protocol versus a broadcast coherence pro-
tocol. First, it is clearly more complicated, with more message types than in a bus-based multipro-
cessor, the maintenance of a directory at home nodes, and possible protocol races when multiple
messages from different requests are overlapped. In addition, in a bus-based multiprocessor, all
transactions only incur at most two hops (the request and subsequent reply on the bus), while direc-
tory coherence sometimes requires three-hop transactions. In addition, cache-to-cache transfer that
can be employed in a bus-based multiprocessor cannot be applied with directory-based coherence.
Since all read or write misses need to go to home, the fastest way to supply data is when the home
has them in the local memory. In this case, no cache to cache transfer can be exploited.
10.4 Implementation Correctness and Performance
For our simple protocol we discussed in the previous section, we have assumed that (1) the directory
state and its sharing vector reflects the most up to date state in which the block is cached, and (2)
messages due to a request are processed atomically (without being overlapped with one another). In
real systems, both assumptions do not necessarily apply. This results in various protocol races that
need to be handled properly. Some of them can be handled relatively simply by the directory alone.
But in other cases, the directory alone cannot handle them correctly. The behavior of the cache
coherence controllers at each node must also be modified to aid the directory. We will now discuss
how various races can be handled. Note that it is not the purpose of this chapter to exhaustively
discuss all possible races. We will mostly focus on major races to give readers a flavor of the
complexity in handling them.
10.4.1 Handling Races Due to Out-of-Sync Directory State
The reason why the directory state and cache states can get out of sync is because they are not
updated in lock step, and some events in the caches are never seen by the directory state. For
example, a block may be replaced from a cache, and the corresponding node is no longer a sharer.
However, the directory does not see this event and still thinks that the node is a sharer of the block.
Several odd situations can occur due to the directory having an inconsistent view of the cache states.
Consider when the directory thinks that a node is a sharer of a block, caching the block in a
clean shared state. Let us call the node A. However, the block is replaced by the node A from its
cache silently. When the directory receives a read exclusive request to the block from another node,
it sends invalidation messages to all nodes that the directory thinks as sharers of the block, including
node A. This results in a situation in which a node receives an invalidation message to a block that
it no longer has in its cache. In this case, the correct response is to acknowledge the receipt of the
invalidation by sending an invalidation acknowledgment message to the requester. No harm is done
here since the intended final state for the block is invalid, and the block is already uncached (which
is similar to being invalidated).
Another situation is when the directory thinks a node is already a sharer of a block, but the
directory receives a read request from the node. This situation occurs because the node that kept the
block in its cache has silently evicted the block, while the directory still thinks the node is a sharer.
i
i
i
i
i
i
i
i
344 CHAPTER 10. ADVANCED CACHE COHERENCE ISSUES
When the node wants to read from the block, it suffers a read miss, and as a result, a read request
is sent to the directory. Handling this case is also simple. The directory can reply with data to the
requestor and keep the sharing bit vector unchanged, since the bit vector already reflects the node
as a sharer.
When the directory indicates that a block is exclusively cached (the directory state is EM),
similar odd situations can occur. For example, a Read or ReadX request may arrive at the directory
from a node that the directory thinks as caching the block exclusively (in exclusive or modified
state). In this case, apparently the node that owns the block has evicted the block. If the block was
clean, no write back (or flush) had occurred, whereas if the block was dirty, the flush message has
yet to reach the directory. The directory cannot just reply with data since it may not have the latest
data. In contrast to the case in which the directory state is shared, with an EM state, the directory
cannot reply with data just yet. However, it cannot just wait for the flushed block to arrive either
because it may never come (the block might be clean and was evicted silently by the requestor while
it was in the exclusive state). Therefore, this is not a problem that can be solved just by the protocol
at the directory alone. The coherence controller at each processor node must also participate. In
particular, when the processor evicts a dirty block and flushes the block to the main memory at
the home node, it must keep track of whether the write back has been completed or not, with a
structure called the outstanding transaction buffer. The home node must send an acknowledgment
upon the receipt of a flush message. In addition, the coherence controller at the processor side must
also delay Read or ReadX requests to a block that is being flushed, until it gets an acknowledgment
from the directory that the flushed block has been received. This way, the directory will never
see a Read/ReadX request to a block from a node that is still flushing it. Thus, when it receives
a Read/ReadX request, the directory knows that it must be the case that the block was clean and
evicted silently. Therefore, it can safely reply to the requestor with the data value of the block.
The finite state machine corresponding to the coherence protocol employed at the directory is
shown in Figure 10.4. The left hand side of the slash symbol (“/”) is the request received from
the processor by the directory, and the sender of the message (the requesting node) is shown in the
brackets. The right hand side of the slash is the messages sent by the directory as a response to
the request received from the processor. The destination of the message is shown in the brackets.
Only state transitions that are new compared to Figure 10.3 are labeled. The coherence action in the
directory that manipulates the sharing bit vector is not shown in the figure.
Note that because the directory may be out-of-sync with the cache states, there may be requests
arriving to the directory from the owner (i.e., the node that the directory thinks as caching a block
in exclusive or modified state). In order to distinguish requests from the current owner versus from
other processors, the EM state is split into two: EM
A
which indicates that the directory thinks that
the current owner is node A, and EM
B
which indicates that the directory thinks that the current
owner is node B. Nodes A and B can be any nodes as long as they are distinct from one another.
When the current state is EM
A
, how the protocol reacts depends on whether the request comes
from the current owner (A) or from other nodes (B). Likewise, when the current state is EM
B
,
how the protocol reacts depends on whether the request comes from the current owner (B) or from
other nodes (A). Since the EM state is now split into two, there are now two additional edges that
represent transitions between the two EM states.
Since the processing of requests from a current sharer or from non-sharer nodes is the same, the
shared state is not split into two.
i
i
i
i
i
i
i
i
10.4. IMPLEMENTATION CORRECTNESS AND PERFORMANCE 345
EMB
EMA
S
U
ReadX[A]/Reply[A],Inv[B]
ReadX[A]/ReplyD[A]
ReadX[B]/Reply[B],Inv[A]
Read[A]/ReplyD[A]
ReadX[B]/ReplyD[B]
Read[B]/ReplyD[B]
Read/ReplyD
Figure 10.4: Simplified directory state finite state coherence protocol diagram, with modifi-
cations to handle out-of-sync view of cache states at the directory.
In the shared (S) state, now it is possible to receive a read request from the node that the directory
thinks a sharer. Since the directory already recognizes the node as a sharer, it is correct to respond
with the data block through a ReplyD message.
Suppose the directory state is EM
A
(identical responses are produced when the state is EM
B
).
If a read request coming from a different node (B) is received, then the new state will be shared,
and an intervention request is sent to the owner of the block (A). The intervention request also
contains the ID of the requestor so that the owner knows where to send the block. If a read exclusive
request from another node (B) is received, then the directory sends an invalidation message to the
current owner (A), sends a reply message to B telling it to expect to receive a reply from A, and
ownership transfer is indicated by transitioning from state EM
A
to EM
B
. The invalidation request
also contains the ID of the requestor so that the owner knows where to send the block to. Note that
the block may be in state either exclusive or modified at the owner; the directory does not know. In
either case, upon receiving intervention or invalidation, the owner must respond appropriately.
Let us now discuss the new case that occurs due to the directory having an out-of-sync view of
the cache states. When the state is EM
A
, with an out-of-sync view of cache states, the directory
may receive a read or read exclusive request from node A. This is because node A suffers a read or
write miss after the clean block was silently evicted from its cache (if it had been dirty, the request
would have been stalled at the requestor and instead the directory would receive a flush message
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.118.122.46