Chapter 7 - Basic Cache Coherence Issues (12/12)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

260 CHAPTER 7. BASIC CACHE COHERENCE ISSUES

Write2 I M I BusRdX/Flush P1.C 90

Write3 I I M BusRdX/Flush P2.C 90

TOTAL 544

Stream 3 (MESI):

Processor P1 P2 P3 Bus Action Data From Cycles

Read1 E - - BusRd(!C) Mem 90

Read2 S S - BusRd(C)/FlushOpt P1.C 90

Read3 S S S BusRd(C)/FlushOpt P1/2.C 90

Read3 S S S - - 1

Write1 M I I BusUpgr - 60

Write1 M I I - - 1

Write2 I M I BusRdX/Flush P1.C 90

Write3 I I M BusRdX/Flush P2.C 90

TOTAL 514

Stream 3 (MOESI):

Processor P1 P2 P3 Bus Action Data From Cycles

Read1 E - - BusRd(!C) Mem 90

Read2 S S - BusRd(C)/FlushOpt P1.C 90

Read3 S S S BusRd(C) Mem 90

Read3 S S S - - 1

Write1 M I I BusUpgr - 60

Write1 M I I - - 1

Write2 I M I BusRdX/Flush P1.C 90

Write3 I I M BusRdX/Flush P2.C 90

TOTAL 514

2. Dragon Protocol

For each of the memory reference streams given in the following, compare the cost of ex-

ecuting it on a bus-based machine that supports Dragon protocol. All the references (read-

/write) in the streams are to the same location, and the digit refers to the processor issuing

the reference. Assume that all caches are initially empty, and use the following cost model: a

read/write cache hit takes 1 cycle; a read/write cache miss which does not involve transferring

a whole block on the bus takes 60 cycles to complete; and a read/write miss which involves

BusUpd. Assume all caches are write back and write allocated.

7.6. EXERCISES 261

The ﬁrst stream is: R1 (read by processor 1), W1 (write by processor 1), R1, W1, R2, W2,

R2, W2, R3, W3, R3, W3.

The second stream is: R1, R2, R3, W1, W2, W3, R1, R2, R3, W3, W1.

The third stream is: R1, R2, R3, R3, W1, W1, W1, W1, W2, W3.

Answer:

Stream 1 (Dragon):

Processor P1 P2 P3 Bus Action Data From Cycles

Read1 E - - BusRd(!C) Mem 90

Write1 M - - - - 1

Read1 M - - - - 1

Write1 M - - - - 1

Read2 Sm Sc - BusRd(C)/Flush P1.C 90

Write2 Sc Sm - BusUpd(C) - 60

Read2 Sc Sm - - - 1

Write2 Sc Sm - BusUpd(C) - 60

Read3 Sc Sm Sc BusRd(C)/Flush P2.C 90

Write3 Sc Sc Sm BusUpd(C) - 60

Read3 Sc Sc Sm - - 1

Write3 Sc Sc Sm BusUpd(C) - 60

TOTAL 515

Stream 2 (Dragon):

Processor P1 P2 P3 Bus Action Data From Cycles

Read1 E - - BusRd(!C) Mem 90

Read2 Sc Sc - BusRd(C) Mem 90

Read3 Sc Sc Sc BusRd(C) Mem 90

Write1 Sm Sc Sc BusUpd(C) - 60

Write2 Sc Sm Sc BusUpd(C) - 60

Write3 Sc Sc Sm BusUpd(C) - 60

Read1 Sc Sc Sm - - 1

Read2 Sc Sc Sm - - 1

Read3 Sc Sc Sm - - 1

Write3 Sc Sc Sm BusUpd(C) - 60

Write1 Sm Sc Sc BusUpd(C) - 60

TOTAL 573

Stream 3 (Dragon):

Processor P1 P2 P3 Bus Action Data From Cycles

Read1 E - - BusRd(!C) Mem 90

Read2 Sc Sc - BusRd(C) Mem 90

262 CHAPTER 7. BASIC CACHE COHERENCE ISSUES

Read3 Sc Sc Sc BusRd(C) Mem 90

Read3 Sc Sc Sc - - 1

Write1 Sm Sc Sc BusUpd(C) - 60

Write2 Sc Sm Sc BusUpd(C) - 60

Write3 Sc Sc Sm BusUpd(C) - 60

TOTAL 631

Homework Problems

1. Suppose a 4-processor system uses a bus-based broadcast/snoopy coherence protocol. As-

sume that the cost of processing a protocol transaction depends on where data is supplied

and whether it involves a bus transaction. A cache hit not involving a bus transaction costs 1

cycle, a cache hit involving a bus transaction costs 10 cycles. A cache miss with data supplied

by another cache costs 50 cycles. A cache miss with data supplied by the main memory costs

80 cycles. The caches uses MSI coherence protocol. Show what happens to cache states after

each memory reference. The memory reference stream is: R1 (read by processor 1), R2, R3,

R2, W3 (write by processor 3), E3 (eviction by processor 3), W4, R2, R3, W1.

2. Repeat homework problem (1) for MESI protocol.

3. Repeat homework problem (1) for MOESI protocol.

4. Repeat homework problem (1) for Dragon protocol. Additionally, assume that any memory

reference that involves a BusUpd transaction requires 50 clock cycles to complete.

5. Some systems have implemented MOSI protocol to allow dirty sharing, which consists of

four states: modiﬁed, owned, shared, and invalid. The meaning of the states are:

• Modiﬁed: data is dirty and can only be kept in one cache

• Owned: data is dirty and may be kept in multiple caches, but only one of them has it in

the owned state

• Shared: data may be clean/dirty, and may be kept in multiple caches

• Invalid: data is uncached or invalid

Assume a bus-based multiprocessor with write back caches. The following bus transactions

are available: BusRd, BusRdX, BusUpgr, and Flush. FlushOpt is not implemented. Show

the state transition diagram for processor-initiated requests, and state transition diagram for

snooped bus requests.

6. Compare the latencies of MSI and MESI for the following cases: (a) read latency for data

that was earlier read by one other processor, (b) write latency for data that is read then written

by only one processor, and (3) read latency for data that is written by one processor and then

read by another processor.

7.6. EXERCISES 263

7. Compare the latencies of MESI and MOESI for the following cases: (a) read latency for data

that was earlier read by one other processor, (b) write latency for data that is read then written

by only one processor, (c) read latency for data that is written by one processor then read by

another processor, (d) read latency for clean data kept in other caches, and (e) bandwidth to

the main memory for a read to data recently written by another processor.

8. Non-Silent Replacement Suppose that the MESI protocol is modiﬁed so that the replacement

of a clean block is no longer silent, i.e., it is broadcasted on the bus. This way a cache can

detect if it becomes the only cache that holds a copy of the block, and can transition the state

of the block from shared (S) to exclusive (E). Explain the advantages and drawbacks of this

modiﬁcation. Draw the cache coherence protocol diagram for the modiﬁed MESI protocol.

9. MOESI Problem In the MOESI protocol, if a block in the owned (O) is evicted from the

cache, the block is written back to the main memory, even though there may be other caches

which hold the block in a shared (S) state. This write back incurs an unnecessary off-chip

bandwidth consumption. Propose a solution to avoid having to write back the block, or at

least reduce the frequency of such write backs. Draw the cache coherence protocol diagram

for your solution.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 7 - Basic Cache Coherence Issues (12/12)

Create new playlist

Sign In

Sign Up

Table of Contents for
Chapter 7 - Basic Cache Coherence Issues (12/12)