i
i
i
i
i
i
i
i
202 CHAPTER 6. INTRODUCTION TO SHARED MEMORY MULTIPROCESSORS
6.4 Exercises
Worked Problems
1. Register allocation. Suppose that you are writing a compiler algorithm that decides what
variable can be allocated in a register. A register allocation allows the variable to be accessed
in the register without involving a load or store to read from or write to it. In order not to create
cache coherence problem, which of the following types of variables can be register allocated
vs. cannot be register allocated: read-only, read/write non-conflicting, and r/w conflicting?
Explain. (see Section 3.6 for what each variable type refers to).
Answer:
Variable type Can be register allocated?
Read-only Yes
R/W Non-Conflicting Yes
R/W Conflicting No
R/W conflicting variables are written and read by different threads. Hence, they cannot be
register allocated because a write by a processor to its registers is not visible to other threads,
hence write propagation no longer occurs.
2. Cache coherence. Assume that we have a two-processor system (P1 and P2). These pro-
cessors run different threads (T1 and T2, respectively) that access a data block containing
variable sum, in the following sequence in time:
T1: prefetch the block containing sum from &sum (i.e., the location of sum)
T2: read the value of sum from &sum
T2: add 7 to sum and write into &sum
T1: read the value of sum from &sum
T1: add 3 to sum and write into &sum
T1: read sum to print its value
Each access may hit or miss in the cache, depending on the cache configuration that will be
specified. For each situation below, determine if there is a correctness issue. Also give the
final values in all caches and in the main memory.
(a) The two processors P1 and P2 share an only single level cache (the L1 cache) that uses
a write back policy.
(b) Each processor has its own single-level write through cache. Does your answer depend
on whether the cache has a write allocate or write no-allocate policy?
(c) Each processor has its own L1 cache that uses evict-on-write policy, i.e., a block is
replaced immediately after it is written to. There are no other levels of caches in the
system.
(d) Each processor has a cache, but the block containing sum is not allowed to be cached.