372 Computer Architecture and Organization
assumed that instruction fetching, decoding, operand fetch, execution of the instruction
and, finally, result storage are all completed in identical time-slices. What would happen if
these time-slices were uneven? Well, in that case all related operations would remain idle until the
longest step finished. Remember our earlier car-assembly example, in which every operation needed
one day except body fabrication, which needed two: the whole process was slowed to one car per
two days. Therefore, the pipeline performs efficiently only if all related operations consume more or
less the same amount of time. Let us now investigate the real-life situation related to this assumption.
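The effect of uneven time-slices can be made concrete with a small calculation. The sketch below, with illustrative stage names and latencies (not taken from the text), shows how the clock period of a pipeline is set by its slowest stage, so a single two-unit stage doubles the total time:

```python
# Sketch: pipeline timing when stage latencies differ (illustrative numbers).
# Every stage must wait for the slowest one, so the effective cycle time is
# max(stage latencies), and n instructions through a k-stage pipeline take
# (k + n - 1) cycles.

def pipeline_time(stage_latencies, n_instructions):
    """Total time to complete n_instructions, clock set by the slowest stage."""
    k = len(stage_latencies)
    cycle = max(stage_latencies)              # slowest stage dictates the clock
    return (k + n_instructions - 1) * cycle

# Balanced stages (fetch, decode, operand fetch, execute, store), 1 unit each:
print(pipeline_time([1, 1, 1, 1, 1], 100))    # 104 time units
# One stage needs 2 units, so every cycle stretches to 2:
print(pipeline_time([1, 1, 1, 2, 1], 100))    # 208 time units
```

Just as in the car-assembly analogy, one slow stage halves the throughput of the entire line, even though the other four stages sit idle half the time.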
In general, the operands of related instructions are available within the internal registers of the pro-
cessor. Similarly, the results of different instructions are also stored within the internal registers of the
processor. Therefore, we see that the last four operations, namely decode, operand fetch, execute and
result storage, may be taken as internal operations of the processor and consume more or less the same time.
However, the very first part, the instruction fetch, must be carried out from an external memory source.
From our discussions in Chapter 7, we know that this is the slowest process and may consume ten times
more time than any internal operation of the processor.
In such a situation, our cache memory comes to the rescue. In the case of the L1 cache (the cache that is
located on the same chip as the processor), the access time is more or less the same as the access
time of the processor registers or of internal processor operations (adding two numbers, for example). Therefore, most
modern processors that implement the pipeline strategy are equipped with an on-chip L1 cache to
speed up the instruction fetch cycle.
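The benefit of the L1 cache can be quantified with the usual average-access-time formula. The hit rate and latency figures below are illustrative assumptions, not measurements from the text; they use the chapter's rough estimate that external memory is ten times slower than an internal operation:

```python
# Sketch: why an on-chip cache keeps instruction fetch in step with the other
# pipeline stages. Hit rate and latencies are illustrative assumptions.

def avg_fetch_time(hit_rate, t_cache, t_memory):
    """Average instruction-fetch time with a cache of the given hit rate."""
    return hit_rate * t_cache + (1 - hit_rate) * t_memory

# Assume a register-speed L1 cache (1 unit) and main memory ten times slower.
print(avg_fetch_time(0.95, 1, 10))   # 1.45 units: close to an internal operation
```

With a 95 per cent hit rate, the average fetch time drops from 10 units to 1.45 units, which is near enough to the other stage times for the pipeline to stay balanced most of the time.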
The reader may ask: what would be the situation during a cache miss? Well, that can always
happen, but it would not be frequent. This and some other issues are labelled as pipeline hazards,
which we shall discuss now.
12.3 PIPELINE PERFORMANCE
In Sections 12.2.1 and 12.2.2, we considered an ideal environment for explaining the basic prin-
ciples of the pipeline strategy. However, considering real-life situations, we find that there are quite a
few circumstances where many of our simplified assumptions would not hold good. In the following sec-
tions, we shall try to identify those situations and describe the related problems. Simultaneously, we
shall discuss some of the methods to overcome these problems.
12.3.1 Stalling
Stalling is a generalized term for a pipeline hazard. Whenever the processor is not able to complete
an assigned cycle within the predefined time-slice, it demands extra time, which forces the other related
cycles to wait idle. This general condition is designated as stalling. For example, let us reconsider the
example illustrated in Figure 12.4, with a minor variation.
Let us assume that, for whatever reason, the execution of instruction 3 was
not completed within its assigned time-slice t6. It takes an extra time-slice t7 before it com-
pletes, as illustrated in Figure 12.5. The reason may be an I/O wait for some operand, or the instruc-
tion itself may be one (like multiply or divide) whose execution could not be completed
within t6. Therefore, all cycles belonging to the following instructions must remain idle (inac-
tive) during t7 until the execution of instruction 3 completes. This may be referred to as the processor
stalling during t7.
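The knock-on effect of such a stall can be sketched in a few lines of code. The model below is an illustration of the scenario just described, not an implementation from the text: in a five-stage pipeline, giving instruction 3 one extra execute cycle delays the completion of every later instruction by one cycle:

```python
# Sketch: completion cycles in a 5-stage pipeline when one instruction stalls.
# An extra cycle for instruction 3 inserts a bubble that delays all later
# instructions by the same amount. Purely illustrative.

def completion_cycles(n_instructions, stages=5, extra_cycles=None):
    """Cycle in which each instruction completes; extra_cycles maps an
    instruction number to the additional cycles it needs (a stall)."""
    extra_cycles = extra_cycles or {}
    delay = 0
    finish = []
    for i in range(1, n_instructions + 1):
        delay += extra_cycles.get(i, 0)   # a stall delays every later instruction
        finish.append(stages + i - 1 + delay)
    return finish

print(completion_cycles(5))                       # [5, 6, 7, 8, 9]  -- no stall
print(completion_cycles(5, extra_cycles={3: 1}))  # [5, 6, 8, 9, 10] -- bubble
```

Note that instructions 1 and 2 are unaffected, while instructions 3, 4 and 5 all slip by one cycle: the single stall propagates down the rest of the pipeline, exactly as in Figure 12.5.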