An Overview of the 386DX FSB

Figure 5-2 on page 44 illustrates the address- and data-related signals on the 386DX processor's FSB. Although the 386 processor implemented a full 32-bit internal address bus, the two least-significant address lines, A[1:0], were not implemented as output pins on the FSB.

Figure 5-2. 386DX FSB


Address Bus Selects Dword

Whenever the processor initiated a transaction on the FSB, logic external to the processor behaved as if the two least-significant address lines were always zero. As a result, the processor could only output addresses divisible by four. For example, it could address location 00000100h, but not 00000101h, 00000102h, or 00000103h. In other words, the address output on A[31:2] selected a dword (i.e., a group of four locations starting at an address divisible by four) in either memory or IO address space (as defined by the transaction type).
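The split between the dword address and the location within the dword can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name split_address is hypothetical and does not come from any processor documentation.

```python
def split_address(addr):
    """Split a 32-bit byte address into (dword address, location within dword)."""
    addr &= 0xFFFFFFFF           # keep the address within 32 bits
    dword_addr = addr & ~0x3     # what external logic sees: A[31:2] with A[1:0] taken as zero
    location = addr & 0x3        # the two bits that never appear on the address pins
    return dword_addr, location


if __name__ == "__main__":
    for a in (0x100, 0x101, 0x102, 0x103, 0x104):
        dword_addr, location = split_address(a)
        print(f"{a:08X}h -> dword {dword_addr:08X}h, location {location}")
```

All four addresses from 00000100h through 00000103h map to the same dword address; only the location within the dword differs.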

Byte Enables Select Location(s) in Dword

In addition, the processor implemented four output pins designated as Byte Enable (BE) pins 3:0 (BE[3:0]#). The dword selector address is output on A[31:2] and the Byte Enable pins asserted by the processor indicate which of the four locations within the currently addressed dword are being selected for a read or a write (as defined by the transaction type). Refer to Table 5-1.

Table 5-1. 386 Byte Enables
Byte Enable Asserted    Description
BE0#                    When asserted, indicates that location zero in the selected dword is being addressed and that the byte to be read or written will be transferred over data path 0 (D[7:0]).
BE1#                    When asserted, indicates that location one in the selected dword is being addressed and that the byte to be read or written will be transferred over data path 1 (D[15:8]).
BE2#                    When asserted, indicates that location two in the selected dword is being addressed and that the byte to be read or written will be transferred over data path 2 (D[23:16]).
BE3#                    When asserted, indicates that location three in the selected dword is being addressed and that the byte to be read or written will be transferred over data path 3 (D[31:24]).
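The mapping in Table 5-1 can be expressed as a small lookup. The sketch below assumes an access that fits entirely within one dword; the names DATA_PATHS and byte_enables are hypothetical and used only for illustration.

```python
# Data path used by each location in the dword, as listed in Table 5-1.
DATA_PATHS = {0: "D[7:0]", 1: "D[15:8]", 2: "D[23:16]", 3: "D[31:24]"}


def byte_enables(location, size):
    """Return (byte enable, data path) pairs for `size` bytes starting at
    `location` (0 to 3) within a single dword."""
    if not (0 <= location <= 3 and 1 <= size <= 4 - location):
        raise ValueError("access must fit within one dword")
    return [(f"BE{i}#", DATA_PATHS[i]) for i in range(location, location + size)]


if __name__ == "__main__":
    # A two-byte access starting at location two of the dword.
    for pin, path in byte_enables(2, 2):
        print(pin, "->", path)
```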

Misaligned Transfers Affect Performance

Because A[31:2] carries only one dword address, the processor can address only a single dword per transaction in which to perform a read or write. Consider the following example:

mov     eax,[0101h]

When executed, this instruction causes the processor to load the 32-bit EAX register with the four bytes from memory locations 00000101h through 00000104h. These are the last three locations in the dword that starts at 00000100h and the first location in the dword that starts at location 00000104h. In order to read these four locations, the processor must:

  • Perform a memory data read transaction from the dword starting at location 00000100h. It asserts BE1#, BE2# and BE3#, indicating a read from locations 00000101h through 00000103h.

  • Perform a memory data read transaction from the dword starting at location 00000104h. It asserts BE0#, indicating a read from location 00000104h.

This scenario came about because the programmer (or the compiler) did not pay attention to alignment when this 32-bit data object was created in memory. Because it straddles two dwords, the processor must perform two transactions on its FSB whenever it must read or update this data object. This will negatively affect performance. The 386 processor did not provide the ability to flag this condition to the programmer as something that should be fixed in order to optimize execution speed. Starting with the 486 processor, all IA32 processors implement a mechanism to flag this condition (refer to “Alignment Check Exception (17)” on page 321).
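The way a misaligned access breaks into separate transactions can be sketched as follows. This is a simplified model, not a description of the processor's internal logic: the function name bus_transactions is hypothetical, wraparound at the top of the address space is ignored, and the order of the returned list follows the order of the steps above rather than making any claim about the order of cycles on a real bus.

```python
def bus_transactions(addr, size):
    """Return a list of (dword address, asserted byte enables) pairs needed
    to access `size` bytes starting at byte address `addr`."""
    transactions = []
    end = addr + size                 # one past the last byte accessed
    dword = addr & ~0x3               # dword containing the first byte
    while dword < end:
        first = max(addr, dword) - dword       # first location used in this dword
        last = min(end, dword + 4) - dword     # one past the last location used
        enables = [f"BE{i}#" for i in range(first, last)]
        transactions.append((dword, enables))
        dword += 4
    return transactions


if __name__ == "__main__":
    for address in (0x100, 0x101):
        print(f"4-byte access at {address:08X}h:")
        for dword, enables in bus_transactions(address, 4):
            print(f"  dword {dword:08X}h, asserts {' '.join(enables)}")
```

For the aligned address 00000100h the model yields one transaction with all four byte enables asserted; for 00000101h it yields the two transactions described above.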

Alignment Is Important!

As indicated in the previous section, misalignment of multi-byte data objects in memory can negatively affect performance. This is true in all IA32 processor implementations. If a multi-byte data object straddles a dword address boundary, it may also:

  • straddle a cache line boundary. In a post-386 processor, this may result in a double cache miss, causing the processor to perform two full cache line reads on its FSB. Not only is this time consuming for the processor that experienced the double miss, but it also consumes FSB bandwidth, making the FSB less available to other entities on it.

  • straddle a page address boundary. This could result in a double Page Fault. In order to obtain the multi-byte data object from memory, the OS Page Fault exception handler would have to read two full pages (4KB each) from disk into memory. Only then could the multi-byte data object be read from memory. See “386 Demand Mode Paging” on page 209 for a detailed description of Paging.
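A simple check for whether a data object crosses a dword, cache line, or page boundary might look like the sketch below. The 4KB page size comes from the text; the 32-byte cache line size is an assumption made only for illustration, since the line size differs between processor generations. The names straddles, DWORD, CACHE_LINE, and PAGE are hypothetical.

```python
DWORD = 4         # bytes per dword
CACHE_LINE = 32   # ASSUMED line size for illustration; varies by processor
PAGE = 4096       # 4KB page, as stated in the text


def straddles(addr, size, boundary):
    """Return True if the `size`-byte object at `addr` crosses a multiple of
    `boundary`, i.e. its first and last bytes fall in different blocks."""
    return addr // boundary != (addr + size - 1) // boundary


if __name__ == "__main__":
    for address in (0x0100, 0x0101, 0x001E, 0x0FFE):
        crossed = [name for name, b in (("dword", DWORD),
                                        ("cache line", CACHE_LINE),
                                        ("page", PAGE))
                   if straddles(address, 4, b)]
        print(f"{address:08X}h:", ", ".join(crossed) or "aligned within all three")
```

Because each of these sizes evenly divides the next larger one, an object that crosses a page boundary in this model also crosses a cache line boundary and a dword boundary.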
