The Cache Management Instructions

The instructions described in Table 40-11 on page 1057 can be used to manage the processor caches.

Table 40-11. Cache Management Instructions
Instruction    Description
CLFLUSH    Cache Line Flush (introduced in the Pentium® 4):
  • When executed, the processor uses the linear address of the specified one-byte memory operand to perform a lookup in all of the processor's caches (including the Trace Cache). The instruction has no effect if the line is not in any of the caches. Otherwise, the line is invalidated in every cache that contains it. If the line is in the modified state, it is written back to system memory before it is invalidated.

  • The memory type (as specified by the MTRRs and the PTE or PDE) does not affect this instruction in any way.

  • A processor's support for this instruction is determined by executing a CPUID request type 1. The instruction is supported if EDX[19] = 1. In addition, the cache line size (in qwords) is returned in EBX[15:8].

  • Although this instruction was first introduced in the Pentium® 4 as part of the SSE2 instruction set, it has its own CPUID feature bit. A future IA-32 processor could therefore support both SSE2 and CLFLUSH, CLFLUSH but not SSE2, or neither of them (although the author thinks this highly unlikely).

  • Because speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to the PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache just before, during, or after the execution of a CLFLUSH instruction that references the cache line).

  • CLFLUSH is only ordered by the MFENCE instruction and is not guaranteed to be ordered by any other fencing or serializing instructions (see “Serializing Instructions” on page 1079) or by another CLFLUSH instruction.

  • CLFLUSH can be executed at any privilege level and is subject to all permission checking and faults associated with a byte load.

  • CLFLUSH can flush a line that was read from an execute-only segment.

  • The execution of CLFLUSH sets the Accessed bit but not the Dirty bit in the PTE or PDE.
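
The CLFLUSH points above (the CPUID request type 1 check, the line size reported in qwords, and ordering by MFENCE) can be sketched in C roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes a GCC or Clang toolchain that provides `<cpuid.h>` and `<emmintrin.h>`, and the helper names `clflush_supported`, `clflush_line_bytes`, and `flush_range` are hypothetical names introduced for illustration. On other targets the flush helper simply does nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC/Clang on an SSE2-capable x86 target. */
#if defined(__SSE2__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>      /* __get_cpuid */
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
#define SKETCH_HAVE_CLFLUSH 1
#else
#define SKETCH_HAVE_CLFLUSH 0
#endif

/* CPUID request type 1: the instruction is supported if EDX[19] = 1. */
static int clflush_supported(uint32_t edx) {
    return (int)((edx >> 19) & 1u);
}

/* CPUID request type 1: EBX[15:8] is the line size in qwords (8 bytes each). */
static size_t clflush_line_bytes(uint32_t ebx) {
    return (size_t)((ebx >> 8) & 0xFFu) * 8u;
}

/* Hypothetical helper: flush every cache line that overlaps [p, p + len).
   Returns the number of lines flushed, or 0 if nothing was done. */
static size_t flush_range(const void *p, size_t len) {
#if SKETCH_HAVE_CLFLUSH
    unsigned int eax, ebx, ecx, edx;
    if (len == 0 || !__get_cpuid(1, &eax, &ebx, &ecx, &edx) ||
        !clflush_supported(edx))
        return 0;

    size_t line = clflush_line_bytes(ebx);
    assert(line != 0);  /* assumed nonzero whenever EDX[19] is set */

    uintptr_t addr = (uintptr_t)p;
    uintptr_t end = addr + len;
    size_t count = 0;

    _mm_mfence();  /* order earlier loads and stores ahead of the flushes */
    for (uintptr_t a = addr - (addr % line); a < end; a += line) {
        _mm_clflush((const void *)a);
        count++;
    }
    _mm_mfence();  /* CLFLUSH is only ordered by MFENCE */
    return count;
#else
    (void)p;
    (void)len;
    return 0;
#endif
}
```

A caller would pass any readable buffer, for example `flush_range(buf, sizeof buf)`. Bracketing the loop with MFENCE is a conservative choice made for this sketch; whether both fences are needed depends on what the surrounding code requires.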

INVD    Invalidate Caches (introduced in the 486):
  • When executed, INVD invalidates all lines in all processor caches. Modified lines are invalidated without being written back to system memory, so their contents are lost.

  • The processor also performs the Special transaction on its FSB and outputs the Flush message on its Byte Enable outputs. This instructs an external cache (if there is one) to invalidate all of its lines as well.

  • It has no effect on the caches within other physical processors.

  • It can only be executed by code running at privilege level 0.

  • The way future processors handle this instruction will be processor design-specific.

MASKMOVDQU    Masked Move of a Double Qword (16 bytes or 128 bits) to memory (introduced in the Pentium® 4). A full description of this instruction can be found in “The MASKMOVDQU Instruction” on page 1330. The following points address the effect on the caches:
  • When executing this instruction, the memory targeted is treated as WC memory (even if it's designated as WB, WT, or WP by the MTRRs and the PTE or PDE).

  • The only exception is if the memory region is designated as UC memory. In that case, the write is treated as a write to uncacheable memory rather than to WC memory.

  • The bytes to be written to memory are posted in one of the processor's WCBs and will be written to memory at a later time.

  • Using the memory address of the move, a lookup is also performed in the processor's caches.

  • If there is a hit in any of the caches, the line is evicted from the cache. The Intel® documentation does not specify what action is taken if the line is in the L2 or L3 Cache in the modified state. It is the author's opinion that it is written back to memory.

  • It is possible that, if the line being written to is present in a processor cache, a processor implementation might update the bytes within the cache line. To ensure that this doesn't happen, Intel® recommends that the memory area being written to be defined as WC memory (in the MTRRs and the PTE or PDE). If the memory were defined as WB or WT memory, the processor could speculatively read the line into the cache, and the MASKMOVDQU would then update the bytes in the cache line (and not in a WCB).

MASKMOVQ    Masked Move of a Qword to memory (introduced in the Pentium® III). A full description of this instruction can be found in “The MASKMOVQ Instruction” on page 782. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
MOVNTDQ    Store Double Qword to memory (introduced in the Pentium® 4). A full description of this instruction can be found in “The MOVNTDQ Instruction” on page 1327. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
MOVNTI    Store Dword to memory (introduced in the Pentium® 4). A full description of this instruction can be found in “The MOVNTI Instruction” on page 1329. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
MOVNTPD    Store Packed DP FP values to memory (introduced in the Pentium® 4). A full description of this instruction can be found in “The MOVNTPD Instruction” on page 1328. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
MOVNTPS    Store Packed SP FP values to memory (introduced in the Pentium® III). A full description of this instruction can be found in “The MOVNTPS Instruction” on page 780. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
MOVNTQ    Store Qword to memory (introduced in the Pentium® III). A full description of this instruction can be found in “The MOVNTQ Instruction” on page 781. See the description of the MASKMOVDQU instruction in this table for a description of this instruction's effect on the caches.
PREFETCHh    The Line Prefetch instruction (first introduced in the Pentium® III processor). A complete description of this instruction can be found in “Overlapping Data Prefetch with Program Execution” on page 773.
WBINVD    Write Back and Invalidate caches (first introduced in the 486). A complete description of this instruction can be found in “Write Back and Invalidate (WBINVD)” on page 458.
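
As the MASKMOVDQU entry in the table notes, the non-temporal stores post their bytes in a WCB and write them to memory at a later time. The C sketch below illustrates how a program might issue MOVNTDQ and MASKMOVDQU through compiler intrinsics and then force the posted data out. It rests on assumptions that are not part of the table: an SSE2-capable toolchain that provides `<emmintrin.h>`, the use of SFENCE to drain the posted writes (common practice, not something this table states), and the hypothetical helper names `store16_nontemporal`, `store16_masked`, and `nontemporal_demo`. On non-SSE2 targets the helpers fall back to ordinary stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_maskmoveu_si128, _mm_sfence */
#endif

/* Hypothetical helper: write 16 bytes with MOVNTDQ.
   dst must be 16-byte aligned. */
static void store16_nontemporal(void *dst, const uint8_t src[16]) {
    assert(((uintptr_t)dst & 15u) == 0);
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_stream_si128((__m128i *)dst, v);  /* MOVNTDQ */
    _mm_sfence();  /* assumption: drain the posted write before continuing */
#else
    memcpy(dst, src, 16);  /* portable fallback: ordinary stores */
#endif
}

/* Hypothetical helper: MASKMOVDQU writes only the bytes whose mask byte
   has its most significant bit set. dst need not be aligned. */
static void store16_masked(uint8_t *dst, const uint8_t src[16],
                           const uint8_t mask[16]) {
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    _mm_maskmoveu_si128(v, m, (char *)dst);  /* MASKMOVDQU */
    _mm_sfence();
#else
    for (size_t i = 0; i < 16; i++)
        if (mask[i] & 0x80u)
            dst[i] = src[i];
#endif
}

/* Self-check: returns 1 if both helpers leave the expected bytes in memory. */
static int nontemporal_demo(void) {
    _Alignas(16) uint8_t dst[16];
    uint8_t src[16], mask[16];

    for (int i = 0; i < 16; i++) {
        src[i] = (uint8_t)(i + 1);
        mask[i] = (i % 2 == 0) ? 0x80u : 0x00u;  /* select even-indexed bytes */
    }

    store16_nontemporal(dst, src);
    if (memcmp(dst, src, 16) != 0)
        return 0;

    memset(dst, 0, 16);
    store16_masked(dst, src, mask);
    for (int i = 0; i < 16; i++) {
        uint8_t want = (i % 2 == 0) ? src[i] : 0;
        if (dst[i] != want)
            return 0;
    }
    return 1;
}
```

The buffers in this sketch live in ordinary WB memory, so it only illustrates the calling pattern. Following the recommendation in the table, a real MASKMOVDQU target would be mapped as WC memory, which is set up by the operating system rather than by user code.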
