by Bob Colwell, Tom Shanley
The Unabridged Pentium 4 IA32 Processor Genealogy
Copyright
At-a-Glance Table of Contents
Figures
Tables
Acknowledgments
About This Book
Introduction
Overview of the Processor Role
The IA32 Specification
IA32 Processors
IA32 Instructions vs. μops
Processor = Instruction Fetch/Decode/Execute Engine
Some Instructions Result in FSB Transactions
The Processor's Role in Today's Systems
System Overview
Single-/MultiTask OS Background
Single-Task OS and Application
Operating System Overview
Direct IO Access
Application Program Memory Usage
Task Initiation, Execution and Termination
Definition of Multitasking
Concept
An Example—Timeslicing
Another Example—Awaiting an Event
Multitasking Problems
OS Protects Territorial Integrity
Stay in Your Own Memory Area
IO Port Anarchy
Unauthorized Use of OS's Tools
No Interrupts, Please!
BIOS Calls
The 386
386 Real Mode Operation
Special Note
An Overview of the 386 Internal Architecture
An Overview of the 386DX FSB
The 386 Register Set
386 Power-Up State
Initial Memory Reads
IO Port Addressing
Memory Addressing
Real Mode Instructions and Registers
Real Mode Interrupt/Exception Handling
Protection in Real Mode
Protected Mode Introduction
General
Memory Protection
IO Protection
Privilege Levels
Virtual 8086 Mode
Task Switching
Interrupt Handling
Intro to Segmentation in Protected Mode
Special Note
Real Mode Limitations
Segment Descriptor Describes a Memory Area in Detail
Segment Register—Selects Descriptor Table and Entry
Introduction to the Descriptor Tables
General Segment Descriptor Format
Code Segments
Selecting the Code Segment to Execute
Code Segment Descriptor Format
Accessing the Code Segment
Privilege Checking
Calling a Procedure in the Current Task
Call Gate
Data and Stack Segments
A Note Regarding Stack Segments
The Data Segments
Selecting and Accessing a Stack Segment
Creating a Task
What Is a Task?
Basics of Task Creation and Startup
TSS Structure
TSS Descriptor
How the OS Starts a Task
What Happens When a Task Starts
Use of the LTR and STR Instructions
Mechanics of a Task Switch
Events that Initiate a Task Switch
Switch Via a TSS Descriptor
Task Gate Descriptor
Task Switch Details
Linked Tasks
Linkage Modification
The Busy Bit
Address Mapping
386 Demand Mode Paging
Problem—Loading Entire Task into Memory is Wasteful
Solution—Load Part and Keep Remainder on Disk
Problem—Running Two (or more) DOS Programs
Solution—Redirect Memory Accesses to Separate Memory Areas
Global Solution—Map Linear Address to Disk Address or to a Different Physical Memory Address
The Paging Unit Is the Translator
Three Possible Page Lookup Methods
IA32 Page Lookup Method
Enabling Paging
Page Directory and Page Tables
Finding the Location of a Physical Page
Eliminating the Directory Lookup
Checking Page Access Permission
Page Faults
Usage of the Dirty and Accessed Bits
Demand Mode Paging Evolution
The Flat Model
Segments Complicate Things
Paging Can Do It All
Eliminating Segmentation
The Privilege Check
The Read/Write Check
Each Task (including the OS) Has Its Own TSS
Interrupts and Exceptions
Special Note
General
Hardware Interrupts
Software-Generated Exceptions
Interrupt/Exception Priority
Real Mode Interrupt/Exception Handling
Protected Mode Interrupt/Exception Handling
Interrupt/Exception Handling in VM86 Mode
Exception Error Codes
The Resume Flag Prevents Multiple Debug Exceptions
Special Case—Interrupts Disabled While Updating SS:ESP
Detailed Description of the Software Exceptions
Virtual 8086 Mode
A Special Note
DOS Application—Portrait of an Anarchist
Solution—Set a Watchdog on the DOS Application
The Virtual Machine Monitor (VMM)
Entering or Reentering VM86 Mode
An Interrupt or Exception Causes an Exit From VM86 Mode
A Task Switch Causes an EFlags Update
DOS Task's Memory Usage
The Privilege Level of a VM86 Task
Restricting IO Accesses
IOPL-Sensitive Instructions
Interrupt/Exception Generation and Handling
Registers Accessible in Real/VM86 Mode
Instructions Usable in Real/VM86 Mode
VM86 Mode Evolution
The Debug Registers
The Debug Registers
486
Caching Overview
Definition of a Load and a Store
The Cache's Purpose
The Write-Through Cache
The Write Back Cache
Snooping
The Overall Cache Architecture
Cache Real Estate Management
A Unified Cache
Split Caches
Non-Blocking Caches
486 Hardware Overview
486 Flavors
An Overview of the 486 Internal Architecture
An Overview of the 486 FSB
A20 Mask
On-Chip Cache Added
486 Software Enhancements
FPU Added On-Die
Alignment Checking Feature
Paging-Related Changes
Caching-Related Changes to the Programming Environment
CR4 Was Added in the Later Models of the 486
Test Registers Added
Instruction Set Changes
New/Altered Exceptions
System Management Mode (SMM)
Pentium®
Pentium® Hardware Overview
Pentium® Flavors
An Overview of the Pentium® Internal Architecture
An Overview of the Pentium® FSB
The Caches
Local APIC Added in the P54C
Test Access Port (TAP)
FRC Mode
Soft Reset (INIT#)
Pentium® Software Enhancements
VM86 Extensions
Protected Mode Virtual Interrupts
Debug Extension
Time Stamp Counter
4MB Pages
Machine Check Architecture (MCA)
Performance Monitoring
Local APIC Register Set
Test Registers Relocated
MSRs Added
Instruction Set Changes
New/Altered Exceptions
Intro to the P6 Core and FSB
P6 Road Map
The P6 Processor Family
The Klamath Core
The Deschutes Core
The Katmai Core
P6 Hardware Overview
For More Detail
Introduction
The P6 Processor Core
The FSB Interface Unit
The Backside Bus (BSB) Interface Unit
The Unified L2 Cache
The L1 Data Cache
The L1 Code Cache
The Processor Core
The Local APIC Unit
Pentium® Pro Software Enhancements
Pentium® Pro Software Enhancements
Paging Enhancements
APIC Enhancements
MMX Not Implemented
SMM Enhancement
MTRRs Added
MCA Enhanced
The Performance Counters
MSRs Added
Instruction Set Changes
New/Altered Exceptions
MicroCode Update Feature
The Problem
The Solution
The Microcode Update Image
Matching the Image to a Processor
The Microcode Update Loader
Updates in a Multiprocessor System
The Image Management BIOS
When Must the Image Upload Take Place?
Determining if a New Update Supersedes a Previously-Loaded Update
Effect of RESET# Or INIT# on a Previously-Loaded Update
Pentium® II
Pentium® II Hardware Overview
The Pentium® Pro and Pentium® II: Same CPU, Different Package
Dual-Independent Bus Architecture (DIBA)
IOQ Depth
Pentium® Pro/Pentium® II Differences
One Product Yields Three Product Lines
The Pentium® II/Xeon/Celeron Roadmap
The Cartridge
The Core
The FSB and BSB
The Introduction of the Celeron
Miscellaneous Hardware Stuff
Pentium® II Power Management Features
The Pentium® Pro's Power Conservation Modes
The Pentium® II's Power Conservation Modes
The Normal State
The AutoHalt Power Down State
The Stop Grant State
The Halt/Grant Snoop State
The Sleep State
The Deep Sleep State
Pentium® II Software Enhancements
The Pentium® II and Pentium® III MSRs
Instruction Set Changes
New/Altered Exceptions
Pentium® II Xeon Features
Introduction
To Avoid Confusion...
Basic Characteristics
Hardware Characteristics
PSE-36 Mode
Pentium® III
Pentium® III Hardware Overview
One Product = Three Product Lines
Pentium® II/Pentium® III Differences
The Pentium® III/Xeon/Celeron Roadmap
IOQ Depth
The L1 Caches
The L2 Cache
The Data Prefetcher
SSE Introduced
The WCBs Were Enhanced
Additional Writeback Buffers
SpeedStep Technology
Pentium® III Software Enhancements
The Streaming SIMD Extensions (SSE)
CPUID Enhanced
Pentium® III Xeon Features
Basic Characteristics
PAT Feature (Page Attribute Table)
Pentium® 4
Pentium® 4 Road Map
The Roadmap
Pentium® 4 System Overview
General
The Graphics Adapter
Device Adapters
Snooping
Definition of a Cluster
Definition of the Boot Strap Processor
Starting up the Application Processors (the APs)
Pentium® 4 Processor Overview
The Pentium® 4 Processor Family
Pentium® III/Pentium® 4 Differences
Pentium® 4/Pentium® 4 Prescott Differences
Pentium® 4 Processor Basic Organization
The FSB is Tuned for Multiprocessing
Intro to the FSB Enhancements
IA Instructions Vary in Length and Are Complex
The Trace Cache
There Are Two Pipeline Sections
The μop Pipeline
The IA32 Data Register Set Was Small
Speculative Execution
Pentium® 4 PowerOn Configuration
Configuration on Trailing-Edge of Reset
Setup and Hold Time Requirements
Built-In Self-Test (BIST) Trigger
Assignment of IDs to the Processor
Error Observation Options
In-Order Queue Depth Selection
Power-On Restart Address
Tri-State Mode
Processor Core Speed Selection
Bus Parking Option
Hyper-Threading Option
Program-Accessible Startup Features
Pentium® 4 Processor Startup
Introduction
The Processor's State After Reset
EAX, EDX Content After Reset Removal
The Core Is Starving and Caching is Disabled
Boot Strap Processor (BSP) Selection
How the APs are Discovered and Configured
Pentium® 4 Core Description
One μop Doesn't Necessarily = One IA32 Instruction
Upstream vs. Downstream
Introduction
The Big Picture
The Front-End Pipeline Stages
Intro to the μop Pipeline
The μop Pipeline's Major Elements
Additional, Core-Specific Terms
Hyper-Threading
General
Background
The HT Approach
Overview of HT Resource Usage
HT and the Data TLB
HT and the FSB
The IOQ Depth Was Increased
HT Performance Issues
HT and Serializing Instructions
HT and the Microcode Update Feature
HT Cache-Related Issues
HT and the TLBs
HT and the Thermal Monitor Feature
HT and External Pin Usage
The Pentium® 4 Caches
A Cache Primer
The L0 Cache
Upstream vs. Downstream
Overview
Determining the Processor's Cache Sizes and Structures
Enabling/Disabling the Caches
The L1 Data Cache
The L2 ATC
The L3 Cache
FSB Transactions and the Caches
The Cache Management Instructions
Pentium® 4 Handling of Loads and Stores
The Memory Type Defines Load/Store Characteristics
Load μops
Store-to-Load Forwarding
Store μops
The MFENCE Instruction
Non-Temporal Stores
The Pentium® 4 Prescott
Introduction
Increased Pipeline Depth
Trace Cache Improvements
Increased Number of WCBs
L1 Data Cache Changes
Increased L2 Cache Size
Enhanced Branch Prediction
Store Forwarding Improved
SSE3 Instruction Set
Increased Elimination of Dependencies
Enhanced Shifter/Rotator
Integer Multiply Enhanced
Scheduler Enhancements
Fixed the MXCSR Serialization Problem
Data Prefetch Instruction Execution Enhanced
Improved the Hardware Data Prefetcher
Hyper-Threading Improved
Pentium® 4 FSB Electrical Characteristics
Introduction
The Bus and Processor Clocks
The Address and Data Strobes
The Voltage ID
Everything's Relative
Signals that Can Be Driven by Multiple FSB Agents
Minimum One BCLK Response Time
Intro to the Pentium® 4 FSB
Enhanced Mode Scaleable Bus
FSB Agents
Uniprocessor vs. Multiprocessor Bus
The Request Agent
The Transaction Phases
Transaction Pipelining
Transaction Tracking
Pentium® 4 CPU Arbitration
The Request Phase
Logical versus Physical Processors
The Discussion Assumes a Quad Xeon MP System
Symmetric Agent Arbitration—Democracy at Work
Pentium® 4 Priority Agent Arbitration
Priority Agent Arbitration
Pentium® 4 Locked Transaction Series
Introduction
The Shared Resource Concept
Testing the Availability of and Gaining Ownership of Shared Resources
A Race Condition Can Present a Problem
Guaranteeing the Atomicity of a Read/Modify/Write
Locking a Cache Line
Pentium® 4 FSB Blocking
Blocking New Requests—Stop! I'm Full!
Assert BNR# When One Entry Remains
BNR# Can Be Used by a Debug Tool
Who Monitors BNR#?
BNR# is a Shared Signal
The Stalled/Throttled/Free Indicator
BNR# Behavior at Powerup
BNR# Behavior During Runtime
Pentium® 4 FSB Request Phase
Cautionary Note
Introduction to the Request Phase
The Source Synchronous Strobes
The Request Phase Parity
Request Phase Parity Checking
The Request Phase Signal Group is Multiplexed
Introduction to the Transaction Types
The Contents of Request Packet A
The Contents of Request Packet B
Pentium® 4 FSB Snoop Phase
Agents Involved in the Snoop Phase
The Snoop Phase Has Two Purposes
The Snoop Result Signals are Shared, DEFER# Isn't
The Snoop Phase Duration Is Variable
There Is No Snoop Stall Duration Limit
Memory Transaction Snooping
Non-Memory Transactions Have a Snoop Phase
Pentium® 4 FSB Response and Data Phases
A Note on Deferred Transactions
The Purpose of the Response Phase
The Response Phase Signal Group
The Response Phase Start Point
The Response Phase End Point
The Response Types
The Response Phase May Complete a Transaction
The Data Phase Signal Group
Five Example Scenarios
Data Phase Wait States
The Response Phase Parity
Data Bus Parity
Pentium® 4 FSB Transaction Deferral
Example System Models
Example Multi-Cluster Model
The Problem
Possible Solutions
Example Read From a PCI Express Device
Example Write To a PCI Express Device
Pentium® 4 Support for Transaction Deferral
Pentium® 4 FSB IO Transactions
Introduction
The IO Address Range
The Data Transfer Length
Pentium® 4 FSB Central Agent Transactions
Point-to-Point vs. Broadcast
The Interrupt Acknowledge Transaction
The Special Transaction
The BTM Transaction Is Used for Program Debug
Pentium® 4 FSB Miscellaneous Signals
The Signals
Pentium® 4 Software Enhancements
The Foundation
Miscellaneous New Instructions
Enhanced CPUID Instruction
The SSE2 Instruction Set
The SSE3 Instruction Set
Local APIC Enhancements
The Thermal Monitoring Facilities
FPU Enhancement
The MSRs
The Machine Check Architecture
Last Branch, Interrupt, and Exception Recording
The Debug Store (DS) Mechanism
New Exceptions
The Performance Monitoring Facility
Pentium® 4 Xeon Features
General
The Pentium® 4 Xeon DP
The Pentium® 4 Xeon MP
Pentium® M
Pentium® M Processor
Background
The Pentium® M and Centrino
Characteristics Overview
The FSB Characteristics
Enhanced Power Management Characteristics
Three Different Packaging Models
Improved Thermal Monitor Mode
Enhanced Branch Prediction
μop Fusion
Advanced Stack Management
Miscellaneous
The Data Cache and Hyper-Threading
The Next Pentium® M
Additional Topics
CPU Identification
Prior to the Advent of the CPUID Instruction
Determining if the CPUID Instruction Is Supported
General
Determining the Request Types Supported
The Basic Request Types
The Extended Request Types
Enhanced Processor Signature
System Management Mode (SMM)
What Falls Under the Heading of System Management?
The Genesis of SMM
SMM Has Its Own Private Memory Space
The Basic Elements of SMM
A Very Simple Example Scenario
How the Processor Knows the SM Memory Start Address
Protected Mode, Paging and PAE-36 Mode Are Disabled
The Organization of SM RAM
Entering SMM
Exiting SMM
Caching from SM Memory
Setting Up the SMI Handler in SM Memory
Relocating the SM RAM Base Address
SMM in an MP System
The Local and IO APICs
Before the Advent of the APIC
MP Systems Need a Better Interrupt Distribution Mechanism
A Short History of the APIC
Detecting the Presence and Version of the Local APIC
Enabling/Disabling the Local APIC
Local Cluster and APIC ID Assignment
An Introduction to the Interrupt Sources
Introduction to Interrupt Priority
An Intro to Edge-Triggered Interrupts
An Intro to Level-Sensitive Interrupts
The Local APIC Register Set
Locally Generated Interrupts
Task and Processor Priority
Interrupt Messages
The IO APIC
Message Signaled Interrupts (MSI)
Message Format
The Spurious Interrupt Vector
The Agents in an Interrupt Message Transaction
BSP Selection Process
The APIC, the MPS and ACPI
Acronyms
CD-ROM Warranty
Index