Part 3: Optimization Techniques
by Jim Kukunas
Power and Performance
Cover image
Title page
Table of Contents
Copyright
Dedication
Introduction
Performance Apologetic
A Word on Premature Optimization
The Roadmap
Part 1: Background Knowledge
Chapter 1: Early Intel® Architecture
Abstract
1.1 Intel® 8086
1.2 Intel® 8087
1.3 Intel® 80286 and 80287
1.4 Intel® 80386 and 80387
Chapter 2: Intel® Pentium® Processors
Abstract
2.1 Intel® Pentium®
2.2 Intel® Pentium® Pro
2.3 Intel® Pentium® 4
Chapter 3: Intel® Core™ Processors
Abstract
3.1 Intel® Pentium® M
3.2 Second Generation Intel® Core™ Processor Family
Chapter 4: Performance Workflow
Abstract
4.1 Step 0: Defining the Problem
4.2 Step 1: Determine the Source of the Problem
4.3 Step 2: Determine Whether the Bottleneck Can Be Avoided
4.4 Step 3: Design a Reproducible Experiment
4.5 Step 4: Check Upstream
4.6 Step 5: Algorithmic Improvement
4.7 Step 6: Architectural Tuning
4.8 Step 7: Testing
4.9 Step 8: Performance Regression Testing
Chapter 5: Designing Experiments
Abstract
5.1 Choosing a Metric
5.2 Dealing with External Variables
5.3 Timing
5.4 Phoronix Test Suite
Part 2: Monitors
Chapter 6: Introduction to Profiling
Abstract
6.1 PMU
6.2 Top-Down Hierarchical Analysis
Chapter 7: Intel® VTune™ Amplifier XE
Abstract
7.1 Installation and Configuration
7.2 Data Collection and Reporting
Chapter 8: Perf
Abstract
8.1 Event Infrastructure
8.2 Perf Tool
Chapter 9: Ftrace
Abstract
9.1 DebugFS
9.2 Kernel Shark
Chapter 10: GPU Profiling Tools
Abstract
10.1 Traditional Graphics Stack
10.2 buGLe
10.3 Apitrace
Chapter 11: Other Helpful Tools
Abstract
11.1 GNU Profiler
11.2 Gcov
11.3 PowerTOP
11.4 LatencyTOP
11.5 Sysprof
Part 3: Optimization Techniques
Chapter 12: Toolchain Primer
Abstract
12.1 Compiler Flags
12.2 ELF and the x86/x86_64 ABIs
12.3 CPU Dispatch
12.4 Coding Style
12.5 x86 Unleashed
Chapter 13: Branching
Abstract
13.1 Avoiding Branches
13.2 Improving Prediction
Chapter 14: Optimizing Cache Usage
Abstract
14.1 Processor Cache Organization
14.2 Querying Cache Topology
14.3 Prefetch
14.4 Improving Locality
Chapter 15: Exploiting Parallelism
Abstract
15.1 SIMD
Chapter 16: Special Instructions
Abstract
16.1 Intel® Advanced Encryption Standard New Instructions (AES-NI)
16.2 PCLMUL-Packed Carry-Less Multiplication
16.3 CRC32
16.4 SSE4.2 String Functions
Part 3
Optimization Techniques