A
Advanced Microcontroller Bus Architecture bus.
See AMBA bus
Application programmer interface,
131–132
Application programming interface,
369
barrel shift used with,
55
data processing instructions processed in,
51
digital signal processing vs.,
269
conditional execution of,
coprocessors attached to,
36–37
operating systems for,
14–15
read-allocate policy,
422
digital signal processing on,
270–272
read-allocate policy,
422
digital signal processing on,
275–277
Newton-Raphson division routines on,
217
digital signal processing on,
272–274
unsigned 64-bit by 64-bit multiply with 128-bit result,
210
pipeline length in,
31–32
read-allocate policy,
422
digital signal processing on,
277–278
ARM High Performance Bus,
ARM Procedure Call Standard,
122
ARM-Thumb interworking,
90–92
ARM-Thumb Procedure Call Standard
multiply instructions,
81–82
complex arithmetic support,
554–555
cryptographic multiplication extensions,
559
mixed-endianness support,
560
most significant word multiplies,
558–559
multiprocessing synchronization primitives,
560–562
packing instructions,
554
reverse instructions in,
561f
single instruction multiple data arithmetic operations,
550–554
sum of absolute differences instructions,
556–557
names allocated to variables,
172
B
Background regions, for memory protection units,
464–465
arithmetic instructions with,
55
arithmetic logic unit and,
51f
data processing instructions that do not use,
51
Base address register,
61
fixed-width bit-field packing and unpacking,
191–192
fixed-width bit-field packing and unpacking,
191–192
Block finite impulse response filters,
282–294
Block-floating algorithms,
149
Block-floating representation of digital signal,
263
Branch exchange with link,
60
C
with fixed number of iterations,
113–116
with variable number of iterations,
116–117
in Intel XScale SA-110 and Intel StrongARM cores,
435–438
efficiency measurements,
417
main memory and, relationship between,
410–412
replacement policy of,
419
by incrementing the way index,
445–449
allocation policy on a cache miss,
422
cache line replacement policies,
419–422
Command line interpreter,
369
Common object file format,
370
Common subexpression elimination,
127
Comparison instructions,
56–57
Complex instruction set computer.
See CISC
Conditional branch instruction,
92
Conditional instructions,
170
Content addressable memory,
414
page table activation,
497
replacement policy of,
419
instruction syntax,
77–78
memory management unit configuration and,
513–515
tightly coupled memory,
35,
36f
Cryptographic multiplication extensions,
559
Current program status register
conditional execution,
29,
29t
state instruction sets,
26–27
Cyclic redundancy check,
107
D
Data encryption standard permutation,
249t
Data processing instructions
arithmetic instructions,
53–55
comparison instructions,
56–57
logical instructions,
55–56
multiply instructions,
57–58
Thumb instruction set,
93–95
in Intel XScale SA-110 and Intel StrongARM cores,
435–438
Decimation-in-time radix-2 butterfly,
304
Digital signal processing
complex arithmetic support,
554–555
cryptographic multiplication extensions,
559
dual 16-bit multiply instructions,
557–558
most significant word multiplies,
558–559
packing instructions,
554
single instruction multiple data arithmetic operations,
550–554
sum of absolute differences instructions,
556–557
discrete Fourier transform
finite impulse response filters
fixed-point representation signals
operating on values stored in,
264
floating-point representation signal,
262,
268
infinite impulse response filters,
294–302
load-store intensive,
259
representation of digital signal
Digital signal processor,
Disable_lower_priority routine,
362
Discrete Fourier transform
definition of,
303
conversion into multiplies,
143–145
fixed-point representation signal,
267
initial estimate for,
231
integer normalization for,
212
Q15 fixed-point division by,
233–235
Q31 fixed-point division by,
235–237
unsigned 32/32-bit divide by,
225–230
repeated unsigned division with remainder,
142–143
unsigned 64/31-bit divide by,
222–223
unsigned 32-bit/15-bit divide by,
220–222
unsigned 32-bit/32-bit divide by,
218–220
fast context switch extension use of,
518–519
Double-precision integer multiplication
signed 64-bit by 64-bit multiply with 128-bit result,
211–212
unsigned 64-bit by 64-bit multiply with 128-bit result,
209–210
Dual 16-bit multiply instructions,
557–558
Dynamic random access memory.
See DRAM
F
Fast context switch extension
schematic diagram of,
517f
virtual addresses modified by,
516
Fast interrupt mode,
23,
26t
Fast interrupt request vector,
34
Finite impulse response filter
interactive functions,
369
Fixed-point algorithm,
149
Fixed-point representation of digital signal
operating on values stored in,
264
Fixed-width bit-field packing and unpacking,
191–192
Flash ROM filing system,
369
Floating point accelerator,
149
Floating-point representation of digital signal,
262,
268
Fractional value division, by Newton-Raphson iteration
initial estimate for,
231
Fully associative cache,
414
Function call overhead,
125
I
Immediate postindex,
63,
64t
Infinite impulse response filters,
294–302
Initialization code,
12–14
barrel shift used with,
55
arithmetic instructions,
53–55
comparison instructions,
56–57
logical instructions,
55–56
multiply instructions,
57–58
Thumb instruction set,
93–95
program status registers,
75–76
single-register load-store
Thumb instruction set,
96–97
sum of absolute differences,
556–557
Instruction cycle timings
branch instructions,
58–60
conditional execution,
82–84
data processing instructions
arithmetic instructions,
53–55
comparison instructions,
56–57
logical instructions,
55–56
multiply instructions,
57–58
single-register load-store addressing modes,
61–63
single-register transfer,
60–61
program status register instructions,
75–76
software interrupt instruction,
73–75
ARM-Thumb interworking,
90–92
branch instructions,
92–93
data processing instructions,
93–95
load and store offsets,
132t
multiple-register load-store instructions,
97–98
single-register load-store instructions,
96–97
software interrupt instruction,
99
stack instructions,
98–99
double-precision multiplication
signed 64-bit by 64-bit multiply with 128-bit result,
211–212
unsigned 64-bit by 64-bit multiply with 128-bit result,
209–210
digital signal processing on,
278–280
Interrupt controller registers,
349t
Interrupt controllers,
12
Interrupt handling schemes,
317
stack design and implementation,
329–333
Interrupt request vector,
33
Inverted logical relations,
183
L
L1 translation table base address,
503–504
Left shifts, saturation of,
253–254
Load instructions scheduling
Load-store architecture,
19–20
single-register load-store
Thumb instruction set,
96–97
single-register transfer,
60–61
Locality of reference,
407,
457
Lock bits, for cache lockdown,
450–453
Logarithmic representation of digital signal,
263
Logical instructions,
55–56
with fixed number of iterations,
113–116
with variable number of iterations,
116–117
M
Machine independent layer,
370
MAP (alias ),
630
dynamic random access.
See DRAM
fetching instructions for,
10t
cache and, relationship between,
410–412
static random access.
See SRAM
synchronous dynamic random access.
See SDRAM
fast context switch extension
schematic diagram of,
517f
virtual addresses modified by,
516
context switch activation of,
497
L1 translation table base address,
503–504
simple little operating system,
545
translation lookaside buffer
single-step page table walk,
507–508
memory map for assigning regions,
479–481
Mixed-endianness support,
560
Modified virtual address,
516
Most significant word multiplies,
558–559
Multiple-register transfer
Thumb instruction set,
97–98
signed 64-bit by 64-bit multiply with 128-bit result,
211–212
unsigned 64-bit by 64-bit multiply with 128-bit result,
209–210
repeated divisions converted into,
143–145
Multiply instructions,
57–58
Multiply-accumulate unit,
20
Multiprocessing synchronization primitives,
560–562
P
of variable-width bitstreams,
192–194
context switch activation of,
497
demonstration of, in virtual memory system
filling of, with translations,
531–538
initializing of, in memory,
529–531
fast context switch extension use of,
518–519
L1 translation table base address,
503–504
Page table control block,
527
Peripheral component interconnect bus,
interrupt controllers,
12
executing characteristics,
31–32
schematic diagram of,
30f
Platform operating systems,
14
Prefetch abort vector,
33
Preindex with writeback,
62
double-precision integer multiplication
description of,
208
signed 64-bit by 64-bit multiply with 128-bit result,
211–212
unsigned 64-bit by 64-bit multiply with 128-bit result,
209–210
multiprocessing synchronization,
560–562
Prioritized direct interrupt handler,
333,
356–359
Prioritized simple interrupt handler,
333,
346–352
Prioritized standard interrupt handler,
333,
352–356
Process control block,
385
schematic diagram of,
23f
Protected regions, for memory protection units
Pseudoinstructions,
78–79
Pseudorandom numbers,
255
Pseudorandom replacement,
419,
458
R
Radix-2 fast Fourier transform,
304–305
Radix-4 fast Fourier transform,
305–313
Random number generation,
255
Real-time operating systems,
14
schematic diagram of,
23f
maximizing the available registers,
177–180
allocation to register numbers,
171–175
more than 14 local variables,
175–177
Register postindex,
63,
64t
Repeated divisions converted into multiplications,
143–145
Repeated unsigned division with remainder,
142–143
Reverse subtract instruction,
54
Right shift, rounded,
254,
264
Round-robin algorithm,
383
Round-robin replacement,
419
S
hardware initialization,
375,
377
Saturated arithmetic,
80–81
32-bit addition and subtraction,
254
Saturation instructions,
81t
Scaled register postindex,
63
Scheduling of instructions
Signed 64-bit by 64-bit multiply with 128-bit result,
211–212
Signed division by a constant,
147–149
Simple little operating system
memory management unit,
545
memory protection units,
487
Single instruction multiple data arithmetic operations,
550–554
Single issue multiple data processing,
178
Single-register load-store instructions
Thumb instruction set,
96–97
Single-register transfer,
60–61
SMLAL multiply instruction,
57–58
Software interrupt exception,
321
Software interrupt instruction
Software interrupt vector,
33
fixed-point representation signal,
267–268
by Newton-Raphson iteration,
240–250
Static random access memory.
See SRAM
digital signal processing on,
274–275
StrongARM1 instruction cycle timings,
655–656
Sum of absolute differences instructions,
556–557
Supervisor mode stack,
332
Swapped out variables,
120
System control coprocessor,
77
System-on-chip architecture,
560
T
TEQ comparison instruction,
56,
618
Test-clean command, for D-cache cleaning,
428t,
434–435
32-bit interrupt controller register,
350f
32-bit/32-bit divide, unsigned
32-bit/15-bit divide by trial subtraction,
220–222
ARM-Thumb interworking,
90–92
branch instructions,
92–93
data processing instructions,
93–95
load and store offsets,
132t
multiple-register load-store instructions,
97–98
single-register load-store instructions,
96–97
software interrupt instruction,
99
stack instructions,
98–99
Trailing zeros, counting of,
215–216
Translation lookaside buffer
single-step page table walk,
507–508
Trial subtraction, division by
unsigned 64/31-bit divide by,
222–223
unsigned 32-bit/15-bit divide by,
220–222
unsigned 32-bit/32-bit divide by,
218–220
U
Undefined instruction vector,
33
Unique identification number,
398
Unknown_condition routine,
362
load instructions scheduling by,
169–171
Unsigned 64-bit by 64-bit multiply with 128-bit result,
209–210
Unsigned 64/31-bit divide, by trial subtraction,
222–223
Unsigned 32-bit/32-bit divide
Unsigned 32-bit/15-bit divide, by trial subtraction,
220–222
V
Variable-width bitstream packing,
192–194
Variable-width bitstream unpacking,
195–197
Vector floating point accelerator,
149
Vector floating-point,
37
Vector interrupt controller,
12
Vector interrupt controller PL190 based interrupt service routine,
333,
363–364
VIC PL190 based interrupt service routine,
333,
363–364
context switch procedure,
544
fixed system software regions,
521–522
memory management unit initialization
assigning of domain access,
541–542
page tables filled with translations,
531–538
page tables initialized in memory,
529–531
filling of, with translations,
531–538
initializing of, in memory,
529–531
regions in physical memory,
522–525