Index

Page numbers with “f” denote figures; “t” tables.

ABI, 221–226

Acceptable timeliness, 19–20

ACTIVE, variable

runtime behavior of program, 68f

ACTIVATE commands, 265–266

ACTIVATE/PRECHARGE command, 266

Adaptive differential pulse code modulation (ADPCM), 525

Adaptive Multi-Rate (AMR) codec, 423

ADC, see Analog-to-digital conversion (ADC)

ADI Blackfin DSP, 170

ADPCM, see Adaptive differential pulse code modulation (ADPCM)

Advanced Mobile Phone Systems (AMPS), 423

Alamouti scheme, 99

beamforming system, 99f

Algorithm complexity, 9

Aliasing, 7

Amdahl’s law, 232

A-Mode, 501, 501f

Analog signal processing (ASP), 3

vs. digital signals, 5

Analog systems, 3

Analog-to-digital conversion (ADC), 3–6, 8

for signal processing, 5f

Analog-to-digital converter, 7

data plotted over time, 8f

Antenna systems, multiple input multiple output (MIMO), 87

Antennas, 83, 602–603

API, multithreading disable, 328t

Apodization coefficients, 506

Application specific integrated circuits (ASICs), 31, 103, 107–108

Arithmetic processing unit (APU), 38

ARM instruction set, 68

ASCII numbers, 1

ASP, see Analog signal processing (ASP)

Assembly, caller procedure, 223f, 225f

Assembly language, 169–170

advantages/disadvantages, 170

DSP kernels, 169

Asymmetric multiprocessing (AMP), 300, 361

AMP style sharing, 301f

ATCA-9100 from Radisys, 529f

Audio/speech signal, 2

Auto-vectorizing compiler technology, 177–179

Matlab, Labview and FFTW-like generator suites, 178

Matlab and native compiled code, 178–179

silicon emulation, 179

Axial direction, 495

Azimuthal direction, 495

Azimuthal resolution, 496, 499

Barrier_Wait command, 305

Beamforming, 505–507, 512f

Beamforming MIMO-OFDM system

baseband representation of, 90f

BER plots comparing, 89f

Bilinear Transform technique, 124

Blackman window, 122

Block processing model, 46

of DSP, 47f

Blood velocity, 503

B-Mode, 501, 502f

B-Mode image, 498, 516f

Boot process, 536

C, caller procedure, 222f

C language, 336

code, 234

C programming language

custom types, 174

finite impulse response (FIR) filter, 171

floating point, 173

fractional types and saturation, 172–173

function pragmas, 174

intrinsic functions, 172–174

with intrinsics and pragmas, 170–175

pragmas, 174–175

standard C integral types, 171

statement pragmas, 174–175

variable pragmas, 175

Cache line, set associativity by, 273f

Callee routines, 223–224

Caller assembly code

user_calling_convention, 226, 227f, 229f

Calling conventions, 191t

configuration of, 175f

generated code for function, 193f

invoking, 175f

Carrier frequency offset (CFO), 603

Catapult C, 145–149

HLS design process, 136

HLS design tool flow, 138f

Synthesis RTL design flow, 136f

Catch-all algorithms, 120

CDMA, see Code division multiple access (CDMA)

Cdma2000-1xEVDO systems, 423

Central Office, 524f

CFO, see Carrier frequency offset (CFO)

Channel matrix coefficients, 77–78

ChConfig, 543

Chip-level arbitration and switching system (CLASS), 278, 363

C-like programming language, 71

CMP.EQ instruction, 68–69

Code division multiple access (CDMA), 75–76, 423

Code optimization, 182, 579–580

additional optimization configurations, 185

analyzing compiled code, 185

basic C optimization techniques

data types, 188

basic compiler configuration, 183–184

endianness, 183

memory model, 184

target architecture, 183

cache accesses, 199

compiler optimization, 183

development tools, using, 183–185

DSP architecture, background, 186–187

resources, 186–187

enabling optimizations, 184–185

inline small functions, 199–200

intrinsics to leverage DSP features, 189–191

calling conventions, 190–191

functions, 190–191

loop transformations, 200

loop unrolling, 200–201

loops, 196–197

count information, communicating, 196–197

hardware, 197–198

memory contention, 199

multisampling, 201–202

pointers/memory access, 194–196

ensuring alignment, 194

restrict/pointer aliasing, 196

unaligned accesses, use of, 199

using profiler, 185

vendor DSP libraries, 200

CodeWarrior Development Studio, 374

CodeWarrior IDE, 381, 382f

CodeWarrior plug-ins, 381

Codewords (CW), 602–603

Color Doppler, 502, 502f

Common channel signalling system 7 (CCSS7), 526

Communication buses, 58f

Compiler loop optimization, loop unrolling, 230

Component off the Shelf (COTS), 533–534

Computer tomography (CT), 493

Control traffic, 540–541

CPE, see Customer premises equipment (CPE)

CPU, 580t

CPU load, 51

CPU speed, 48

Customer premises equipment (CPE), 523

CW, see Codewords (CW)

Cycle accurate simulators (CAS), 166

Cyclic prefix (CP) padding, 602–603

Data ALU (DALU), 187

Data buses, 129f

Data dependence analysis, 235

DDR controller, to memory connection, 264f

DDR memory, 64-bit, 266f

logical bank interleaving, 266f, 267

Delay compensation mechanism, 548–551

Delay phase

receive, 506f

transmit, 506f

Design process of DSPs

algorithm development and validation, 339–340

block diagram of general system design flow, 338f

challenges for, 343–344

concept and specification phase, 338–340

data visualization, 347

debugging, 347–348

development tool flow, 348

DMA function, 350

factory and field test, 343

generic data flow example, 348–354

graphical user interface (GUI), 345

high level system design and performance engineering, 341–342

Integrated Development Environment (IDE), 345–348, 346f

modeling tools, 344

real-time, 345

real-time analysis of system, 347–348

software development, 342

software performance engineering (SPE), 341–342

specification process, 339

standards and guidelines for algorithm, 340

system build, integration, and test, 342–343

system configuration tools, 348

system-level, 344

toolboxes, 344–345

Destructive interferences, 496

DHCP boot, 537f

Digital data, type of, 1

Digital loop carrier (DLC), 524

Digital signal processing (DSP) system, 3–4, 4f

advantages of, 2–3

changeability, 3

expandability, 3

reliability, 3

repeatability, 3

size, weight, and power, 3

algorithms, 13

analog signal processing (ASP), 3

analog-to-digital conversion (ADC), 3

applications for, 10–11

high performance, 13–14

low cost, 10–11

power efficient, 11–14

computer, 4

definition of, 1

digital, 1

digital-to-analog conversion (DAC), 4, 9

GSM voice codec, 280

measuring power consumption, 246–249

motor control systems, 10–11

Nyquist criteria, 6–9

output, 4

processing, 2

processor, 3–4

refrigeration compressors, 11

sampling errors, 5

sampling frequency, 5

signal, 1–2

signal source, 3

Digital signal processors (DSP), 337, 505, 571–572

algorithms, 336

application algorithms, 337

architectural features of, 45f

challenges in application development, 337

code build tools, 354–358

communication mechanism, 336

design process, 337–343

development environments, 336

early, 336

evaluation module, 358

generic data flow example, 348–354

host development tools, 345–348

phases of development, 336, 336f

software development using, 335–336

starter kit, 358

Digital-to-analog conversion (DAC), 9, 10f, 116

Direct memory access (DMA), 276

three-dimensional, 277f

Discrete DDR3 memory chip’s rows/columns

basic drawing of, 263f

Discrete Fourier transform (DFT), 119

DLC, see Digital loop carrier (DLC)

Doppler angle, 504

Doppler effects, 501–504

A-Mode, 501, 501f

B-Mode, 501, 502f

color Doppler, 502, 502f

M-Mode, 501, 501f

power Doppler, 502, 502f

spectral Doppler, 502–503, 503f

DSP, see Digital signal processors (DSP)

DSP acceleration decisions

computational complexity, 41

data locality, 41–43

signal processing algorithm parallelism, 41

DSP algorithms

aliasing, 116

applications of, 113–114

basic system, 116–119

block filtering, 128

circular buffers, 130

convolution, 119

correlation, 120

filtering, 118–119

FIR filter, 118–119

FIR filter, design, 120–121

Parks-McClellan algorithm, 120–121

frequency analysis, 119–124

IIR filter, 119

implementation, 124–126

FIR filter, 128

number format, 125

overflow and saturation, 126

MAC instruction, 128

on-chip RAM, 127–128

program/data buses, 128–129

system issues, 130

systems and signals, 114–116

windowing, 120–121

zero overhead looping, 129–130

DSP applications

profiling and determining hot spots, 57f

DSP architectures, 46, 124–125

DSP code optimization, 56f

DSP core

32-bit multiplication, 189

example intrinsic, 188

high-level architectural comparison of, 186

DSP Daughter Card, 529f

DSP design tool, 140

DSP development process, 55f, 59–61

DSP IDE, main components of, 52f

DSP kernel, 161

DSP operating systems, 292

connected to host, 310f

connected to network, 309f

memory management

barrier, 305f

memory allocation, 305–306

virtual memory and memory protection, 306

multicore considerations, 298–305

peripherals sharing, 302–305

synchronization primitives, 304–305

networking

inter-processor communication, 306–309

internetworking, 309–310

OS fundamentals, 292–293

processes, threads and interrupts, 294–298

real-time constraints, 293–298

scheduling, 310–329

blocking vs. non-blocking jobs, 312

cooperative scheduling, 312–313

deadline monotonic, 323

disabling, 328

dynamic priority, 323–325

multicore considerations, 313

offline scheduling, 314–320

offline vs. online, 325

online scheduling, 321

preemptable vs. non-preemptable scheduling, 312

priority ceiling, 329

priority inheritance, 328

priority inversion, 325–329

rate monotonic, 321–323

reference model, 311–312

static priority, 321–323

types of, 313

software interrupt, 297–298

tools support for, 329–331

DSP processor, 36f, 48–49

DSP RTOS component architecture, 53f

DSP SoCs, 57–59

advanced, 58f

visibility, 60f

DSP Software Code Optimization, 281

DSP software development, 51–52

DSP starter kit, 59

DSP system

basic, 117f

basic I/O for, 131f

computing the channels, 49f, 50f

evaluation board, 59f

top eight to ten performance intensive algorithms, 58f

DSP VoIP framework differentiators, 551–569

DTMF detection and transmission, 551–557

Goertzel filters, 557–569

notch filters, 561–562

peak filters, 560–561

power estimation module, 562–563

sections, 552–557

DSP-based embedded system, 43

DSPFWAPI, 533–535

DTMF detection and transmission, 551–557

DTMF frequency allocation, 550f, 554t

Dual data rate (DDR), see DDR

Dual inline memory module (DIMM), 262

Dual tone multi frequency, 552

Dynamic host configuration protocol, 536–537

Echo cancelling, 547–548

Echo in telephone networks, 532–533

Echo processing, 515–520

Echo source, 532f

Eclipse based development environment, 381–382

EDF algorithm, 323, 324f

Electrical attenuation, 524

Embedded C, 176

Embedded digital signal processing, 337

Embedded systems, 6, 23–26, 29–30

C++ for, 176–177

characteristics of, 26

components, 24f

DSP, 31

DSP solution, 32f

lifecycle using DSP, 30–34

acceleration decisions, 41–46

basics and architecture, 44–46

code tuning and optimization, 53–54

development flow, 54–61

digital signal processors, 35–40

FPGA solutions, 34–40, 35f

general purpose processors (GPPs), 33

hardware components, 31

hardware gates, 31–32

input/output options, 48

microcontrollers, 33–34

models of, 46–53

needs of system, 30–31

performance, calculating, 48–51

product design, 30–31

signal processing solution, 40–41

SoC, 60f

software, 51–53

software programmable, 32–33

model of sensors and actuators, 25f

reactive systems, 25–26

real-time systems, 20

sequence enumeration, 589–595

system requirements, 587–597

Enea’s LINX, 307

Enhanced Full Rate (EFR), 423

Envelope detection, 515

Ethernet frames, 302–303

Ethernet switch subsystem, 303f

F# (f-number), 496

Fast Fourier transform (FFT), 47

Field programmable gate arrays (FPGAs), 31, 77

Filter frequency response, low pass, 121f

Filtering, 113

Finite impulse response (FIR), 118

filter, 171

C code, 172f

with intrinsic, 173f

re-written with intrinsic, 173f

using SPE intrinsics, 39f

FIR diagram, 45–46

signal flow graph for FIR filter, 45f

FIR filter, basic, 38f

Flex-Sphere, block diagram of, 82f

Flex-sphere tree traversal, 80

FORTRAN routines, 178

4G technologies, 76

FPGA-based system, 104

FPGA resource utilization, 85t, 86t

FPGA solutions, 35f

Freescale DSP cores, 187

Freescale MSC8156 series, 110

Freescale StarCore CodeWarrior

compiler, 171t, 172t, 174t

IDE, 249

Freescale StarCore DSPs, 170f

Freescale StarCore SC3850 DSP architecture, 65–66

Freescale’s MSC8157, 536, 605

Freescale’s SmartDSP OS, 300–301

Frequency division duplexing (FDD) mode, 92

Frequency shift, 504

Fresnel region, 495

Gantt chart, 147f

for loop unrolling, 148f

GateMutexPri module, 328

Gateway, 527

Gaussian processes, 504

General purpose processor (GPP), 31

Generated assembly code vs. example loop, 197

Get_Upper/Lower intrinsics, 39

Global System for Mobile Communications, 423

GNU GCC compiler, 218

Goertzel filters, 557–569

Graphic configuration tool, 331f

GSM voice codec, 280

Hall effect IC voltage, 248f

Hamming window frequency response, 123f

Hardware acceleration in DSP systems, 443

Hardware/software continuum, DSP, 97–99

application driven design, 111

application specific integrated circuits (ASICs), 103, 107–108

architectures, 110–111

embedded cores, general purpose, 109–110

FPGA, in embedded design, 104–107

algorithm suitability, 105

ASICs, advantages of, 108

computational throughput and power, 105

fixed point vs. floating point, 105–106

implementation challenges, 106–107

software programmable digital signal processing, 108–109

HDTV, 29–30

High level synthesis (HLS)

abstraction, 133

benefits of derive, 134–135

Catapult C, 135–141

matrix multiplication design, 145–149

for complex DSP applications, 133–134

high-level design tools, 135

language, 133–134

low-density parity-check (LDPC) codes

using PICO, 141–144, 144f

objective of, 134

analysis feedback, 134

RTL implementation, 134

verification artifacts, 134

PICO C-Synthesis, 138–140

pipeline of processing arrays (PPA), 138–140

RTL module, 137

System Generator, 140–141

QR decomposition design, 149–154

user specified constraints, 134

design hierarchy, 134

interface constraint, 134

memory architecture, 134

performance, 134

target hardware, 134

High level systems, 504–515

High speed serial interface (HSSI), 258

Hilbert transformation, 517

Host control application, 541

Host processor baseboard (PDK), 528f

HRPD (high rate packet data) system, 423

HSPA NodeB, 603

Hybrid automatic repeat request (HARQ), 603

IEEE rounding modes, 72

Imaging modes, 501–503

Impulse response, 115f

IMT-2000 initiative, 423

Infinite impulse response (IIR), 122–123

filters, 118

Inheritance algorithm, 329

Instruction set simulator (ISS), 166

Integrated circuit technology, 337

Integrated development environments (IDE), 337, 345–348, 346f, 381–382

default perspectives, 386

project panel, 386

Interactive voice response (IVR), 526

Internal components/functions, 535

International Telecommunication Union (ITU), 423

Internet engineering task force (IETF), 527

Inter-procedural optimizations, 170

Interrupt priority level (IPL), 253

Interrupt service routine (ISR), 296

IP based transport, 526–528

IP protection, 219

IP/Ethernet, 527

IPSec implementations, 310

ISDN, 523

ITU-T V.8, 544

Japanese-TACS (JTACS), 423

Job characteristics, 323t

Job parameters, 315t

Joint Test Action Group (JTAG), 25

JTAG connection, 330

K MRC blocks, 91

Legacy equipment, 545–551

Level control unit, 564

Level Control Unit (LCU) modules, 563–567

Linear interpolator, 564

Linux, 307

Log likelihood ratio (LLR), 143, 603

Log viewer, 330f

Long term evolution (LTE) systems, 423–425

advanced baseband hardware co-processors, 425

architecture, 424f, 425–446

barriers and locks for multi-core synchronization, 427f, 442–443

bit scrambling, 428–433

channel coding, 426–427

code block segmentation, 425–426

CRC generation and insertion, 425–426

creating set of jobs, 10

data modulation, 428–431

deadlock prevention and data protection, 441–442

DL physical layer processing, 437f

downlink channel, 425–443, 426f

dynamic scheduling, 445–446

eNodeB physical layer, 425

eNodeB shared data uplink processing chain, 452f

hardware acceleration, 443, 464

inter-core communication, 443–446

layer mapping and pre-coding, 431–433

load balancing, 442–443

multi-core digital signal processors, 438–441

OFDMA symbol generation, 433–434

parallelism and pipelining, 435f, 442–443

physical resource-block mapping module, 433–442

point to point message posting, 8f, 428f, 431

rate matching and hybrid ARQ functionality, 426–427

shared memory space and CACHE coherency, 428–429

static scheduling, 434f

sub-frame pipelining, 444f

system components and design, 434–438

triggering of sequential and parallel processes, 443

24 bit CRC (CRC24_B) insertion, 425–426

UL chain processing, 440

UL symbol level processing, 6

Loops

dependence analysis, 234, 235f

unrolling, 230

vectorization of, 233f

Low-density parity-check (LDPC) codes, 141–142

LTE eNodeB, 602–603

MAC address, 302–303

MAC instructions, 64

Magnetic resonance imaging (MRI), 493

MAPLE accelerator, 258

MATLAB functions, 124

MATLAB remez function, 121

Maximum-likelihood (ML) detector, 78

Maximum ratio combining (MRC) vector, 89–90

Media Channel, 534

Media gateway, 532–541

controller, 528

system software functionalities, 535–541

TDM to IP processing path, 541–545

Media processing element, 549

Medical devices, DSP for

beamforming, 505–507

Doppler effects, 503–504

echo processing, 515–520

high level systems, 504–515

imaging modes, 501–503

medical imaging, 493–494

medical ultrasound, 494

ultrasound, 494–499

Medical imaging, 493–494

Medical ultrasound, 494

images, 505

Memory layout optimization, 231–240

arrays of data structures, 236–238

data alignment’s rippling effects, 238–239

data types selection, 239–240

loop optimizations, for performance, 238

optimization efforts, 232–233

overview of, 232

pointer aliasing in C, 235–236

vectorization and dynamic code-compute ratio, 233–236

Memory management

barrier, 305f

memory allocation, 305–306

memory protection OS, 306

virtual memory and memory protection, 306

Memory management unit (MMU), 294

Memory optimization, 217–218

arrays format, structure of, 238f

auto-vectorizing compiler technology, 238

code size, 218–231

ABI, tuning, 221–226

compiler flags/flag mining, 218–219

compiling code, 226–231

size/performance tradeoffs, target ISA, 219–221

data structure, unit memory stride, 237f

example data structure, 236f

flag mining, 218

kernels, performance, 239–240

memory layout optimization, 231–240

arrays of data structures, 236–238

data alignment’s rippling effects, 238–239

data types selection, 239–240

loop optimizations, for performance, 238

optimization efforts, 232–233

overview of, 232

pointer aliasing in C, 235–236

vectorization and dynamic code-compute ratio, 233–236

restrict keyword, 236f

see also Memory layout optimization

Memory optimizations, 232

MessageHandler() function, 260

MEX file format, 179

Microcontroller, 36

Microcontroller solutions, 34f

Microprocessors (uP), 33

Min Finder, 83

Minimum mean squared error (MMSE), 603

MJPEG code, 257

M-Mode, 501, 501f

Mobile terminal, 599

Modified real-valued decomposition (M-RVD), 79, 84–85

ordering, 81–82

Modulo addressing mode, 70

Modulo scheduling, 67

Moore’s law, 24–25

Motion JPEG application, using MSC8144 DSP

AC coefficients, 371

design considerations, 372–373

discrete cosine transform (DCT), 370

Huffman coding, 372

inter-core communication, 373

JPEG encoding process, 369–372, 370f

Minimum Coded Units (MCUs), 369–370

output video stream, 373

quantization step, 371

run-length coding (RLC), 371–372

scheduling, 372–373

zig-zag reordering, 371

MPC5554, 36, 37f

MSC8144, 541

block diagram, 362f

Media Gateway for a voice over IP (VoIP) system, 364

memory system components, 363

MSC8156, 272, 509–511, 520

address generation units (AGU), 512

data arithmetic logic units (DALU), 512

MSC8156 block diagram, 300f

MSC8156ADS board, 260

MSC8156’s Maple, 299

MSC8157 device, 37

MSC815x series DSPs, 278–279

Multi Instruction and Multi Data model, 391

Multicore communication application programming interface (MCAPI), 309

Multicore processing models, 363–367

application memory map, 391–393

breakpoints, 408–409, 409f

build and link the application for, 389–403

Code Coverage view, 419–421, 420f, 421f

CodeWarrior connection, 404, 416–417

compiler configuration for application, 393–399

considerations, 364t

creating new connections, 404–405

Critical code menu, 418–419, 418f, 419f

debugger actions, 406–411

DPU workflow, 414, 414f

DSP (SDOS) operating system, 389–391, 391f

executing and debugging application, 403–411

hardware breakpoints, 409–410, 410f

linker configuration for application, 400–403

MMU configuration tool, 411, 413f

motion JPEG application, 369–373

multiple-single-cores software model, 364–366

Performance view, 421–422, 422f

porting guidelines, 367–379

project editing options, 390f

set-up launch configuration, 406, 407f

software analysis setup, 414–417, 415f

target configuration and verification, 411, 412f

Trace submenu, 417–418

tracing and profiling, 414–422, 417f

true-multiple-cores model, 366–367, 373–379

variable length instructions sets (VLES), 411

VTB location, 415–416, 416f

Multimedia Broadcast Multicast Services (MBMS), 425

Multiple input multiple output (MIMO)

antenna systems, 87

model, 78

techniques, 76

Multiple-single-cores software model, 364–366, 365f

advantages, 364–365, 365t

disadvantages, 365, 366t

general characteristics of an application, 366

Multiply-accumulate (MAC), 34, 128, 281–282

instruction, 269

Multiply-accumulate operations per second (MMACS), 363

Network coprocessor (NETCP) peripheral, 303

Network protocols, 539

New project, creating

demo, 383–386

Import dialogue, 386, 387f

project settings, 386

wizard, 383, 384f, 385f

workspace, 382, 383f

Nonrecurring engineering (NRE) costs, 32

NOP test, 249

Nordic Mobile Telephone Systems (NMT), 423

Notch filters, 561–562

Nyquist frequency, 7, 118

Nyquist limit, 504

Nyquist theorem, 6

reconstructed waveform, 7f

signal sample, 7f

OFDM system, 97, 109–110

Off-chip memory, 337

On chip emulator (OCE), 257

On-chip memory, 337

Optical channel (OC), 525

Optimization process, basic flow of, 170f

Optimizing DSP software, 157–158

build tools, protecting, 161–162

code placement, flexibility, 162

DSP kernel, isolating, 161–162

measurement, measuring, 165–168

excluding non-related events, 165

hardware measurement, 166–167

interrupts, 165

profiling results, 167–168

results, interpret, 168

runtime library code, 166

simulated measurement, 166

performance measurement, methods

hardware timers, 164

performance counter-based measurement, 164

profiler-based measurement, 164–165

time-based measurement, 164

system effects, 163

multicore/multidevice environment, execution, 163–165

RTOS overhead, 163

test harness inputs, outputs, and correctness checking, 159–161

true system behaviors, modeling, 162–163

cache effects, 162

memory latency, 163

writing, test harness, 158–161

Orthogonal frequency division multiplexing (OFDM), 76, 87–88

Packet accelerator (PA), 303

Packet-switching technology, 546

Parks-McClellan algorithm, 120–121

Partial Euclidean Distances (PEDs), 78–79

Partition, 601–602

PCM encoding, 547

Peak filters, 560–561

Peak to average power ratio (PAPR), 601

Performance accurate simulators (PACC), 166

Personal digital assistants (PDAs), 11

Phase distortions, 546–548

PHYSICAL banks, 263

Physical layer (PHY), 76

PICO, pipelined LDPC decoder architecture, 144f

PICO C-Synthesis, 138

system level design flow, 139f

Pipeline of processing arrays (PPA), 138–140

Pipelined System Generator block diagram, 84f

Plain Old Telephone Service (POTS), 523

Pointer aliasing, illustration of, 196

Porting guidelines, multicore processing models, 367–379

design considerations, 367–369

POSIX-style signal, 319

Power architecture code, 220

Power architecture cores, 219–220

Power consumption, software optimization, 11–12, 242

algorithmic optimization

compiler optimization levels, 280–281

eliminating recursion, 284–286

instruction packing, 281

loop unrolling, 281–282

software pipelining, 282–284

application’s, profiling, 249–251

average power, 245

cellular phone, 243, 252

clock and voltage control, 255–261

during application runtime, 259–261

at application start up, 258–259

in low power modes, 256–261

clock rate, 244

core component utilization, 250f

current flow, 244

data flow, 261–276

DDR overview, 262–264

memory accesses, reducing power consumption, 261–262

DDR data flow, 264–276

array merging, 275

cache coherency functions, 274–275

cache utilization, 270

compiler cache optimizations, 275–276

data transitions/power consumption, 270

DDR burst accesses, 267–268

DDR configuration, 267

explanation of locality, 271

interchanging, 275

memory layout for cache, 273–274

optimizing memory software data organization, 267

optimizing power by timing, 265–266

optimizing with interleaving, 265–266

set-associativity, explanation of, 272–273

SoC memory layout, 270

SRAM/cache data flow optimization, 268

SRAM power consumption and parallelization, 269–270

write back vs. write through caches, 274

eliminating recursion

low-power code sequences, 286

hardware support, 251–255

clock gating, 252

Freescale’s MSC815x low power modes, 253–254

low power modes, 251–252

power gating, 252

Texas Instruments C6000 low power modes, 256–261

leakage consumption, 248

measurement, 246–249

using ammeter, 246–247, 247f

using Hall Sensor type IC, 247

voltage regulator module (VRM) power supply controller ICs, 247–249

minimizing, 251–255

peripheral/communication utilization, 276–286

coprocessors, 278

to core communication, 279–280

DMA of data vs. CPU, 277–280

interrupt processing, 280

polling, 279–280

speed grades and bus width, 279

system bus configuration, 278–279

time based processing, 280

static vs. dynamic, 244–246

STOP/WAIT instructions, 253

understanding, 243–246

Power consumption savings

in PD modes, 261f

Power Doppler, 502, 502f

Power estimation module, 40

Power optimization techniques for DSP, 287t–288t

PPA architecture template, 139f

Precedence graph, 315f

PRECHARGE, 264

Priority inversion, 326f

Private branch exchanges (PBX), 526

Procedure inlining, 230, 231f

Processing elements (PE), 605

Processing node, implementation of, 152f

Processing with respect to shared channel data (PUSCH), 436

Processor, 605

Processor clock cycles, 220–221

Processor solutions, general purpose of, 33f

Programmable DSP architectures, 337

C data operations, 71–73

features of, 66f

DSP core/ISA, 63–69

DSP kernels, 65

predicated execution, 67–69

programmable DSP space, 64

SIMD operations, use of, 65–67

Freescale StarCore SC3850 DSP architecture, 65–66

memory architectures, 70–71

access sizes, 70–71

alignment issues, 71

Public switched telephone network (PSTN), 523, 525

architecture, 524f

Pulse code modulation (PCM), 525

Pulse repetition frequency (PRF), 504

Pulsed wave approach, 501f

PUSH/POP style, 226

PWM switching, 11

QR decomposition system, 153f

Quality of service (QoS), 59

QUICC Engine, 364, 372, 377–378

RAM ports, 110–111

Rate monotonic method, 321–322, 322t

Real-time development tools, 336

Real-time environments, 557

Real-time systems, 1–2, 5–6, 19–22

definition of, 15

DSP systems, 17–18

efficient execution/execution environment, 19–20

centralized resource allocation/management, 23

challenges, 20–22

initialization, 22–23

load distribution, 23

multi-processor systems, 22–23

processor interfaces, 23

recovering from failures, 22

resource management, 19–20

response time, 21–22

event characteristics, 19

execution environment, characteristics, 18

hard, 17–18

inputs and outputs, 16f

multi-processor system, 22

soft and hard, 16

vs. time-shared systems, 16–17

usefulness of results, 293f

Real-world signals, 1–2

Recursion, cost, 285

Resource elements (RE), 602–603

RF demodulation methods, 517f

RMA, see Rate monotonic analysis (RMA)

ROM boot code, 536

Rotating, implementation of, 151f

RPTB instruction, 129

RTL code generation, 137f

RTL design flow, 136f

RTL implementation, 133–134

RTOS overhead, 163

RTOSes, 314

RTTI functionality, 177

Run to completion procedure, 316

RX Beamforming, 504–505

SC3850 core, 170

SC3400 DSP, 363

SC3850 prefetch, 268–269

Scan conversion, 520

Scan lines, 499, 499f

Schnorr-Euchner (SE) ordering, 80

Serial Rapid I/O (SRIO), 276

Signal, 1–2

circular buffers of, 66f

Signal processing engine (SPE)

architecture of, 37f

DSP capabilities, 39f

Signal processing solution, 42f

Signal processing system, 4

Signal transmissions

diagram of, 98f

voltage, 8

Signaling System 7 (SS7), 526

Single instruction multiple data (SIMD)

architecture processing engine, 36

capability, 43

extensions, 64

functionality, 38

vector, 70–71

hardware, 234

Single-carrier frequency division multiplexing (SC-FDMA), 603

SmartDSP Operating System (SDOS), 296, 314, 317, 328–329, 374, 383

motion JPEG demo, 257f

SoC level memory configuration, 270

Software architecture, 607–609

control plane, 608f

Software defined radio (SDR), 599–609

functional architecture of base station, 601–605

joint architecture, 604–605

LTE eNodeB, 602–603

partition, 601–602

processor, 605

UMTS and HSPA NodeB, 603

software architecture, 607–609

Software development team, 575–576

Software development using DSPs, 335–336

Software interrupts (ISR), 297, 312

Software performance engineering (SPE), 65

assessment, 573

initial performance estimates, 573

measurement error, reducing, 581–583

project description, 571–583

tracking and reporting the metrics, 575–581

Software pipelining, 229–230, 282

SONET, see Synchronous optical network (SONET)

Source code, 69f

Space time codes (STC), 87

SPE, see Software performance engineering (SPE)

Spectral Doppler, 502–503, 503f

Spectrum analysis, 113

Speech coding algorithms, 529–530

SRAM memory, 261–262, 268

SRIO port, 279

Stages of DSP development process, 358–360

StarCore, 52, 298

StarCore cores

fractional and integer operations, 173

StarCore DSPs

full bus usage with quad-word move, 194

StarCore processors, 191

State-of-the-art smartphones, 599–600

STATUS, variable

runtime behavior of program, 68f

Subscriber loop carrier (SLC), 524

SWI, stack, 299f

Switched circuit network, 527

Symmetric multiprocessing (SMP) model, 361

Synchronous interface (CPRI), 607

Synchronous optical network (SONET), 525

SYS/BIOS, 298, 303, 324, 329

System architecture, 150f

System Generator

MATLAB M-code, 140

System implementation, 61

T1 frame format, 526f

Task control block (TCB), 298

Taylor approximation, 495, 497–498

T-carrier, 525

TCP/IP stack, 309

TDM interface, 316

TDM-IP channel, 534f, 543–544

TDM-IP media gateway, 528–531

TDM to IP path, 540f

Teager-Kaiser algorithm, 563–564

Telephone networks, echo in, 527

3rd generation partnership project (3GPP), 75–76, 423

32-Bit embedded power architecture device, 219

Threading characteristics, 313t

3G Media Gateway, 527

3GPP WCDMA, 603

Time division multiplexed (TDM) link, 525

Time slot interchange (TSI) devices, 530

TIPC (Inter process communication protocol), 307

TI’s KeyStone architecture, 303

TK loops, 567–569

TMS320C5500 assembler language, 127

TMS320C6000 Optimizing Compiler, 188

Total Access Communication Systems (TACS), 423

Transducers, 500f

True-multiple-cores model, 366–367, 367f

advantages, 367, 368t

CodeWarrior IDE, 374

data input/output process, 378–379

disadvantages, 367, 368t

implementation of, 373–379

initialization process, 377

inter-core communication, 377–378, 377f

Kernel Awareness plug-in module, 374

master-slave approach, 373, 374f

scheduler functionality, 375–376, 375f

SDOS operating system, 376

serialization, 378–379

WAIT state, 376

UCC Ethernet Controller (UEC), 541

Ultrasound, 494–499

design use, 507–515

Ultrasound imaging, 493

Doppler effects in, 503

limitation of, 494

Ultrasound system, 505f

Ultrasound transducers, 499–500

Unbounded priority inversion, 327f

The Unified Instrumentation Architecture (UIA), 105–106, 329

Universal Mobile Telecommunications System (UMTS), 32, 423

User interface functions, 13

User’s source code, 220

Variable length execution set (VLES), 268–269

Vectoring, implementation of, 151f

Verilog HDL (VHDL), 106, 135

Very Long Instruction Word (VLIW), 13–14

ALUs, 70–71

architecture, 64

Virtual circuits, 525

Voice activity detector, 543

Voice codec, 547

Voice processing, 529–530

software architecture of, 533f

Voice-band data (VBD) mode, 545–546

“Voice-to-voice” codec, 551

VoIP applications, DSP role in, 528–532

framework, 531–532

framework differentiators, 545–551

delay compensation mechanism, 548–551

legacy equipment, 545–551

phase distortions, 546–548

media gateway, 532–541, 545–546

system software functionalities, 535–541

TDM to IP processing path, 541–545

TDM-IP media gateway, 528–531

VoIP domain, 523–528

migration to IP based transport, 526–528

wired TDM telecom network, 523–526

Voltage ID (VID) parameters, 255

Voltage regulator modules (VRMs), 246, 255, 268

WARP nodes, 97

WARPLab setup, 96, 96f, 97f

experiment setup, 97f

Watchdog timer, 538f, 539

Waveform generators (WV), 601–602

WCDMA transmitter, 608f

Wideband CDMA (WCDMA) system, 423

WiMAX codebooks, 89–90, 92–93

channel quantization, 93t

WiMAX Frequency Division Duplexing (FDD) mode, 87

WiMAX standard, 91–92

WiMAX system, beamforming of, 94f

Wired TDM telecom network, 523–526

Wireless baseband software on multi-core, 448

adoption of advanced multi-core embedded platforms, 448

advantages, 459

Agile practices, 449

blocks and modules, 452–455

considerations, 459–461

DMA copy vs. MEMCPY, 490

migrating from single-core to multi-core SoCs, 461–472

modular software design, 449

P4080, example, 457–458, 472–484

parameters, 488

process principles, 449–451

quality principles, 449–451

refactoring, 450

reuse of software, 450–451

single core application, 455–457

software tools, 451

tips and tricks, 484–490

Wireless communications applications

code division multiple access (CDMA), 75–76

field programmable gate arrays (FPGAs), 77

flex-sphere detector, 79–81

tree traversal for, 79–81

4G technologies, 76

modified real-valued decomposition (M-RVD), 84–85

timing analysis, 85

modified real-valued decomposition (M-RVD) ordering, 81–82

multiple antenna (MIMO) system, 76f, 77–79

SDR handset detector, FPGA implementation of, 82–84

configurable design, 83–84

modulation order, 83–84

number of antennas, 83

PED computations, 82

simulation results, 86–87

third generation networks (3GPP), 75–76

WiMAX, beamforming for, 87–99

computational requirements and performance, 91–93

experiment setup, 97–99

WARPLab, experiments, 94–97

WARPLab framework, 94–97

wideband systems, 87–91

Xilinx FPGA implementation, 85–86

Wireless Open Access Research Platform (WARP), 94

with radio board, 95f

Write-back cache scheme, 274

Xilinx blockset, 140

Xilinx Blockset/Memory, 151

Xilinx System Generator implementation of Flex-Sphere detector, 88f
