1. INTRODUCTION
Processing is module allows running signal processing algorithms for the lab experiments
that are written in C. is module processes and returns data provided by the I/O Handler.
1.3 OVERVIEW OF ARM PROCESSOR ARCHITECTURE
ARM (Advanced RISC Machine) is the processing engine that is used in modern smartphones.
The ARM architecture has been extensively used in embedded systems. Its designs are licensed
and incorporated into a wide range of embedded systems and low-power mobile devices. The
ARM architecture refers to a family of reduced instruction set computing (RISC) architectures
produced by the company ARM. The most common architectures currently in use for mobile
devices are the ARMv7 architecture, which supports 32-bit addressing/arithmetic, and the ARMv8
architecture, which supports 64-bit addressing/arithmetic. An overview of the ARMv7
architecture is provided next.
1.3.1 DATA FLOW AND REGISTERS
The RISC nature of the ARM architecture means that arithmetic operations take place in a
load/store manner. Figure 1.2 shows a diagram of the dataflow in an ARM core. ARM registers,
which are all of uniform 32-bit width, consist of 13 general-purpose registers (r0 to r12) and
three additional special-use registers: the stack pointer (SP or r13), which contains a pointer to
the active stack; the link register (LR or r14), which stores the return address when a
branch-with-link instruction is executed; and the program counter (PC or r15), which contains a
pointer to the current instruction being executed. In addition, there is one special register
called the Current Program Status Register (CPSR), which holds the Application Program Status
Register (APSR) and additional processor state flags. APSR refers to the ALU status flag bits
set by the previous instruction in bits 31 to 27 of CPSR. Starting with bit 31, these values
indicate negative, zero, carry, overflow, and saturation.
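The flag layout described above can be illustrated with a minimal C sketch that extracts the five APSR bits from a 32-bit status word. The names apsr_flag and FLAG_N through FLAG_Q are hypothetical and chosen only for this illustration; the status word is assumed to have been obtained elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of the APSR flags within the 32-bit CPSR, as listed in the
   text: negative, zero, carry, overflow, saturation (bits 31 down to 27).
   The enumerator names are hypothetical. */
enum { FLAG_N = 31, FLAG_Z = 30, FLAG_C = 29, FLAG_V = 28, FLAG_Q = 27 };

/* Return 1 if the requested flag bit is set in the status word, else 0. */
static int apsr_flag(uint32_t cpsr, int bit)
{
    assert(bit >= FLAG_Q && bit <= FLAG_N);   /* only bits 27..31 are flags */
    return (int)((cpsr >> bit) & 1u);
}
```

For instance, a status word of 0x60000000 has the zero and carry flags set and the negative flag cleared.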
The execution pipeline varies between different versions of the ARM architecture.
Instructions can be either from the ARM instruction set, which consists of 32-bit instructions, or
from the Thumb instruction set, which consists of 16-bit instructions providing a compact data
processing capability.
Some other features of the ARM architecture include a barrel shifter, shown as part of the
ALU in Figure 1.3, which is capable of performing logical left and right shifts, arithmetic right
shifts, rotate right, and rotate right extended operations on operand B. Another feature is the
ability to perform conditional execution. For instance, when decrementing an index as part of a
loop, the test for zero can be performed with no overhead as part of the subtraction operation;
the condition result is then used to break out of the loop. Other features such as the Advanced
SIMD (NEON) coprocessor [12] will be discussed in later chapters. Interested readers can refer
to [13] for additional and more detailed materials regarding the ARM architecture.
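The loop example mentioned above can be sketched in C as a count-down loop. Writing the loop so that it terminates on zero gives the compiler the opportunity to fold the test into the flag-setting subtraction; whether it actually does so depends on the compiler and its optimization settings, so this is only an illustration. The function name sum_countdown is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum n samples using a loop that counts down to zero. On ARM, a compiler
   may implement the decrement and the zero test with a single flag-setting
   subtract followed by a conditional branch (an assumption about the
   generated code, not a guarantee). */
static int32_t sum_countdown(const int16_t *x, size_t n)
{
    int32_t acc = 0;
    assert(x != NULL || n == 0);
    for (; n != 0; n--) {      /* decrement and compare against zero */
        acc += *x++;
    }
    return acc;
}
```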
Figure 1.2: ARM processor data flow.
Figure 1.3: ARM Cortex-A15 instruction pipeline. [Diagram: Fetch (5 cycles) and Decode
(7 cycles) are followed by Instruction Issue, which feeds three paths, each ending in Write
Back: Integer and Branch, Multiply and Load/Store, and NEON/FPU. Diagram labels: 15 cycles for
integer pipeline; 4 cycles for multiply/load/store; 2 to 10 cycles for NEON/FPU.]
1.4 ORGANIZATION OF CHAPTERS
The chapters that follow are organized as follows. In Chapters 2 and 3, the smartphone software
tools are presented, and the steps one needs to take in order to create a basic smartphone app are
discussed. Chapter 2 covers the setup of the Android Studio programming environment, and
Lab L1 shows the development of a “Hello World” app for Android smartphones.
Chapter 3 and Lab L2 are the counterparts of Chapter 2 and Lab L1 focusing instead on
the iOS operating system. Chapter 3 details the setup of the Xcode programming environment
and duplicates the “Hello World” app from Lab L1. It also covers the debugging tool for iOS
smartphones.
Chapter 4 introduces the topics of signal sampling and frame-based processing, and the
steps that are required to interface with the A/D and D/A (analog-to-digital and digital-to-
analog) converters for audio signal input and output on a smartphone target. As part of this
process, the smartphone app shells for the Android and iOS smartphone platforms are covered
in detail. e Java and Objective-C shells are discussed, and the steps to incorporate C codes
are explained.
Labs L3 and L4 in Chapter 4 show how to sample an analog signal, process it, and produce
an output in real-time on an Android and iOS smartphone target, respectively. Lab L3
covers the Android development environment, and Lab L4 the iOS development environment.
These lab experiments involve processing a frame of signal samples captured by the smartphone
microphone. The frame length can be altered by the user through a graphical user interface
(GUI) settings menu. The sampling rate can also be altered depending on the sampling rates
permitted by the A/D converter of the smartphone target used. It is normally possible to alter
the sampling rate on a smartphone from 8 to 48 kHz. A low-pass FIR filter together with a
user-specified delay is considered in this lab experiment. The delay is meant to simulate an
additional signal processing algorithm running on the ARM processor of the smartphone. The delay
can be changed by the user through the settings menu, adding processing time to the
low-pass filtering time. When the sampling frequency is increased far enough (that is, the
sampling time interval is lowered), data frames get skipped and a real-time throughput can no
longer be met. Besides the skipped frames noted on the GUI, one can hear both the original signal
and the filtered signal through the speaker of the smartphone and notice the distortion caused by
skipped frames due to the real-time demand. Distortion can also be experienced by increasing the
processing time delay, thus demonstrating that a real-time throughput is a balance between
computational complexity and computation rate. Processing of one frame of data needs to be done
in less than $N \cdot dt$ seconds in order to achieve a real-time throughput, where $N$ denotes
the frame length and $dt$ the sampling time interval. For example, for a sampling rate of 8 kHz
and a frame length of 256, the processing needs to be completed within 256/8000 s = 32 ms in
order for all the frames to get processed without any frames getting skipped.
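The real-time constraint above amounts to a simple calculation, sketched here in C. The function names frame_budget_ms and meets_real_time are hypothetical, and the measured processing time is assumed to be supplied by the caller.

```c
#include <assert.h>

/* Time available to process one frame, in milliseconds.
   N * dt = N / fs, where fs is the sampling rate in Hz. */
static double frame_budget_ms(int frame_len, double fs_hz)
{
    assert(frame_len > 0 && fs_hz > 0.0);
    return 1000.0 * (double)frame_len / fs_hz;
}

/* Return 1 if a frame processed in proc_ms milliseconds finishes before the
   next frame arrives, else 0 (frames would be skipped). */
static int meets_real_time(int frame_len, double fs_hz, double proc_ms)
{
    return proc_ms < frame_budget_ms(frame_len, fs_hz);
}
```

With a frame length of 256, frame_budget_ms gives 32 ms at 8 kHz but only about 5.3 ms at 48 kHz, so a processing time of 20 ms meets the constraint at the lower rate and not at the higher one.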
In Chapter 5, fixed-point and floating-point number representations are discussed and
their differences are pointed out. Lab L5 in Chapter 5 gives suggestions on how one may cope
with the overflow problem. This lab experiment involves running an FIR filter on a smartphone
using fixed-point arithmetic. Sixteen bits are used to quantize the double-precision floating-point
filter coefficients generated by a filter design package. Due to quantization, the frequency
response of the filter is affected. The quantization word length can be adjusted in the settings
menu, and the deviation of the frequency response magnitude from that of the floating-point
implementation can be observed in a graph displayed automatically in the user interface.
In addition, due to quantization, overflows may occur depending on the
number of coefficients. This experiment shows how scaling can be used to overcome overflows
by scaling down input samples and scaling back up output samples generated by the filter.
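As a rough illustration of the coefficient quantization described above, the following C sketch rounds a coefficient in the range [-1, 1) to a signed fixed-point value with an adjustable word length. It is a sketch under stated assumptions and not the lab code: the function names quantize and dequantize are hypothetical, and rounding to nearest with saturation is one possible choice.

```c
#include <assert.h>
#include <stdint.h>

/* Quantize c (assumed to lie in [-1, 1)) to a signed fixed-point integer with
   'bits' total bits: one sign bit and bits-1 fractional bits. Rounds to the
   nearest level and saturates at the ends of the range. */
static int32_t quantize(double c, int bits)
{
    assert(bits >= 2 && bits <= 16);
    const double scale = (double)(1L << (bits - 1));   /* 2^(bits-1) */
    double s = c * scale;
    s += (s >= 0.0) ? 0.5 : -0.5;                      /* round half away from zero */
    if (s > scale - 1.0) return (int32_t)(scale - 1.0);
    if (s < -scale) return (int32_t)(-scale);
    return (int32_t)s;                                 /* truncation completes the rounding */
}

/* Map a quantized value back to a real number for comparison with c. */
static double dequantize(int32_t q, int bits)
{
    assert(bits >= 2 && bits <= 16);
    return (double)q / (double)(1L << (bits - 1));
}
```

Comparing dequantize(quantize(c, bits), bits) with c for different word lengths shows the coefficient error that leads to the frequency response deviation; with 4 bits, for example, a coefficient of 0.3 becomes 0.25.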
Chapters 6 and 7 discuss common filters used in digital signal processing applications.
Lab L6 in Chapter 6 covers FIR (finite impulse response) filtering and Lab L7 in Chapter 7
shows how adaptive filtering can be used to perform system identification. The experiment in
Lab L7 exhibits adaptive filtering where an adaptive FIR filter based on the least mean squares
(LMS) coefficient update is implemented to match the output of an IIR (infinite impulse
response) filter. The error between the output of the adaptive FIR filter and the IIR filter for an
input signal is measured and displayed on the smartphone screen in real-time as the app runs.
Over time the error between the two outputs converges toward zero. The user can experiment
with the rate of convergence by altering the adaptive filter order through the settings menu
without needing to recompile the code. As the filter order is increased, it can be observed that the
convergence rate also increases. The drawback of increasing the filter order, namely an increase
in the processing time, can also be observed. This experiment allows one to see how a tradeoff
between convergence rate and real-time throughput can be established.
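The LMS coefficient update mentioned above can be sketched as a single C function that performs one adaptation step. This is a minimal floating-point sketch of the standard LMS update rather than the lab implementation; the function name lms_step, the argument layout, and the step size parameter mu are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One LMS iteration (hypothetical helper).
   w  : adaptive FIR coefficients, length n, updated in place
   x  : the n most recent input samples, x[0] being the newest
   d  : desired sample, i.e., the output of the system being identified
   mu : step size controlling the adaptation speed
   Returns the error e = d - y computed before the coefficients are updated. */
static double lms_step(double *w, const double *x, size_t n, double d, double mu)
{
    double y = 0.0;
    assert(w != NULL && x != NULL && n > 0);
    for (size_t k = 0; k < n; k++) {
        y += w[k] * x[k];            /* adaptive filter output */
    }
    double e = d - y;                /* error against the desired output */
    for (size_t k = 0; k < n; k++) {
        w[k] += mu * e * x[k];       /* LMS coefficient update */
    }
    return e;
}
```

Calling lms_step once per input sample, with d taken from the output of the system to be identified, drives the returned error toward zero when the step size is small enough. The two loops over n also make the cost of a higher filter order visible.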
Chapter 8 covers frequency domain transforms and their implementation using frame-based
processing. Lab L8 explores the computational complexity of Fourier transform algorithms
and shows the utilization of the Fourier transform for solving linear systems. The first part
of this lab experiment compares the computational complexity of the discrete Fourier transform
(DFT) and the fast Fourier transform (FFT) by first computing the DFT directly, which has a
computational complexity of $O(N^2)$, and then via the FFT, which has a computational
complexity of $O(N \log N)$. In the second part of this lab, a filter is implemented in the
frequency domain by using the Fourier transform three times. Frequency domain filtering is done
by complex multiplication between two transformed signals. This approach is observed to be more
computationally efficient than convolution when the length of the filter is made long.
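To give a feel for the gap between the two complexity orders, the following C sketch compares the growth terms N^2 and N log2 N for a power-of-two frame length. These are order-of-magnitude terms only, with constant factors ignored, and the function names are hypothetical.

```c
#include <assert.h>

/* Base-2 logarithm of n, where n is assumed to be a power of two. */
static unsigned int log2_pow2(unsigned long n)
{
    unsigned int p = 0;
    assert(n != 0 && (n & (n - 1)) == 0);   /* n must be a power of two */
    while (n > 1) {
        n >>= 1;
        p++;
    }
    return p;
}

/* Growth term of the direct DFT: N^2. */
static unsigned long dft_ops(unsigned long n)
{
    return n * n;
}

/* Growth term of a radix-2 FFT: N * log2(N). */
static unsigned long fft_ops(unsigned long n)
{
    return n * log2_pow2(n);
}
```

For a frame length of 256, the direct DFT term is 65,536 while the FFT term is 2,048, a ratio of 32 that widens as the frame length grows.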
Code efficiency issues are addressed in Chapter 9, in which optimization techniques, as
well as the use of intrinsics to access hardware features of the ARM processor, are discussed.
Lab L9 in this chapter provides a walkthrough of optimization techniques and their impact
on a signal processing app. In this lab experiment, the steps one can take to speed up code
execution on a smartphone target are covered. These steps include changing compiler settings,
writing efficient C code, and using architecture-specific functions for the ARM processor.