Software Package of Lab Codes

1. INTRODUCTION

Processing is module allows running signal processing algorithms for the lab experiments

that are written in C. is module processes and returns data provided by the I/O Handler.

1.3 OVERVIEW OF ARM PROCESSOR ARCHITECTURE

ARM (Advanced RISC Machine) is the processing engine that is used in modern smartphones. The ARM architecture has been extensively used in embedded systems. Its designs are licensed and incorporated into a wide range of embedded systems and low-power mobile devices. The ARM architecture refers to a family of reduced instruction set computing (RISC) architectures produced by the company ARM. The most common architectures currently in use for mobile devices are the ARMv7 architecture, which supports 32-bit addressing/arithmetic, and the ARMv8 architecture, which supports 64-bit addressing/arithmetic. An overview of the ARMv7 architecture is provided next.

1.3.1 DATA FLOW AND REGISTERS

The RISC nature of the ARM architecture means that arithmetic operations take place in a load/store manner. Figure 1.2 shows a diagram of the data flow in an ARM core. ARM registers, all of uniform 32-bit width, consist of 13 general-purpose registers (r0 to r12) and 3 additional special-use registers: the stack pointer (SP or r13), which contains a pointer to the active stack; the link register (LR or r14), which stores the return address when a branch-with-link instruction is executed; and the program counter (PC or r15), which contains a pointer to the current instruction being executed. In addition, there is one special register called the Current Program Status Register (CPSR), which holds the Application Program Status Register (APSR) and additional processor state flags. APSR refers to the ALU status flag bits set by the previous instruction in bits 31 to 27 of CPSR. Starting with bit 31, these bits indicate negative, zero, carry, overflow, and saturation.
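The flag layout described above can be sketched in C as a small decoder for a 32-bit status word. This is only an illustration: the type `apsr_flags_t` and the functions `status_bit` and `decode_apsr` are hypothetical names introduced here, not part of any ARM header or of the lab code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the APSR condition flags within the 32-bit CPSR. */
enum {
    APSR_N_BIT = 31, /* negative   */
    APSR_Z_BIT = 30, /* zero       */
    APSR_C_BIT = 29, /* carry      */
    APSR_V_BIT = 28, /* overflow   */
    APSR_Q_BIT = 27  /* saturation */
};

/* Hypothetical container for the decoded flags (not an ARM-defined type). */
typedef struct {
    bool negative;
    bool zero;
    bool carry;
    bool overflow;
    bool saturation;
} apsr_flags_t;

/* Returns true if the given bit (0..31) is set in the status word. */
bool status_bit(uint32_t cpsr, unsigned bit)
{
    assert(bit < 32u);
    return ((cpsr >> bit) & 1u) != 0u;
}

/* Extracts the five APSR flags from bits 31 to 27 of a CPSR value. */
apsr_flags_t decode_apsr(uint32_t cpsr)
{
    apsr_flags_t f;
    f.negative   = status_bit(cpsr, APSR_N_BIT);
    f.zero       = status_bit(cpsr, APSR_Z_BIT);
    f.carry      = status_bit(cpsr, APSR_C_BIT);
    f.overflow   = status_bit(cpsr, APSR_V_BIT);
    f.saturation = status_bit(cpsr, APSR_Q_BIT);
    return f;
}
```

For example, a status word of 0x60000000 has only bits 30 and 29 set, so `decode_apsr` reports the zero and carry flags and clears the other three.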

The execution pipeline varies between different versions of the ARM architecture. Instructions can be either from the ARM instruction set, which consists of 32-bit instructions, or from the Thumb instruction set, which consists of 16-bit instructions providing a compact data processing capability.

Some other features of the ARM architecture include the barrel shifter, shown as part of the ALU in Figure 1.2, which is capable of performing logical left and right shifts, arithmetic right shifts, rotate right, and rotate right extended operations on operand B. Another feature is the ability to perform conditional execution. For instance, when decrementing an index as part of a loop, the test for zero can be performed with no overhead as part of the subtraction operation; the condition result is then used to break out of the loop. Other features such as the Advanced SIMD (NEON) coprocessor [12] will be discussed in later chapters. Interested readers can refer to [13] for additional and more detailed materials regarding the ARM architecture.
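To make the barrel shifter operations listed above concrete, the following C sketch models each of them on a 32-bit operand. The function names (`lsl32`, `lsr32`, `asr32`, `ror32`, `rrx32`) are hypothetical and chosen here for illustration; on an actual ARM core these operations are performed in hardware as part of an instruction, not by calling functions.

```c
#include <assert.h>
#include <stdint.h>

/* Logical shift left: vacated low bits are filled with zeros. */
uint32_t lsl32(uint32_t x, unsigned n)
{
    assert(n < 32u);
    return x << n;
}

/* Logical shift right: vacated high bits are filled with zeros. */
uint32_t lsr32(uint32_t x, unsigned n)
{
    assert(n < 32u);
    return x >> n;
}

/* Arithmetic shift right: vacated high bits are filled with the sign bit.
 * Written on unsigned values to avoid implementation-defined signed shifts. */
uint32_t asr32(uint32_t x, unsigned n)
{
    assert(n < 32u);
    uint32_t shifted = x >> n;
    if ((x & 0x80000000u) != 0u && n > 0u) {
        shifted |= (uint32_t)~(UINT32_MAX >> n); /* set the top n bits */
    }
    return shifted;
}

/* Rotate right: bits shifted out at the bottom re-enter at the top. */
uint32_t ror32(uint32_t x, unsigned n)
{
    n %= 32u;
    if (n == 0u) {
        return x;
    }
    return (x >> n) | (x << (32u - n));
}

/* Rotate right extended: a one-bit rotate through the carry flag.
 * carry_in (0 or 1) enters at bit 31; bit 0 would become the new carry. */
uint32_t rrx32(uint32_t x, unsigned carry_in)
{
    assert(carry_in <= 1u);
    return (x >> 1) | ((uint32_t)carry_in << 31);
}
```

The difference between the two right shifts shows up on negative values: shifting 0x80000000 right by 4 gives 0x08000000 logically and 0xF8000000 arithmetically.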


[Block diagram: memory, instruction decoder, address register with loop incrementer, operand A, operand B, accumulator, barrel shifter, multiply-accumulate unit, arithmetic logic unit with barrel shifter, result, pc]

Figure 1.2: ARM processor data flow.

[Pipeline diagram: Fetch (5 cycles), Decode (7 cycles), Instruction Issue, followed by the Integer and Branch, Multiply and Load/Store, and NEON/FPU paths, each ending in Write Back; 15 cycles for the integer pipeline, 4 cycles for Multiply/Load/Store, 2 to 10 cycles for NEON/FPU]

Figure 1.3: ARM Cortex-A15 instruction pipeline.


1.4 ORGANIZATION OF CHAPTERS

The chapters that follow are organized as follows. In Chapters 2 and 3, the smartphone software tools are presented, and the steps one needs to take in order to create a basic smartphone app are discussed. Chapter 2 covers the setup of the Android Studio programming environment, and Lab L1 shows the development of a “Hello World” app for Android smartphones.

Chapter 3 and Lab L2 are the counterparts of Chapter 2 and Lab L1 focusing instead on

the iOS operating system. Chapter 3 details the setup of the Xcode programming environment

and duplicates the “Hello World” app from Lab L1. It also includes the debugging tool for iOS

smartphones.

Chapter 4 introduces the topics of signal sampling and frame-based processing, and the steps that are required to interface with the A/D and D/A (analog-to-digital and digital-to-analog) converters for audio signal input and output on a smartphone target. As part of this process, the smartphone app shells for the Android and iOS smartphone platforms are covered in detail. The Java and Objective-C shells are discussed, and the steps to incorporate C codes are explained.

Labs L3 and L4 in Chapter 4 show how to sample an analog signal, process it, and produce an output in real-time on an Android and iOS smartphone target, respectively. Lab L3 covers the Android development environment, and Lab L4 the iOS development environment. These lab experiments involve processing a frame of signal samples captured by the smartphone microphone. The frame length can be altered by the user through a graphical user interface (GUI) settings menu. The sampling rate can also be altered depending on the sampling rates permitted by the A/D converter of the smartphone target used. It is normally possible to alter the sampling rate on a smartphone from 8 to 48 kHz. A low-pass FIR filter together with a user-specified delay are considered in this lab experiment. The delay is meant to simulate an additional signal processing algorithm running on the ARM processor of the smartphone. The delay can be changed by the user through the settings menu, adding processing time to the low-pass filtering time. By increasing the sampling frequency, that is, lowering the sampling time interval, data frames will get skipped and hence a real-time throughput cannot be met. Besides skipped frames noted on the GUI, one can hear both the original signal and the filtered signal through the speaker of the smartphone and notice the distortion caused by skipped frames due to the real-time demand. Distortion can also be experienced by increasing the processing time delay, thus demonstrating that a real-time throughput is a balance between computational complexity and computation rate. Processing of one frame of data needs to be done in less than N × dt sec in order to achieve a real-time throughput, where N denotes the frame length and dt the sampling time interval. For example, for a sampling rate of 8 kHz and a frame length of 256, the processing needs to be completed within 32 ms in order for all the frames to get processed without any frames getting skipped.
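The real-time constraint above (per-frame processing time below N × dt) can be written as a short C check. The function names `frame_budget_ms` and `meets_real_time` are hypothetical and used here only for illustration; they are not taken from the lab code.

```c
#include <assert.h>

/* Time available to process one frame, in milliseconds.
 * Budget = N * dt = N / fs, with N the frame length in samples
 * and fs the sampling rate in Hz. */
double frame_budget_ms(unsigned frame_length, double sampling_rate_hz)
{
    assert(sampling_rate_hz > 0.0);
    return 1000.0 * (double)frame_length / sampling_rate_hz;
}

/* Returns 1 if a measured per-frame processing time (in ms) fits within
 * the budget, and 0 if frames would be skipped. */
int meets_real_time(double processing_ms, unsigned frame_length,
                    double sampling_rate_hz)
{
    return processing_ms < frame_budget_ms(frame_length, sampling_rate_hz);
}
```

With a frame length of 256 and a sampling rate of 8 kHz, `frame_budget_ms` returns 32 ms, matching the example in the text. Raising the sampling rate to 48 kHz with the same frame length shrinks the budget to roughly 5.3 ms, which is why a processing load that was acceptable at 8 kHz can start to skip frames.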

In Chapter 5, fixed-point and floating-point number representations are discussed and their differences are pointed out. Lab L5 in Chapter 5 gives suggestions on how one may cope with the overflow problem. This lab experiment involves running an FIR filter on a smartphone using fixed-point arithmetic. Sixteen bits are used to quantize the double-precision floating-point filter coefficients generated by a filter design package. Due to quantization, the frequency response of the filter is affected. The quantization word length can be adjusted in the settings menu, and the deviation of the frequency response magnitude from that of the floating-point implementation can be observed in a graph displayed automatically in the user interface. In addition, due to quantization, overflows may occur depending on the number of coefficients. This experiment shows how scaling can be used to overcome overflows by scaling down input samples and scaling back up output samples generated by the filter.
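A minimal C sketch of the two ideas in this paragraph, coefficient quantization and input/output scaling, is given below. It assumes a Q15-style signed format and a 32-bit accumulator; the function names `quantize_coeff` and `fir_q15_scaled`, the rounding rule, and the condition relating the number of taps to the scaling shift are assumptions made for this illustration and are not taken from Lab L5.

```c
#include <assert.h>
#include <stdint.h>

/* Quantize a coefficient c (nominally in [-1, 1)) to a signed fixed-point
 * word of `bits` total bits (2..16), i.e., with bits-1 fractional bits.
 * Rounds to nearest and saturates at the representable range. */
int16_t quantize_coeff(double c, unsigned bits)
{
    assert(bits >= 2u && bits <= 16u);
    const double scale = (double)(1L << (bits - 1u)); /* 2^(bits-1) */
    double q = c * scale;
    q += (q >= 0.0) ? 0.5 : -0.5;          /* round half away from zero */
    if (q > scale - 1.0) q = scale - 1.0;  /* saturate high */
    if (q < -scale)      q = -scale;       /* saturate low  */
    return (int16_t)q;                     /* cast truncates toward zero */
}

/* One output sample of an n-tap Q15 FIR filter with input scaling.
 * h: Q15 coefficients; x: Q15 input samples, x[0] being the newest.
 * Each input is divided by 2^shift before the multiply-accumulate and the
 * result is multiplied back by 2^shift, trading precision for headroom.
 * Requiring n <= 2^shift keeps the 32-bit accumulator from overflowing. */
int16_t fir_q15_scaled(const int16_t *h, const int16_t *x,
                       unsigned n, unsigned shift)
{
    assert(h && x && shift <= 15u && n >= 1u && n <= (1u << shift));
    const int32_t down = (int32_t)1 << shift;
    int32_t acc = 0;                       /* Q30 accumulator */
    for (unsigned i = 0; i < n; ++i) {
        acc += (int32_t)h[i] * (int32_t)(x[i] / down);
    }
    int32_t y = (acc / 32768) * down;      /* back to Q15, undo scaling */
    if (y > INT16_MAX) y = INT16_MAX;      /* saturate the output */
    if (y < INT16_MIN) y = INT16_MIN;
    return (int16_t)y;
}
```

Calling `quantize_coeff` with fewer bits gives coarser coefficients, which is the source of the frequency response deviation mentioned above. The scaling in `fir_q15_scaled` discards the lowest `shift` bits of each input sample, so the headroom gained against overflow is paid for with reduced precision.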

Chapters 6 and 7 discuss common filters used in digital signal processing applications. Lab L6 in Chapter 6 covers FIR (finite impulse response) filtering and Lab L7 in Chapter 7 shows how adaptive filtering can be used to perform system identification. The experiment in Lab L7 exhibits adaptive filtering where an adaptive FIR filter based on the least mean squares (LMS) coefficient update is implemented to match the output of an IIR (infinite impulse response) filter. The error between the output of the adaptive FIR filter and the IIR filter for an input signal is measured and displayed on the smartphone screen in real-time as the app runs. Over time the error between the two outputs converges toward zero. The user can experiment with the rate of convergence by altering the adaptive filter order through the settings menu without needing to recompile the code. As the filter order is increased, it can be observed that the convergence rate also increases. The drawback of increasing the filter order, namely an increase in the processing time, can also be observed. This experiment allows one to see how a tradeoff between convergence rate and real-time throughput can be established.
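The LMS coefficient update mentioned above can be sketched as a single C function that processes one sample at a time. This is the standard textbook form of the update, w[i] += mu · e · x[i], written in floating point; the function name `lms_step`, the argument layout, and the step size parameter `mu` are assumptions for illustration and are not taken from the Lab L7 code.

```c
#include <assert.h>
#include <stddef.h>

/* One iteration of the LMS algorithm for an n-tap adaptive FIR filter.
 *   w  : adaptive filter coefficients, updated in place
 *   x  : the n most recent input samples, x[0] being the newest
 *   d  : desired sample, i.e., the output of the system being identified
 *   mu : step size controlling the speed and stability of adaptation
 * Returns the error e = d - y, where y is the adaptive filter output. */
double lms_step(double *w, const double *x, size_t n, double d, double mu)
{
    assert(w != NULL && x != NULL && n > 0);

    /* Adaptive filter output: inner product of coefficients and inputs. */
    double y = 0.0;
    for (size_t i = 0; i < n; ++i) {
        y += w[i] * x[i];
    }

    /* Error between the unknown system's output and the filter output. */
    const double e = d - y;

    /* Coefficient update along the negative gradient of the squared error. */
    for (size_t i = 0; i < n; ++i) {
        w[i] += mu * e * x[i];
    }
    return e;
}
```

In a system identification setup, the same input sample is fed to the unknown system and pushed into the delay line `x`, the unknown system's output is passed as `d`, and the returned error is what gets displayed. The cost per sample grows linearly with the number of taps `n`, which is the processing time penalty of a higher filter order noted above.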

Chapter 8 covers frequency domain transforms and their implementation using frame-based processing. Lab L8 explores the computational complexity of Fourier transform algorithms and shows the utilization of Fourier transform for solving linear systems. The first part of this lab experiment compares the computational complexity of discrete Fourier transform (DFT) and fast Fourier transform (FFT) by first computing the DFT directly, having the computational complexity of O(N^2), and then via FFT, having the computational complexity of O(N log N). In the second part of this lab, a filter is implemented in the frequency domain by using Fourier transform three times. Frequency domain filtering is done by complex multiplication between two transformed signals. This approach is observed to be more computationally efficient than convolution when the length of the filter is made long.
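The gap between the two complexities can be made tangible by counting complex multiplications. The C sketch below assumes a direct DFT that forms each of the N outputs as a sum of N products, and a textbook radix-2 FFT with log2(N) stages of N/2 butterflies each; an actual FFT implementation may use a different algorithm and skip trivial multiplications. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Complex multiplications for a direct N-point DFT:
 * N outputs, each a sum of N products. */
uint64_t dft_complex_mults(uint64_t n)
{
    return n * n;
}

/* Complex multiplications for a radix-2 FFT:
 * log2(N) stages of N/2 butterflies, one multiplication per butterfly.
 * n must be a power of two and at least 2. */
uint64_t fft_radix2_complex_mults(uint64_t n)
{
    assert(n >= 2u && (n & (n - 1u)) == 0u);
    uint64_t stages = 0;
    for (uint64_t m = n; m > 1u; m >>= 1) {
        ++stages; /* one stage per factor of two in n */
    }
    return (n / 2u) * stages;
}
```

For a frame of 1024 samples, these counts are 1,048,576 for the direct DFT against 5,120 for the radix-2 FFT, a ratio of about 205.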

Code efficiency issues are addressed in Chapter 9, in which optimization techniques, as well as the use of intrinsics to access hardware features of the ARM processor, are discussed. Lab L9 in this chapter provides a walkthrough of optimization techniques and their impact on a signal processing app. In this lab experiment, the steps one can take to speed up code execution on a smartphone target are covered. These steps include changing compiler settings, writing efficient C code, and using architecture-specific functions for the ARM processor.
