134 9. CODE OPTIMIZATION
code size. e O0 option denotes no optimization flags; O1 enables a subset of options;
O2 enables more options adding to the ones enabled by O1; and O3 includes all the opti-
mizations added by O1 and O2. e option Ofast enables optimizations that may result in
variables getting truncated or rounded incorrectly for floating-point math operations. For most
cases, the O3 option produces the best computational efficiency outcome.
When using Android Studio, the options for C code libraries need to be set using the
build.gradle file of the app. The optimization flags can be set within the ndk block using the
cFlags directive. An example using the O3 optimization follows:
ndk {
    moduleName "yourLibrary"
    abiFilter "armeabi"
    ldLibs "log"
    cFlags "-O3"
}
When using Xcode, all options for C code libraries can be set within the Build Settings of
the app by changing the Optimization Level under the Apple LLVM 6.1—Code Generation
section.
9.4 EFFICIENT C CODE WRITING
The compiler automatically performs common code optimization changes, such as loop reversal
or changing division by a constant to multiplication by the reciprocal of the constant. Thus, it
may only be necessary to further improve code efficiency by refactoring or manually implementing
architecture-specific features such as SIMD instructions. Let us examine the changes that
can be made to the above linear convolution code to improve its computational efficiency or
performance.
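As an illustration of the reciprocal transformation mentioned above, the following minimal sketch writes both forms out by hand. The function names are hypothetical and chosen only for this example. Note that the two forms can differ in the last bit when the reciprocal is not exactly representable in floating point, so a compiler may apply this change to floating-point code only when it is exact or when relaxed math options are enabled.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward version: one floating-point division per sample. */
void scaleByDivision(float* data, size_t length, float divisor) {
    assert(data != NULL && divisor != 0.0f);
    for (size_t i = 0; i < length; i++) {
        data[i] = data[i] / divisor;
    }
}

/* Hand-transformed version: the reciprocal is computed once, outside the
   loop, and each sample is multiplied by it instead of being divided. */
void scaleByReciprocal(float* data, size_t length, float divisor) {
    assert(data != NULL && divisor != 0.0f);
    const float reciprocal = 1.0f / divisor;
    for (size_t i = 0; i < length; i++) {
        data[i] = data[i] * reciprocal;
    }
}
```

With a divisor that is a power of two, such as 4.0f, the reciprocal is exact and both functions produce identical results.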
For the FIR filter to work properly, it is required to store a sufficient number of previous
input samples in memory. Because the generic ARM processor does not support circular
buffering, this can be accomplished by using two loops to shift previous samples through an
array structure in memory as follows:
for(i=0; i<fir->numCoefficients; i++) {
    fir->window[i] = fir->window[fir->frameSize + i];
}
for(i=0; i<fir->frameSize; i++) {
    fir->window[fir->numCoefficients + i] = input[i];
}
The array window is stored in heap memory using the previously defined FIRFilter structure
as these values need to be retained between calls to the compute method. Memory allocation is
time-consuming and multiple repeated allocations should be avoided if possible.
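The allocate-once idea can be sketched as follows. The FIRFilter definition is not shown in this section, so the struct layout below is an assumption inferred from the fields the code uses, and newFIR, destroyFIR, and shiftWindow are hypothetical helper names introduced only for this example. All buffers are allocated a single time when the filter is created, and the per-frame function performs only the two shifting loops.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: the book's FIRFilter definition is not shown in this section. */
typedef struct {
    int numCoefficients;
    int frameSize;
    float* coefficients;
    float* window;  /* numCoefficients old samples, then frameSize new ones */
    float* result;
} FIRFilter;

/* Release everything newFIR allocated; safe to call with NULL. */
void destroyFIR(FIRFilter* fir) {
    if (fir == NULL) return;
    free(fir->coefficients);
    free(fir->window);
    free(fir->result);
    free(fir);
}

/* Allocate all buffers once, up front (hypothetical helper). */
FIRFilter* newFIR(const float* coefficients, int numCoefficients, int frameSize) {
    assert(coefficients != NULL && numCoefficients > 0 && frameSize > 0);
    FIRFilter* fir = calloc(1, sizeof *fir);
    if (fir == NULL) return NULL;
    fir->numCoefficients = numCoefficients;
    fir->frameSize = frameSize;
    fir->coefficients = malloc((size_t)numCoefficients * sizeof(float));
    fir->window = calloc((size_t)(numCoefficients + frameSize), sizeof(float));
    fir->result = calloc((size_t)frameSize, sizeof(float));
    if (!fir->coefficients || !fir->window || !fir->result) {
        destroyFIR(fir);
        return NULL;
    }
    for (int i = 0; i < numCoefficients; i++) {
        fir->coefficients[i] = coefficients[i];
    }
    return fir;
}

/* Per-frame work: the two shifting loops from the text, with no allocation. */
void shiftWindow(FIRFilter* fir, const float* input) {
    int i;
    for (i = 0; i < fir->numCoefficients; i++) {
        fir->window[i] = fir->window[fir->frameSize + i];
    }
    for (i = 0; i < fir->frameSize; i++) {
        fir->window[fir->numCoefficients + i] = input[i];
    }
}
```

In this sketch, newFIR would be called once before audio processing starts, shiftWindow once per incoming frame, and destroyFIR once when the filter is no longer needed.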
Another way to improve code performance is to reduce the logic necessary for the loop to
operate. Although the above two loops may appear fine, it still takes extra operations to compute
the array index and thus the memory address of the desired value. A method involving pointer
manipulation can be used as shown in the following code block:
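void computeFIR(FIRFilter* fir, float* input) {
    int i, j;
    float temp;
    float* windowPtr = fir->window;

    /* Shift the retained samples of the previous frame to the start of the window. */
    for(i=0; i<fir->numCoefficients; i++) {
        *windowPtr = windowPtr[fir->frameSize];
        windowPtr++;
    }
    /* windowPtr now points at the slot for the first new sample. */
    for(i=0; i<fir->frameSize; i++) {
        temp = 0;
        *windowPtr = input[i];
        /* Multiply-accumulate, walking backward through the window. */
        for(j=0; j<fir->numCoefficients; j++) {
            temp += windowPtr[-j] * fir->coefficients[j];
        }
        windowPtr++;
        fir->result[i] = temp;
    }
}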
Using this technique, the memory address of the array is loaded one time before variable
overwrites or computations take place. Coming out of the shifting loop, the pointer windowPtr
refers to the memory location of the first array index that receives a sample from the new frame
of audio data due to the post-update incrementing. Using the pointer also removes the need for
some logic to accomplish array indexing. In terms of actual instructions generated by the
compiler, this version of the code has six operations in the second loop as opposed to the original
version of the code having ten operations. Also note, unlike the previous case where the window
array was accessed from low index values to high index values, the window array is now being
accessed in reverse order.
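One way to gain confidence in the pointer-based version is to compare its output against the plain convolution sum, frame by frame. The sketch below repeats computeFIR so that it is self-contained; the FIRFilter layout is an assumption inferred from the fields the function uses, and convolveSample is a hypothetical reference helper that treats samples before the start of the signal as zero.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the book's FIRFilter definition is not shown in this section. */
typedef struct {
    int numCoefficients;
    int frameSize;
    float* coefficients;
    float* window;  /* numCoefficients + frameSize samples */
    float* result;  /* frameSize samples */
} FIRFilter;

/* Pointer-based version from the text. */
void computeFIR(FIRFilter* fir, float* input) {
    int i, j;
    float temp;
    float* windowPtr = fir->window;
    for (i = 0; i < fir->numCoefficients; i++) {
        *windowPtr = windowPtr[fir->frameSize];
        windowPtr++;
    }
    for (i = 0; i < fir->frameSize; i++) {
        temp = 0;
        *windowPtr = input[i];
        for (j = 0; j < fir->numCoefficients; j++) {
            temp += windowPtr[-j] * fir->coefficients[j];
        }
        windowPtr++;
        fir->result[i] = temp;
    }
}

/* Reference: y[n] = sum over j of h[j] * x[n - j], with x[k] = 0 for k < 0. */
float convolveSample(const float* x, const float* h, int numCoefficients, int n) {
    assert(x != NULL && h != NULL && n >= 0);
    float acc = 0.0f;
    for (int j = 0; j < numCoefficients; j++) {
        if (n - j >= 0) {
            acc += x[n - j] * h[j];
        }
    }
    return acc;
}
```

To use it, a zero-initialized window of numCoefficients + frameSize samples is attached to the filter, computeFIR is called once per frame of a longer signal, and each entry of result is compared with convolveSample evaluated at the matching position in the full signal.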
The instructions to compute the result can be generalized into core instructions, e.g., the
multiply-accumulate instruction in linear convolution. Supporting instructions, which add com-