Lab Exercises

5. FIXED-POINT VS. FLOATING-POINT

L5.2 NEON SIMD COPROCESSOR

The NEON coprocessor can be accessed either with assembly instructions or with C intrinsic functions. Here, intrinsics are used. NEON support is available when targeting more recent ARM processors. On an Android target, it is enabled by setting armeabi-v7a in the abiFilter directive and -mfpu=neon in the cFlags directive of the ndk section of the main build.gradle file. NEON intrinsics can then be used by including the header arm_neon.h in your code. On iOS targets, the only step required to use NEON is including the arm_neon.h header. NEON is a vector-based coprocessor: a single instruction performs the same operation on every element, or lane, of a vector. A listing of NEON intrinsics is available in [4].
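To make the idea of lanes concrete, the following is a minimal sketch of a lane-wise addition. The helper name add_arrays is hypothetical and is not part of the lab code. The scalar loop is an added fallback (an assumption, so that the sketch also compiles on hosts without NEON) and also handles any elements left over after the groups of four.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    assert(a && b && out);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Each vaddq_f32 adds four lanes with a single intrinsic call. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar loop: the remaining tail, or everything when NEON is absent. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The vector loop and the scalar loop compute the same values; the vector loop simply processes four elements per iteration.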

The following example performs a multiply-subtract operation using NEON intrinsics:

float32x4_t operandA; //quadword register

float32x4_t operandB; //quadword register

float32x4_t operandC; //quadword register

float32x4_t temp; //quadword register

float32_t neonResult[4] = {0,0,0,0}; //result vector

float32_t neonInputA[4] = {1.0, 2.0, 3.0, 4.0}; //input vector

float32_t neonInputB[4] = {5.0, 6.0, 7.0, 8.0}; //input vector

float32_t neonInputC[4] = {9.0, 10.0, 11.0, 12.0}; //input vector

operandA = vld1q_f32(neonInputA); //load A into neon quadword register

operandB = vld1q_f32(neonInputB); //load B into neon quadword register

operandC = vld1q_f32(neonInputC); //load C into neon quadword register

temp = vmlsq_f32(operandA, operandB, operandC); //compute temp=A-B*C

vst1q_f32(neonResult, temp); //write back the result
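As a cross-check of the operation A - B*C, the sketch below wraps the multiply-subtract in a function and pairs it with a plain scalar equivalent. The name multiply_subtract4 and the scalar fallback are assumptions added for illustration. With A = {1, 2, 3, 4}, B = {5, 6, 7, 8}, and C = {9, 10, 11, 12}, each lane computes a - b*c, so the result is {-44, -58, -74, -92}.

```c
#include <assert.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] - b[i] * c[i] for four floats. */
void multiply_subtract4(const float *a, const float *b, const float *c,
                        float *out)
{
    assert(a && b && c && out);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);
    float32x4_t vc = vld1q_f32(c);
    vst1q_f32(out, vmlsq_f32(va, vb, vc)); /* a - b * c in every lane */
#else
    /* Scalar equivalent, used when NEON is not available. */
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] - b[i] * c[i];
    }
#endif
}
```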

In the above code, variables of type float32x4_t refer to NEON registers. The type specifies that the register holds four 32-bit floating-point numbers. Since these registers hold a total of 128 bits, they are referred to as quadword (Q) registers. NEON registers holding 64 bits are referred to as doubleword (D) registers. The NEON register bank is described in more detail in [5] and later in Chapter 9.
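The relationship between quadword and doubleword values can be illustrated with a short sketch that splits one 128-bit value into two 64-bit halves and sums the four lanes. The helper name horizontal_sum4 is hypothetical, and the scalar branch is an assumed fallback for hosts without NEON that mirrors the same order of additions.

```c
#include <assert.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns v[0] + v[1] + v[2] + v[3]. */
float horizontal_sum4(const float *v)
{
    assert(v);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t q  = vld1q_f32(v);       /* 128-bit Q value: four lanes    */
    float32x2_t lo = vget_low_f32(q);    /* 64-bit D value: lanes 0 and 1  */
    float32x2_t hi = vget_high_f32(q);   /* 64-bit D value: lanes 2 and 3  */
    float32x2_t s  = vadd_f32(lo, hi);   /* {v0 + v2, v1 + v3}             */
    s = vpadd_f32(s, s);                 /* pairwise add: total in lane 0  */
    return vget_lane_f32(s, 0);
#else
    /* Same order of additions as the NEON path. */
    return (v[0] + v[2]) + (v[1] + v[3]);
#endif
}
```

Here float32x2_t is the doubleword counterpart of float32x4_t, and the intrinsics that operate on it (vadd_f32, vpadd_f32) carry no "q".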

If an instruction operates on quadword registers, the suffix “q” must be added to the intrinsic name (as in the code above); otherwise, the registers are assumed to be doubleword. The data type of the instruction is specified by an additional _{type} suffix. Supported types include 8-, 16-, 32-, and 64-bit signed and unsigned integers, as well as 32-bit floating-point. A complete listing of data types is available in [6].
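The naming convention can be seen by writing the same addition for 16-bit signed integers in both register widths. In this sketch the helper names add4_s16 and add8_s16 are hypothetical, and the scalar branches are assumed fallbacks for hosts without NEON. The _s16 suffix selects the element type, and the presence or absence of "q" selects eight lanes (quadword) or four lanes (doubleword).

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: doubleword form, four 16-bit lanes (no "q"). */
void add4_s16(const int16_t *a, const int16_t *b, int16_t *out)
{
    assert(a && b && out);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int16x4_t va = vld1_s16(a);
    int16x4_t vb = vld1_s16(b);
    vst1_s16(out, vadd_s16(va, vb));
#else
    for (int i = 0; i < 4; ++i) {
        out[i] = (int16_t)(a[i] + b[i]);
    }
#endif
}

/* Hypothetical helper: quadword form, eight 16-bit lanes ("q" suffix). */
void add8_s16(const int16_t *a, const int16_t *b, int16_t *out)
{
    assert(a && b && out);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int16x8_t va = vld1q_s16(a);
    int16x8_t vb = vld1q_s16(b);
    vst1q_s16(out, vaddq_s16(va, vb));
#else
    for (int i = 0; i < 8; ++i) {
        out[i] = (int16_t)(a[i] + b[i]);
    }
#endif
}
```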
