5. FIXED-POINT VS. FLOATING-POINT
L5.2 NEON SIMD COPROCESSOR
The NEON coprocessor can be accessed either through assembly instructions or through
C intrinsic functions. Here, intrinsics are used. NEON support is available when targeting
more recent ARM processors. On an Android target, it is enabled by setting
armeabi-v7a in the abiFilter directive and -mfpu=neon in the cFlags directive of the ndk
section of the main build.gradle file. NEON intrinsics can then be used by including the header
arm_neon.h in your code. On iOS targets, the only step required to use
NEON is including the arm_neon.h header. NEON is a vector-based (SIMD) coprocessor:
a single instruction performs the same operation on every element, or lane, of a vector.
A listing of NEON intrinsics is available in [4].
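As a minimal sketch of the lane-wise behavior described above, the following code multiplies two float arrays four lanes at a time when NEON is available and falls back to a scalar loop otherwise. The helper name mul_f32 and the USE_NEON macro are hypothetical names introduced only for illustration. The check relies on the __ARM_NEON (or older __ARM_NEON__) macro that GCC and Clang predefine when NEON is enabled.

```c
#include <stddef.h>

/* Include the NEON header only when the compiler reports NEON support. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define USE_NEON 1
#else
#define USE_NEON 0
#endif

/* Hypothetical helper: out[i] = a[i] * b[i] for n floats. */
void mul_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if USE_NEON
    /* One vmulq_f32 multiplies four lanes at once. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vmulq_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the leftover elements, or everything without NEON. */
    for (; i < n; ++i) {
        out[i] = a[i] * b[i];
    }
}
```

Keeping the scalar loop as the tail of the same function lets one source file build for both NEON and non-NEON targets.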
The following example performs a multiply-subtract operation
using the NEON coprocessor intrinsics:
float32x4_t operandA; //quadword register
float32x4_t operandB; //quadword register
float32x4_t operandC; //quadword register
float32x4_t temp; //quadword register
float32_t neonResult[4] = {0,0,0,0}; //result vector
float32_t neonInputA[4] = {1.0, 2.0, 3.0, 4.0}; //input vector
float32_t neonInputB[4] = {5.0, 6.0, 7.0, 8.0}; //input vector
float32_t neonInputC[4] = {9.0, 10.0, 11.0, 12.0}; //input vector
operandA = vld1q_f32(neonInputA); //load A into neon quadword register
operandB = vld1q_f32(neonInputB); //load B into neon quadword register
operandC = vld1q_f32(neonInputC); //load C into neon quadword register
temp = vmlsq_f32(operandA, operandB, operandC); //compute temp=A-B*C
vst1q_f32(neonResult, temp); //write back the result
In the above code, the variables of type float32x4_t refer to NEON registers. The type
specifies that the register holds four 32-bit floating-point numbers. Since these registers hold
a total of 128 bits, they are referred to as quadword (Q) registers. NEON registers containing 64
bits are referred to as doubleword (D) registers. The NEON register bank is described in more
detail in [5] and later in Chapter 9.
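The lane-wise arithmetic of the multiply-subtract intrinsic can be checked against a plain scalar loop on any machine. The sketch below models vmlsq_f32 as out[i] = a[i] - b[i] * c[i] for each lane. The function name mls_ref is a hypothetical helper, not part of the NEON API.

```c
#include <stddef.h>

/* Scalar model of a multiply-subtract: out[i] = a[i] - b[i] * c[i] per lane. */
void mls_ref(const float *a, const float *b, const float *c,
             float *out, size_t lanes)
{
    for (size_t i = 0; i < lanes; ++i) {
        out[i] = a[i] - b[i] * c[i];
    }
}
```

Called with the three input vectors declared in the listing, {1, 2, 3, 4}, {5, 6, 7, 8}, and {9, 10, 11, 12}, this reference yields {-44, -58, -74, -92}.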
If an instruction operates on quadword registers, the suffix “q” must be
added to the intrinsic name (as indicated above); otherwise, the registers are assumed
to be doubleword. The data type of the instruction is specified by an additional _{type}
suffix to the instruction. Supported types include 8-, 16-, 32-, and 64-bit signed and unsigned
integers, as well as 32-bit floating-point. A complete listing of data types is available in [6].
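The naming rule can be illustrated with two additions, sketched here under the assumption of a GCC or Clang toolchain. vadd_f32 has no “q” and operates on a doubleword register of two floats, while vaddq_s16 carries both the “q” and the _s16 type suffix and operates on a quadword register of eight 16-bit signed integers. The wrapper names add_f32x2 and add_s16x8 and the HAVE_NEON macro are hypothetical; the scalar branches are there only so the code also builds where NEON is absent.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Doubleword form (no "q"): adds 2 x 32-bit floats. */
void add_f32x2(const float *a, const float *b, float *out)
{
#if HAVE_NEON
    float32x2_t va = vld1_f32(a);      /* 64-bit D register */
    float32x2_t vb = vld1_f32(b);
    vst1_f32(out, vadd_f32(va, vb));
#else
    for (int i = 0; i < 2; ++i) out[i] = a[i] + b[i];
#endif
}

/* Quadword form ("q" plus _s16): adds 8 x 16-bit signed integers. */
void add_s16x8(const int16_t *a, const int16_t *b, int16_t *out)
{
#if HAVE_NEON
    int16x8_t va = vld1q_s16(a);       /* 128-bit Q register */
    int16x8_t vb = vld1q_s16(b);
    vst1q_s16(out, vaddq_s16(va, vb));
#else
    for (int i = 0; i < 8; ++i) out[i] = (int16_t)(a[i] + b[i]);
#endif
}
```

Note that the vector type changes with the suffixes: float32x2_t pairs with the doubleword _f32 intrinsics and int16x8_t with the quadword _s16 intrinsics.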