LLVM intrinsics

An intrinsic function is a function built into the compiler. The compiler knows how to implement these functions in the most optimized way and replaces each call with a set of machine instructions for a particular backend. Often, the code for the function is inserted inline, thus avoiding the overhead of a function call. (In many cases, though, a library function is called instead. For example, for the functions listed in http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics we make a call to libc.) Other compilers call these built-in functions.

In LLVM, these intrinsics are introduced during code optimization at the IR level (intrinsics written in program code can be emitted by the frontend directly). Their names start with the prefix "llvm.", which is reserved in LLVM for intrinsic names. These functions are always external, and users cannot specify a body for them in their own code; they can only call them.
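
As a hedged illustration of the parenthetical remark above, the following sketch calls std::memset explicitly instead of relying on an optimization pass. The function name func_memset is hypothetical and is not part of the example used later in this section. With builtins enabled, Clang typically emits a call to llvm.memset for such code already in the frontend, although the exact IR depends on the Clang version and the flags used.

```cpp
#include <cstring>

// Hypothetical variant of the example function: the array is cleared
// with an explicit library call rather than with a loop.
int func_memset()
{
        int a[5];

        // Assumption: Clang treats memset as a builtin here, so the
        // frontend can emit llvm.memset directly. Passing -fno-builtin
        // would keep it as an ordinary call to the library function.
        std::memset(a, 0, sizeof(a));

        return a[0];
}
```

Generating IR for this function with clang -emit-llvm -S and searching the output for "llvm.memset" is one way to check what a particular Clang version does.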

In this section, we will not go into much detail. We will take an example and see how LLVM optimizes a certain part of the code with its own intrinsic functions.

Let's write a simple piece of code:

$ cat intrinsic.cpp
int func()
{
        int a[5];

        for (int i = 0; i != 5; ++i)
                a[i] = 0;

        return a[0];
}

Now use Clang to generate the IR file. The command below produces the intrinsic.ll file, which contains the unoptimized IR without any intrinsic function calls.

$ clang -emit-llvm -S intrinsic.cpp

Now, use the opt tool to optimize the IR at the O1 optimization level.

$ opt -O1 intrinsic.ll -S -o -
; ModuleID = 'intrinsic.ll'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: nounwind readnone uwtable
define i32 @_Z4funcv() #0 {
  %a = alloca [5 x i32], align 16
  %a2 = bitcast [5 x i32]* %a to i8*
  call void @llvm.memset.p0i8.i64(i8* %a2, i8 0, i64 20, i32 16, i1 false)
  %1 = getelementptr inbounds [5 x i32], [5 x i32]* %a, i64 0, i64 0
  %2 = load i32, i32* %1, align 16
  ret i32 %2
}

; Function Attrs: nounwind argmemonly
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) #1

The important optimization to note here is that the loop has been replaced by a call to the LLVM intrinsic function llvm.memset.p0i8.i64, which fills the array with the value 0. Intrinsic functions may be used to implement vectorization and parallelization in the code, leading to better code generation. For this call, the compiler might use the most optimized version of memset from the libc library, and it may omit the call completely if its result is never used.

The first argument in the call specifies the array "a", that is, the destination where the value is to be written. The second argument specifies the value to be filled in. The third argument specifies the number of bytes to fill (here 20, that is, five 4-byte integers). The fourth argument specifies the alignment of the destination. The last argument indicates whether this is a volatile operation. This is the signature used by the LLVM version shown here; newer LLVM releases drop the alignment argument and express alignment as an attribute on the pointer argument instead.
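
The meaning of the first three arguments can be sketched in plain C++. This is only an illustrative model, not how LLVM implements the intrinsic: the names model_memset, kByteCount, and func_model are hypothetical, and the alignment and volatile arguments are left out because they affect how the stores may be performed, not which bytes are written.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reference model of llvm.memset(dest, value, len, ...):
// store `value` into each of the `len` bytes starting at `dest`.
inline void model_memset(unsigned char* dest, unsigned char value, std::size_t len)
{
        for (std::size_t i = 0; i != len; ++i)
                dest[i] = value;
}

// Third argument in the IR above: five elements of a 32-bit integer.
constexpr std::size_t kByteCount = 5 * sizeof(std::int32_t);
static_assert(kByteCount == 20, "matches the i64 20 passed to llvm.memset");

int func_model()
{
        std::int32_t a[5];

        // dest = start of the array, value = 0, len = 20 bytes.
        model_memset(reinterpret_cast<unsigned char*>(a), 0, kByteCount);

        return a[0];
}
```

Calling func_model() returns 0, the same result as the original loop, because every byte of the array has been set to zero.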

LLVM provides many such intrinsic functions; the full list can be found at http://llvm.org/docs/LangRef.html#intrinsic-functions.
