Writing a pass for memory optimization

In this recipe, we will briefly look at memcpyopt, a transformation pass that optimizes memory operations such as memcpy calls.

Getting ready

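For this recipe, you will need the opt tool installed. The test code uses typed pointers and a memcpy intrinsic with an explicit alignment argument, so use an LLVM release that still accepts this older IR syntax.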

How to do it…

Write the test code on which we will run the memcpy optimization pass:

$ cat memcopytest.ll
@cst = internal constant [3 x i32] [i32 -1, i32 -1, i32 -1], align 4

declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind
declare void @foo(i32*) nounwind

define void @test1() nounwind {
  %arr = alloca [3 x i32], align 4
  %arr_i8 = bitcast [3 x i32]* %arr to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arr_i8, i8* bitcast ([3 x i32]* @cst to i8*), i64 12, i32 4, i1 false)
  %arraydecay = getelementptr inbounds [3 x i32], [3 x i32]* %arr, i64 0, i64 0
  call void @foo(i32* %arraydecay) nounwind
  ret void
}

Run the memcpyopt pass on the preceding test case:

$ opt -memcpyopt -S memcopytest.ll
; ModuleID = 'memcopytest.ll'

@cst = internal constant [3 x i32] [i32 -1, i32 -1, i32 -1], align 4

; Function Attrs: nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) #0

; Function Attrs: nounwind
declare void @foo(i32*) #0

; Function Attrs: nounwind
define void @test1() #0 {
  %arr = alloca [3 x i32], align 4
  %arr_i8 = bitcast [3 x i32]* %arr to i8*
  call void @llvm.memset.p0i8.i64(i8* %arr_i8, i8 -1, i64 12, i32 4, i1 false)
  %arraydecay = getelementptr inbounds [3 x i32], [3 x i32]* %arr, i64 0, i64 0
  call void @foo(i32* %arraydecay) #0
  ret void
}

; Function Attrs: nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) #0

attributes #0 = { nounwind }

How it works…

The memcpyopt pass eliminates memcpy calls wherever possible, or transforms them into other, cheaper calls such as memset.

Consider this memcpy call:

call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arr_i8, i8* bitcast ([3 x i32]* @cst to i8*), i64 12, i32 4, i1 false)

In the preceding test case, this pass converts it into a memset call:

call void @llvm.memset.p0i8.i64(i8* %arr_i8, i8 -1, i64 12, i32 4, i1 false)
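To see why this rewrite preserves behavior, it can help to compare the two operations on ordinary memory. The following is a minimal C++ sketch, not part of the recipe: the function name memsetMatchesMemcpy is hypothetical, and the arrays simply mirror @cst and %arr from the test case.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative only: fills one buffer by copying the constant and another
// with memset, then reports whether the resulting 12 bytes are identical.
bool memsetMatchesMemcpy() {
  // Mirrors @cst: three 32-bit words with all bits set.
  static const std::int32_t cst[3] = {-1, -1, -1};

  std::int32_t viaCopy[3];
  std::int32_t viaSet[3];

  // Mirrors the original memcpy of 12 bytes from the constant.
  std::memcpy(viaCopy, cst, sizeof viaCopy);
  // Mirrors the replacement memset; only the low 8 bits (0xFF) are used.
  std::memset(viaSet, -1, sizeof viaSet);

  return std::memcmp(viaCopy, viaSet, sizeof viaCopy) == 0;
}
```

Because every byte of the constant is 0xFF, filling the destination with that byte produces the same memory contents as the copy, without reading @cst at all.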

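If we look into the source code of the pass, in the MemCpyOptimizer.cpp file in llvm/lib/Transforms/Scalar, we see that this transformation comes from the memcpy handling itself (the processMemCpy function in LLVM sources of this era). When the source of the copy is a constant global whose initializer consists of a single repeated byte, the memcpy is replaced with a memset of that byte. Here every byte of @cst is 0xFF, so the memset value is i8 -1.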
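The memset value i8 -1 is only valid because every byte of @cst is the same. A simplified model of that check might look like the following C++ sketch. It works on raw bytes rather than on LLVM constants, and the name repeatedByte is an assumption made for illustration, not an LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified model, not LLVM code: returns the byte value (0..255) that
// every position in [data, data + size) holds, or -1 if the bytes differ
// or the range is empty.
int repeatedByte(const void *data, std::size_t size) {
  assert(data != nullptr || size == 0);
  if (size == 0)
    return -1;
  const auto *bytes = static_cast<const std::uint8_t *>(data);
  for (std::size_t i = 1; i < size; ++i)
    if (bytes[i] != bytes[0])
      return -1;  // Not a single repeated byte: memset cannot express it.
  return bytes[0];
}
```

For the three words of @cst this yields 0xFF, which is i8 -1 when read as a signed byte. A constant such as [i32 1, i32 1, i32 1] would fail the check, because its bytes are not all equal, and the memcpy would be left alone.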

The tryMergingIntoMemset function looks for further patterns to fold away while scanning forward over instructions. It looks for stores to neighboring memory locations and, when it sees enough consecutive ones, attempts to merge them into a single memset.

The processMemSet function looks for other memsets or stores neighboring a given memset, which lets the pass widen that memset into a single larger one.
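The merging and widening described above can be pictured as coalescing byte ranges. The sketch below is a rough C++ model under stated assumptions: each store or memset is reduced to a hypothetical FillWrite (offset, size, fill byte) relative to one base pointer, writes are sorted by offset rather than visited in program order, and the legality and profitability checks of the real pass are omitted. None of these names come from LLVM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a store or memset that fills `size` bytes starting
// at `offset` with the byte `value`.
struct FillWrite {
  std::int64_t offset;
  std::int64_t size;
  std::uint8_t value;
};

// A merged region [begin, end) filled with `value`, built from `writes`
// original writes.
struct FillRange {
  std::int64_t begin;
  std::int64_t end;
  std::uint8_t value;
  int writes;
};

// Coalesces writes that touch or overlap and use the same fill byte.
std::vector<FillRange> mergeFills(std::vector<FillWrite> fills) {
  std::sort(fills.begin(), fills.end(),
            [](const FillWrite &a, const FillWrite &b) {
              return a.offset < b.offset;
            });

  std::vector<FillRange> ranges;
  for (const FillWrite &w : fills) {
    assert(w.size > 0);
    const std::int64_t end = w.offset + w.size;
    if (!ranges.empty() && ranges.back().value == w.value &&
        w.offset <= ranges.back().end) {
      // Adjacent or overlapping with the previous range: widen it.
      ranges.back().end = std::max(ranges.back().end, end);
      ++ranges.back().writes;
    } else {
      // A gap or a different fill byte starts a new range.
      ranges.push_back({w.offset, end, w.value, 1});
    }
  }
  return ranges;
}
```

In this model, three 4-byte zero stores at offsets 0, 4, and 8 collapse into one 12-byte range, which corresponds to a single memset, while a store at offset 16 stays separate because of the gap at bytes 12 to 15.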
