In this recipe, we will briefly discuss a transformation pass that deals with memory optimization.
Write the test code for the memcpy optimization pass:

$ cat memcopytest.ll
@cst = internal constant [3 x i32] [i32 -1, i32 -1, i32 -1], align 4

declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind
declare void @foo(i32*) nounwind

define void @test1() nounwind {
  %arr = alloca [3 x i32], align 4
  %arr_i8 = bitcast [3 x i32]* %arr to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arr_i8, i8* bitcast ([3 x i32]* @cst to i8*), i64 12, i32 4, i1 false)
  %arraydecay = getelementptr inbounds [3 x i32], [3 x i32]* %arr, i64 0, i64 0
  call void @foo(i32* %arraydecay) nounwind
  ret void
}
Run the memcpyopt pass on the preceding test case:

$ opt -memcpyopt -S memcopytest.ll
; ModuleID = 'memcopytest.ll'

@cst = internal constant [3 x i32] [i32 -1, i32 -1, i32 -1], align 4

; Function Attrs: nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) #0

; Function Attrs: nounwind
declare void @foo(i32*) #0

; Function Attrs: nounwind
define void @test1() #0 {
  %arr = alloca [3 x i32], align 4
  %arr_i8 = bitcast [3 x i32]* %arr to i8*
  call void @llvm.memset.p0i8.i64(i8* %arr_i8, i8 -1, i64 12, i32 4, i1 false)
  %arraydecay = getelementptr inbounds [3 x i32], [3 x i32]* %arr, i64 0, i64 0
  call void @foo(i32* %arraydecay) #0
  ret void
}

; Function Attrs: nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1) #0

attributes #0 = { nounwind }
The memcpyopt pass eliminates memcpy calls wherever possible, or transforms them into other calls.
Consider this memcpy call:

call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arr_i8, i8* bitcast ([3 x i32]* @cst to i8*), i64 12, i32 4, i1 false)

In the preceding test case, the pass converts it into a memset call:

call void @llvm.memset.p0i8.i64(i8* %arr_i8, i8 -1, i64 12, i32 4, i1 false)

The rewrite is valid because every element of @cst is -1, so each of its 12 bytes is 0xFF, which is exactly the i8 -1 value passed to memset.
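If we look into the source code of the pass, in the MemCpyOptimizer.cpp file in llvm/lib/Transforms/Scalar, we find that this transformation is brought about by the processMemCpy function: when the source of a memcpy is a constant global whose initializer is a single repeated byte value, the copy is replaced with a memset of that byte.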
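The memcpy-to-memset rewrite in the test case depends on the source constant being one byte repeated over its whole size. The following is a minimal standalone C++ sketch of that check, not the pass's actual code: it works at run time on raw memory rather than at compile time on IR, it uses no LLVM API, and the names isRepeatedByte and copyOrFill are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): reports whether every byte of the
// object is identical and, if so, stores that byte in `byte`.
bool isRepeatedByte(const void *data, std::size_t size, unsigned char &byte) {
  if (size == 0)
    return false;
  const unsigned char *bytes = static_cast<const unsigned char *>(data);
  for (std::size_t i = 1; i < size; ++i)
    if (bytes[i] != bytes[0])
      return false;
  byte = bytes[0];
  return true;
}

// Copies `size` bytes from src to dst. When the source is one repeated byte,
// a fill is used instead of a copy. Returns true if the fill path was taken.
bool copyOrFill(void *dst, const void *src, std::size_t size) {
  unsigned char byte = 0;
  if (isRepeatedByte(src, size, byte)) {
    std::memset(dst, byte, size);
    return true;
  }
  std::memcpy(dst, src, size);
  return false;
}
```

For an array of three 32-bit integers all equal to -1, as in @cst, every byte is 0xFF, so copyOrFill takes the fill path. For an array such as {1, 2, 3} the bytes differ and the ordinary copy is used.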
The tryMergingIntoMemset function looks for other patterns to fold away while scanning forward over instructions. It looks for stores to neighboring memory locations and, on seeing enough consecutive ones, attempts to merge them into a single memset.
The processMemSet function looks for other memset calls adjacent to the current one, which lets the pass widen the memset call into a single larger store.
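Merging neighboring stores and widening adjacent memset calls both come down to combining byte ranges that are filled with the same value. The sketch below models that idea in standalone C++ under simplifying assumptions: each write is reduced to a hypothetical ByteFill record, instruction order and intervening reads or writes are ignored, and overlapping fills are assumed to carry the same value. The real pass works on IR and has to check all of these; ByteFill and mergeFills are illustrative names, not LLVM APIs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a write that fills [start, start + size) with `value`.
struct ByteFill {
  std::size_t start;
  std::size_t size;
  std::uint8_t value;
};

// Merges fills that write the same byte value to touching or overlapping
// ranges, so that a run of small fills becomes one larger fill.
std::vector<ByteFill> mergeFills(std::vector<ByteFill> fills) {
  std::sort(fills.begin(), fills.end(),
            [](const ByteFill &a, const ByteFill &b) { return a.start < b.start; });
  std::vector<ByteFill> merged;
  for (const ByteFill &f : fills) {
    if (!merged.empty()) {
      ByteFill &last = merged.back();
      const std::size_t lastEnd = last.start + last.size;
      if (f.value == last.value && f.start <= lastEnd) {
        // Extend the previous fill to cover this one as well.
        last.size = std::max(lastEnd, f.start + f.size) - last.start;
        continue;
      }
    }
    merged.push_back(f);
  }
  return merged;
}
```

Given four-byte zero fills at offsets 0, 4, and 8, mergeFills returns a single 12-byte zero fill starting at offset 0. A fill at offset 16 stays separate because of the gap, and a fill with a different byte value is never merged with its neighbor.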
To see the details of the various types of memory optimization passes, go to http://llvm.org/docs/Passes.html#memcpyopt-memcpy-optimization.