In this recipe, we will see how tail call optimization is done in LLVM. Tail call optimization is a technique where the callee reuses the stack of the caller instead of adding a new stack frame to the call stack, hence saving stack space and the number of returns when dealing with mutually recursive functions.
We need to make sure of the following:
llc
tool must be installed in $PATH
tailcallopt
option must be enabled$ cat tailcall.ll declare fastcc i32 @tailcallee(i32 inreg %a1, i32 inreg %a2, i32 %a3, i32 %a4) define fastcc i32 @tailcaller(i32 %in1, i32 %in2) { %l1 = add i32 %in1, %in2 %tmp = tail call fastcc i32 @tailcallee(i32 inreg %in1, i32 inreg %in2, i32 %in1, i32 %l1) ret i32 %tmp }
llc
tool with the –tailcallopt
option on the test code to generate the assembly file with the tailcall-optimized code:$ llc -tailcallopt tailcall.ll
$ cat tailcall.s .text .file "tailcall.ll" .globl tailcaller .align 16, 0x90 .type tailcaller,@function tailcaller: # @tailcaller .cfi_startproc # BB#0: pushq %rax .Ltmp0: .cfi_def_cfa_offset 16 # kill: ESI<def> ESI<kill> RSI<def> # kill: EDI<def> EDI<kill> RDI<def> leal (%rdi,%rsi), %ecx # kill: ESI<def> ESI<kill> RSI<kill> movl %edi, %edx popq %rax jmp tailcallee # TAILCALL .Lfunc_end0: .size tailcaller, .Lfunc_end0-tailcaller .cfi_endproc .section ".note.GNU-stack","",@progbits
llc
tool, generate the assembly again but without using the -tailcallopt
option:$ llc tailcall.ll -o tailcall1.s
cat
command:$ cat tailcall1.s .text .file "tailcall.ll" .globl tailcaller .align 16, 0x90 .type tailcaller,@function tailcaller: # @tailcaller .cfi_startproc # BB#0: # kill: ESI<def> ESI<kill> RSI<def> # kill: EDI<def> EDI<kill> RDI<def> leal (%rdi,%rsi), %ecx # kill: ESI<def> ESI<kill> RSI<kill> movl %edi, %edx jmp tailcallee # TAILCALL .Lfunc_end0: .size tailcaller, .Lfunc_end0-tailcaller .cfi_endproc .section ".note.GNU-stack","",@progbits
Compare the two assemblies using a diff tool. We used the meld tool here:
The tail call optimization is a compiler optimization technique, which a compiler can use to make a call to a function and take up no additional stack space; we don't need to create a new stack frame for this function call. This happens if the last instruction executed in a function is a call to another function. A point to note is that the caller function now does not need the stack space; it simply calls a function (another function or itself) and returns whatever value the called function would have returned. This optimization can make recursive calls take up constant and limited space. In this optimization, the code might not always be in the form for which a tail call is possible. It tries and modifies the source to see whether a tail call is possible or not.
In the preceding test case, we see that a push-and-pop instruction is added due to tail call optimization. In LLVM, the tail call optimization is handled by the architecture-specific ISelLowering.cpp
file; for x86, it is the X86ISelLowering.cpp
file:
The code in function SDValue X86TargetLowering::LowerCall (…….) bool IsMustTail = CLI.CS && CLI.CS->isMustTailCall(); if (IsMustTail) { // Force this to be a tail call. The verifier rules are enough to ensure // that we can lower this successfully without moving the return address // around. isTailCall = true; } else if (isTailCall) { // Check if it's really possible to do a tail call. isTailCall = IsEligibleForTailCallOptimization(Callee, CallConv, isVarArg, SR != NotStructReturn, MF.getFunction()->hasStructRetAttr(), CLI.RetTy, Outs, OutVals, Ins, DAG);
The preceding code is used to call the IsEligibleForTailCallOptimization()
function when the tailcallopt
flag is passed. The IsEligibleForTailCallOptimization()
function decides whether or not the piece of code is eligible for tail call optimization. If it is, then the code generator will make the necessary changes.
18.118.37.154