We started from LLVM IR in the first section and converted it to SelectionDAG and then to MachineInstr. Now we need to emit this code. LLVM currently offers two ways to do so: the LLVM JIT and the MC layer. The JIT is the traditional way of generating code for a target on the fly, directly in memory. What we are more interested in here is the LLVM MC layer.
The MC layer is responsible for generating an assembly file or an object file from the MachineInstr passed on to it from the previous step. In the MC layer, instructions are represented as MCInst, which is lightweight: it carries far less information about the program than MachineInstr does.
Code emission starts with the AsmPrinter class, which is overridden by the target-specific AsmPrinter class. This class handles the general lowering process, converting MachineFunction functions into MC label constructs by making use of the target-specific MCInstLowering interface (for x86, this is the X86MCInstLower class in the lib/Target/X86/X86MCInstLower.cpp file).
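To make the idea of lowering concrete, here is a minimal sketch in Python. It is a toy model, not LLVM's API: the class names ToyMachineOperand, ToyMachineInstr, and ToyMCInst and the function lower_to_mcinst are hypothetical. The opcode and register numbers are the ones llc reports for MOV32rr in this chapter's example. The point is only that lowering keeps the opcode and plain operand values and drops the surrounding program context.

```python
from dataclasses import dataclass, field

# Toy model only: these classes are hypothetical and are not LLVM's API.

@dataclass
class ToyMachineOperand:
    kind: str              # "reg" or "imm"
    value: int
    is_def: bool = False   # code-generator bookkeeping the MC layer does not need
    is_kill: bool = False

@dataclass
class ToyMachineInstr:
    opcode: int
    operands: list
    parent_block: str = ""  # link back to the enclosing basic block
    debug_loc: str = ""

@dataclass
class ToyMCInst:
    opcode: int
    operands: list = field(default_factory=list)  # (kind, value) pairs only

def lower_to_mcinst(mi):
    """Keep the opcode and plain operand values; drop everything else."""
    return ToyMCInst(mi.opcode, [(op.kind, op.value) for op in mi.operands])

# MOV32rr with the opcode and register numbers from the chapter's example.
mi = ToyMachineInstr(
    opcode=1674,
    operands=[ToyMachineOperand("reg", 22, is_def=True),
              ToyMachineOperand("reg", 24, is_kill=True)],
    parent_block="BB#0",
)
mc = lower_to_mcinst(mi)
print(mc)
```

The lowered record no longer knows which block it came from or which operand is a definition, which is what "lightweight" means in practice.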
Now we have MCInst instructions, which are passed to the MCStreamer class for the next step: generating either the assembly file or the object code. Depending on the choice, MCStreamer makes use of its subclass MCAsmStreamer to generate assembly code or MCObjectStreamer to generate object code.
The target-specific MCInstPrinter is called by MCAsmStreamer to print the assembly instructions. To generate the binary code, the LLVM object code assembler is called by MCObjectStreamer. The assembler in turn calls MCCodeEmitter::EncodeInstruction() to generate the binary instructions.
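The split between the two streamers can be sketched as follows. This is a toy Python model under stated assumptions, not LLVM's implementation: ToyStreamer, ToyAsmStreamer, and ToyObjectStreamer are hypothetical names, and the two lookup tables stand in for the target-specific instruction printer and code emitter. The tables cover only CDQ and RETQ, with their standard x86 mnemonics and one-byte encodings.

```python
# Toy model only: these classes are hypothetical and are not LLVM's API.

# Stand-ins for the target-specific printer and code emitter, covering two
# operand-free x86 instructions.
ASM_TEXT = {"CDQ": "cltd", "RETQ": "retq"}
ENCODING = {"CDQ": bytes([0x99]), "RETQ": bytes([0xC3])}

class ToyStreamer:
    """Common interface: every streamer accepts one instruction at a time."""
    def emit_instruction(self, inst):
        raise NotImplementedError

class ToyAsmStreamer(ToyStreamer):
    """Text output: asks the printer for the textual form."""
    def __init__(self, printer):
        self.printer = printer
        self.lines = []

    def emit_instruction(self, inst):
        self.lines.append(self.printer[inst])

class ToyObjectStreamer(ToyStreamer):
    """Binary output: asks the encoder for bytes and appends them to a section."""
    def __init__(self, encoder):
        self.encoder = encoder
        self.section = bytearray()

    def emit_instruction(self, inst):
        self.section += self.encoder[inst]

# The same instruction stream feeds both back ends.
program = ["CDQ", "RETQ"]
asm = ToyAsmStreamer(ASM_TEXT)
obj = ToyObjectStreamer(ENCODING)
for inst in program:
    asm.emit_instruction(inst)
    obj.emit_instruction(inst)

print(asm.lines)
print(obj.section.hex())
```

Because both outputs are driven by the same instruction stream, the text and binary forms cannot drift apart at the level of which instructions are emitted.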
Note that the MC layer is one of the big differences between LLVM and GCC. GCC always outputs assembly and then needs an external assembler to transform this assembly into object files, whereas LLVM, using its own assembler, can emit the instructions in binary directly and, by wrapping them in the appropriate container, generate the object file itself. Because the text and binary forms are produced from the same MCInst stream, this keeps the two outputs consistent, and it also saves time over GCC by removing the calls to external processes.
Now let's take an example and look at the MC instructions that correspond to the assembly, using the llc tool. We use the same test code, the test.ll file used earlier in the chapter.
To view the MC instructions, we need to pass the -asm-show-inst command-line option to llc. It shows the MC instructions as comments in the assembly output.
$ llc test.ll -asm-show-inst -o -
    .text
    .file "test.ll"
    .globl test
    .align 16, 0x90
    .type test,@function
test:                               # @test
    .cfi_startproc
# BB#0:
    movl %edx, %ecx                 # <MCInst #1674 MOV32rr
                                    #  <MCOperand Reg:22>
                                    #  <MCOperand Reg:24>>
    leal (%rdi,%rsi), %eax          # <MCInst #1282 LEA64_32r
                                    #  <MCOperand Reg:19>
                                    #  <MCOperand Reg:39>
                                    #  <MCOperand Imm:1>
                                    #  <MCOperand Reg:43>
                                    #  <MCOperand Imm:0>
                                    #  <MCOperand Reg:0>>
    cltd                            # <MCInst #388 CDQ>
    idivl %ecx                      # <MCInst #903 IDIV32r
                                    #  <MCOperand Reg:22>>
    retq                            # <MCInst #2465 RETQ
                                    #  <MCOperand Reg:19>>
.Lfunc_end0:
    .size test, .Lfunc_end0-test
    .cfi_endproc
    .section ".note.GNU-stack","",@progbits
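If you want to inspect these comments programmatically, a small script is enough. The following Python sketch assumes the comment format shown in the output above; the regular expressions and the function name parse_mcinsts are illustrative and are not part of any LLVM tool. The sample text is a fragment of that output.

```python
import re

# A fragment of the llc -asm-show-inst output shown in this section.
SAMPLE = """\
movl %edx, %ecx # <MCInst #1674 MOV32rr
 # <MCOperand Reg:22>
 # <MCOperand Reg:24>>
cltd # <MCInst #388 CDQ>
"""

MCINST = re.compile(r"<MCInst #(\d+) (\w+)")
MCOPERAND = re.compile(r"<MCOperand (Reg|Imm):(-?\d+)>")

def parse_mcinsts(text):
    """Collect each MCInst comment together with the operands that follow it."""
    insts = []
    for line in text.splitlines():
        match = MCINST.search(line)
        if match:
            insts.append({"opcode": int(match.group(1)),
                          "name": match.group(2),
                          "operands": []})
        if insts:
            # Operands may sit on the same line or on continuation lines.
            for kind, value in MCOPERAND.findall(line):
                insts[-1]["operands"].append((kind, int(value)))
    return insts

parsed = parse_mcinsts(SAMPLE)
for inst in parsed:
    print(inst["name"], inst["operands"])
```

Opcode and register numbers are internal enumeration values that can change between LLVM versions, so treat them as labels rather than stable identifiers.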
We see the MCInst and MCOperand entries in the assembly comments. We can also view the binary encoding in the assembly comments by passing the -show-mc-encoding option to llc.
$ llc test.ll -show-mc-encoding -o -
    .text
    .file "test.ll"
    .globl test
    .align 16, 0x90
    .type test,@function
test:                               # @test
    .cfi_startproc
# BB#0:
    movl %edx, %ecx                 # encoding: [0x89,0xd1]
    leal (%rdi,%rsi), %eax          # encoding: [0x8d,0x04,0x37]
    cltd                            # encoding: [0x99]
    idivl %ecx                      # encoding: [0xf7,0xf9]
    retq                            # encoding: [0xc3]
.Lfunc_end0:
    .size test, .Lfunc_end0-test
    .cfi_endproc
    .section ".note.GNU-stack","",@progbits
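The encoding bytes can be checked by hand against the x86 instruction format. As a small worked example, the Python sketch below decodes the two-byte encoding [0x89,0xd1] reported for movl %edx, %ecx. Opcode 0x89 is the register-or-memory MOV whose source register is held in the reg field of the ModRM byte. The helper names split_modrm and decode_mov_rr are hypothetical, and the decoder deliberately handles only this one register-to-register form.

```python
# 32-bit general-purpose registers in x86 encoding order (0..7).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields: 2 + 3 + 3 bits."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def decode_mov_rr(encoding):
    """Decode a two-byte register-to-register MOV (opcode 0x89) to AT&T syntax."""
    opcode, modrm = encoding
    mod, reg, rm = split_modrm(modrm)
    if opcode != 0x89 or mod != 0b11:
        raise ValueError("only register-to-register MOV (0x89) is handled")
    # For 0x89 the reg field is the source and rm is the destination;
    # AT&T syntax prints the source first.
    return f"movl %{REG32[reg]}, %{REG32[rm]}"

print(decode_mov_rr([0x89, 0xD1]))  # movl %edx, %ecx
print(split_modrm(0xF9))            # (3, 7, 1)
```

The second call splits the ModRM byte of idivl %ecx: with opcode 0xf7, a reg field of 7 selects IDIV, and an rm field of 1 names ecx, which matches the listing.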