The life of an LLVM IR instruction

In previous chapters, we saw how high-level language instructions, statements, logical blocks, function calls, loops, and so on get transformed into the LLVM IR. Various optimization passes then process the IR to make it more optimal. The IR generated is in the SSA form and, in abstract format, almost independent of any high- or low-level language constraints, which facilitates optimization passes running on it. There might be some optimizations that are target-specific and take place later, when the IR gets converted into machine instructions.

After we get an optimal LLVM IR, the next phase is to convert it into target-machine-specific instructions. LLVM uses the SelectionDAG approach to convert the IR into machine instructions. The Linear IR is converted into SelectionDAG, a DAG that represents instructions as nodes. The SDAG then goes through various phases:

  • The SelectionDAG is created out of LLVM IR
  • Legalizing SDAG nodes
  • DAG combine optimization
  • Instruction selection from the target instruction
  • Scheduling and emitting a machine instruction
  • Register allocation—SSA destruction, register assignment, and register spilling
  • Emitting code

All the preceding stages are modularized in LLVM.

C Code to LLVM IR

The first step is to convert the front end language example to LLVM IR. Let's take an example:

    int test (int a, int b, int c) {
            return c/(a+b);
    }

Its LLVM IR will be as follows:

define i32 @test(i32 %a, i32 %b, i32 %c) {
      %add = add nsw i32 %a, %b
      %div = sdiv i32 %add, %c
      return i32 %div
    }

IR optimization

The IR then goes through various optimization passes, as described in previous chapters. The IR, in the transformation phase, goes through the InstCombiner::visitSDiv() function in the InstCombine pass. In that function, it also goes through the SimplifySDivInst() function and tries to check whether an opportunity exists to further simplify the instruction.

LLVM IR to SelectionDAG

After the IR transformations and optimizations are over, the LLVM IR instruction passes through a Selection DAG node incarnation. Selection DAG nodes are created by the SelectionDAGBuilder class. The SelectionDAGBuilder::visit() function call from the SelectionDAGISel class visits each IR instruction for creating an SDAGNode node. The method that handles an SDiv instruction is SelectionDAGBuilder::visitSDiv. It requests a new SDNode node from the DAG with theISD::SDIV opcode, which then becomes a node in the DAG.

SelectionDAG legalization

The SelectionDAG node created may not be supported by the target architecture. In the initial phase of Selection DAG, these unsupported nodes are called illegal. Before the SelectionDAG machinery actually emits machine instructions from the DAG nodes, these undergo a few other transformations, legalization being one of the important phases.

The legalization of SDNode involves type and operation legalization. The target-specific information is conveyed to the target-independent algorithms via an interface called TargetLowering. This interface is implemented by the target and, describes how LLVM IR instructions should be lowered to legal SelectionDAG operations. For instance, x86 lowering is implemented in the X86TargetLowering interface. The setOperationAction() function specifies whether the ISD node needs to be expanded or customized by operation legalization. When SelectionDAGLegalize::LegalizeOp sees the expand flag, it replaces the SDNode node with the parameter specified in the setOperationAction() call.

Conversion from target-independent DAG to machine DAG

Now that we have legalized the instruction, SDNode should be converted to MachineSDNode. The machine instructions are described in a generic table-based fashion in the target description .td files. Using tablegen, these files are then converted into .inc files that have registers/instructions as enums to refer to in the C++ code. Instructions can be selected by an automated selector, SelectCode, or they can be handled specifically by writing a customized Select function in the SelectionDAGISel class. The DAG node created at this step is a MachineSDNode node, a subclass of SDNode that holds the information required to construct an actual machine instruction but is still in the DAG node form.

Scheduling instructions

A machine executes a linear set of instructions. So far, we have had machine instructions that are still in the DAG form. To convert a DAG into a linear set of instructions, a topological sort of the DAG can yield the instructions in linear order. However, the linear set of instructions generated might not result in the most optimized code, and may cause execution delays due to dependencies among instructions, register pressure, and pipeline stalling issues. Therein comes the concept of scheduling instructions. Since each target has its own set of registers and customized pipelining of the instructions, each target has its own hook for scheduling and calculating heuristics to produce optimized, faster code. After calculating the best possible way to arrange instructions, the scheduler emits the machine instructions in the machine basic block, and finally destroys the DAG.

Register allocation

The registers allocated are virtual registers after the machine instructions are emitted. Practically, an infinite number of virtual registers can be allocated, but the actual target has a limited number of registers. These limited registers need to be allocated efficiently. If this is not done, some registers have to be spilled onto the memory, and this may result in redundant load/store operations. This will also result in wastage of CPU cycles, slowing down the execution as well as increasing the memory footprint.

There are various register allocation algorithms. An important analysis is done when allocating registers—liveness of variables and live interval analysis. If two variables live in the same interval (that is, if there exists an interval interference), then they cannot be allocated the same register. An interference graph is created by analyzing liveness, and a graph coloring algorithm can be used to allocate the registers. This algorithm, however, takes quadratic time to run. Hence, it may result in longer compilation time.

LLVM employs a greedy approach for register allocation, where variables that have large live ranges are allocated registers first. Small ranges fit into the gaps of registers available, resulting in less spill weight. Spilling is a load-store operation that occurs because no registers are available to be allocated. Spill weight is the cost of operations involved in the spilling. Sometimes, live range splitting also takes place to accommodate variables into the registers.

Note that the instructions are in the SSA form before register allocation. Now, the SSA form cannot exist in the real world because of the limited number of registers available. In some types of architecture, some instructions require fixed registers.

Code emission

Now that the original high-level code has been translated into machine instructions, the next step is to emit the code. LLVM does this in two ways; the first is JIT, which directly emits the code to the memory. The second way is by using the MC framework to emit assembly and object files for all backend targets.The LLVMTargetMachine::addPassesToEmitFile function is responsible for defining the sequence of actions required to emit an object file. The actual MI-to-MCInst translation is done in the EmitInstruction function of the AsmPrinter interface. The static compiler tool, llc, generates assembly instructions for a target. Object file (or assembly code) emission is done by implementing the MCStreamer interface.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.17.176.72