Chapter 8. Writing an LLVM Backend

In this chapter, we will cover the following recipes:

  • Defining registers and register sets
  • Defining the calling convention
  • Defining the instruction set
  • Implementing frame lowering
  • Printing an instruction
  • Selecting an instruction
  • Adding instruction encoding
  • Supporting a subtarget
  • Lowering to multiple instructions
  • Registering a target

Introduction

The ultimate goal of a compiler is to produce target code: assembly code that can be converted into object code and executed on actual hardware. To generate this assembly code, the compiler needs to know several aspects of the target machine's architecture, such as its registers, instruction set, calling convention, and pipeline. Many optimizations can also be performed in this phase.

LLVM has its own way of describing the target machine. It uses TableGen to specify the target registers, instructions, calling convention, and so on. TableGen is a tool rather than a function: it processes target description (.td) files, which makes it easier to describe a large set of architecture properties programmatically.

The LLVM backend is structured as a pipeline in which instructions pass through the following phases: from LLVM IR to SelectionDAG, then to MachineDAG, then to MachineInstr, and finally to MCInst.

The IR is first converted into a SelectionDAG (DAG stands for directed acyclic graph). SelectionDAG legalization follows, in which illegal operations are mapped onto the legal operations permitted by the target machine. After this stage, the SelectionDAG is converted to a MachineDAG, which is essentially the result of instruction selection: its nodes are instructions supported by the backend.

CPUs execute a linear sequence of instructions. The goal of the scheduling step is to linearize the DAG by assigning an order to its operations. LLVM's code generator employs heuristics (such as register pressure reduction) to try to produce a schedule that results in faster code. Register allocation policies also play an important role in the quality of the generated code.
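
To make the idea of linearizing a DAG concrete, here is a minimal sketch in C++. It is not LLVM's scheduler: the ToyDAG type and the linearize function are hypothetical names introduced only for illustration, and the ordering rule (Kahn's topological sort, breaking ties by lowest node index) stands in for the real heuristics, which also weigh factors such as register pressure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical toy DAG (not an LLVM type): Users[i] lists the nodes that
// consume the result of node i, so every edge runs from operand to user.
struct ToyDAG {
  std::vector<std::vector<std::size_t>> Users;
};

// Returns one valid linear order in which every node appears after all of
// its operands. Ties are broken by lowest node index, a stand-in for the
// scheduling heuristics a real code generator would apply.
std::vector<std::size_t> linearize(const ToyDAG &DAG) {
  const std::size_t N = DAG.Users.size();

  // Pending[i] = number of operands of node i that are not yet scheduled.
  std::vector<std::size_t> Pending(N, 0);
  for (const auto &Us : DAG.Users)
    for (std::size_t U : Us) {
      assert(U < N && "edge points outside the graph");
      ++Pending[U];
    }

  // Min-heap of nodes whose operands have all been scheduled.
  std::priority_queue<std::size_t, std::vector<std::size_t>,
                      std::greater<std::size_t>>
      Ready;
  for (std::size_t I = 0; I < N; ++I)
    if (Pending[I] == 0)
      Ready.push(I);

  std::vector<std::size_t> Order;
  while (!Ready.empty()) {
    std::size_t Node = Ready.top();
    Ready.pop();
    Order.push_back(Node);
    // Scheduling Node may make some of its users ready.
    for (std::size_t U : DAG.Users[Node])
      if (--Pending[U] == 0)
        Ready.push(U);
  }
  return Order; // Shorter than N only if the input contained a cycle.
}
```

For example, with two loads (nodes 0 and 1) feeding an add (node 2) whose result is stored (node 3), the only constraint is that each operation follows its operands. Any order satisfying that constraint is a legal schedule; the scheduler's job is to pick a good one.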

This chapter describes how to build an LLVM toy backend from scratch. By the end of this chapter, we will be able to generate assembly code for a sample toy backend.

A sample backend

The sample backend considered in this chapter is a simple RISC-type architecture with a few general-purpose registers (say, r0-r3), a stack pointer (sp), and a link register (lr) that stores the return address.

The calling convention of this toy backend is similar to that of the ARM architecture: arguments passed to a function are stored in registers r0 and r1, and the return value is stored in r0.
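
The convention just described can be sketched as a small argument-assignment routine in C++. This is only an illustration under stated assumptions: the ArgLocation type and the assignArgLocations and returnRegister functions are hypothetical names, and the handling of arguments beyond the first two (4-byte stack slots addressed from sp) is an assumption, since the text above only specifies r0, r1, and the return register. In an actual LLVM backend, rules of this kind are written in TableGen rather than hand-coded in this way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of where one argument lives at a call boundary.
struct ArgLocation {
  bool InRegister;         // true: passed in Reg; false: passed on the stack
  std::string Reg;         // register name, meaningful when InRegister
  std::size_t StackOffset; // byte offset from sp, meaningful when !InRegister
};

// Assigns the first two arguments to r0 and r1, as the toy convention states.
// ASSUMPTION: any further arguments go to consecutive 4-byte stack slots.
std::vector<ArgLocation> assignArgLocations(std::size_t NumArgs) {
  static const char *const ArgRegs[] = {"r0", "r1"};
  const std::size_t NumArgRegs = sizeof(ArgRegs) / sizeof(ArgRegs[0]);
  const std::size_t SlotSize = 4; // assumed slot size in bytes

  std::vector<ArgLocation> Locs;
  std::size_t NextOffset = 0;
  for (std::size_t I = 0; I < NumArgs; ++I) {
    if (I < NumArgRegs) {
      Locs.push_back({true, ArgRegs[I], 0});
    } else {
      Locs.push_back({false, "", NextOffset});
      NextOffset += SlotSize;
    }
  }
  assert(Locs.size() == NumArgs && "every argument gets exactly one location");
  return Locs;
}

// The return value is always placed in r0.
std::string returnRegister() { return "r0"; }
```

Under these assumptions, a call with three arguments would place the first in r0, the second in r1, and the third in the first stack slot, with the result coming back in r0.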
