Chapter 2: Exploring LLVM's Build System Features

In the previous chapter, we saw that LLVM's build system is a behemoth: it contains hundreds of build files with thousands of interleaving build dependencies, not to mention targets that require custom build instructions for heterogeneous source files. This complexity drove LLVM to adopt some advanced build system features and, more importantly, a more structured build system design. In this chapter, our goal is to learn about some important directives so that we can write more concise and expressive build files for both in-tree and out-of-tree LLVM development.

In this chapter, we will cover the following main topics:

  • Exploring a glossary of LLVM's important CMake directives
  • Integrating LLVM via CMake in out-of-tree projects

Technical requirements

Similar to Chapter 1, Saving Resources When Building LLVM, you might want to have a copy of LLVM built from its source. Optionally, since this chapter touches on quite a lot of CMake build files, you may wish to install a syntax highlighting plugin for CMakeLists.txt (for example, VSCode's CMake Tools plugin). Most major IDEs and editors offer one off the shelf. Familiarity with basic CMakeLists.txt syntax is also helpful.

All the code examples in this chapter can be found in this book's GitHub repository: https://github.com/PacktPublishing/LLVM-Techniques-Tips-and-Best-Practices/tree/main/Chapter02.

Exploring a glossary of LLVM's important CMake directives

LLVM switched from GNU Autoconf to CMake because CMake offers more flexibility in choosing the underlying build system. Since then, LLVM has accumulated many custom CMake functions, macros, and rules tailored to its own needs. This section gives an overview of the most important and frequently used ones, and explains how and when to use them.

Using the CMake function to add new libraries

Libraries are the building blocks of the LLVM framework. However, when writing CMakeLists.txt for a new library, you shouldn't use the plain add_library directive found in ordinary CMakeLists.txt files, like this:

# In an in-tree CMakeLists.txt file…
add_library(MyLLVMPass SHARED
  MyPass.cpp) # Do NOT do this to add a new LLVM library

There are several drawbacks of using the vanilla add_library here, as follows:

  • As shown in Chapter 1, Saving Resources When Building LLVM, LLVM prefers to use a global CMake argument (that is, BUILD_SHARED_LIBS) to control whether all its component libraries should be built statically or dynamically. Hardcoding SHARED (or STATIC) in add_library, as in the preceding snippet, bypasses that switch.
  • Similarly, LLVM prefers to use global CMake arguments to control some compile flags, such as whether or not to enable Runtime Type Information (RTTI) and C++ exception handling in the code base.
  • By using custom CMake functions/macros, LLVM can create its own component system, which provides a higher level of abstraction for developers to designate build target dependencies in an easier way.

Therefore, you should always use the add_llvm_component_library CMake function shown here:

# In a CMakeLists.txt
add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp)

Here, LLVMFancyOpt is the final library name and FancyOpt.cpp is the source file.

In regular CMake scripts, you can use target_link_libraries to designate a given target's library dependencies, and then use add_dependencies to assign dependencies among different build targets to create explicit build orderings. There is an easier way to do those tasks when you're using LLVM's custom CMake functions to create library targets.

By using the LINK_COMPONENTS argument in add_llvm_component_library (or add_llvm_library, which is the underlying implementation of the former one), you can designate the target's linked components:

add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp
  LINK_COMPONENTS
  Analysis ScalarOpts)

Alternatively, you can do the same thing with the LLVM_LINK_COMPONENTS variable, which must be set before the function call:

set(LLVM_LINK_COMPONENTS
  Analysis ScalarOpts)

add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp)

Component libraries are ordinary libraries that LLVM treats as its official building blocks. They're also included in the monolithic libLLVM library if you choose to build it. Component names differ slightly from the actual library names (for example, the Analysis component corresponds to the LLVMAnalysis library). If you need the mapping from component names to library names, you can use the following CMake function:

llvm_map_components_to_libnames(output_lib_names
  <list of component names>)
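As a rough illustration of where this mapping is useful, the sketch below feeds the resulting names to the built-in target_link_libraries command. The target name simple-tool and the component list (support, core, irreader) are assumptions made for this example; they are not part of the chapter's sample code.

```cmake
# Translate component names into the library names to link against.
# The component names listed here are only examples.
llvm_map_components_to_libnames(llvm_libs support core irreader)

# Print the result to inspect the mapping.
message(STATUS "Mapped LLVM libraries: ${llvm_libs}")

# simple-tool is a hypothetical target created elsewhere with add_executable().
target_link_libraries(simple-tool PRIVATE ${llvm_libs})
```

The printed names depend on how LLVM was configured, so treat the message as a quick way to inspect the mapping rather than as a fixed result.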

If you want to link directly against an ordinary library (one that is not an LLVM component), you can use the LINK_LIBS argument:

add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp
  LINK_LIBS
  ${BOOST_LIBRARY})

To assign general build target dependencies to a library target (equivalent to add_dependencies), you can use the DEPENDS argument:

add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp
  DEPENDS
  intrinsics_gen)

intrinsics_gen is a common target representing the procedure of generating header files containing LLVM intrinsics definitions.
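The three arguments can be combined in a single call. The following is only a sketch that merges the preceding snippets; it assumes BOOST_LIBRARY has been set elsewhere (for example, by an earlier find_package call), which this chapter does not show.

```cmake
# In a CMakeLists.txt inside the LLVM source tree (illustrative only)
add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp

  # LLVM components this library links against
  LINK_COMPONENTS
  Analysis
  ScalarOpts

  # A non-LLVM library; BOOST_LIBRARY is assumed to be defined elsewhere
  LINK_LIBS
  ${BOOST_LIBRARY}

  # Build the intrinsics_gen target before building this library
  DEPENDS
  intrinsics_gen
  )
```

The order of the keyword sections does not matter; each keyword simply starts a new group of values.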

Adding one build target per folder

Many LLVM custom CMake functions have a pitfall that involves source file detection. Let's say you have a directory structure like this:

/FancyOpt
  |___ FancyOpt.cpp
  |___ AggressiveFancyOpt.cpp
  |___ CMakeLists.txt

Here, you have two source files, FancyOpt.cpp and AggressiveFancyOpt.cpp. As their names suggest, FancyOpt.cpp is the basic version of this optimization, while AggressiveFancyOpt.cpp is an alternative, more aggressive version of the same functionality. Naturally, you will want to split them into separate libraries so that users can choose if they wish to include the more aggressive one in their normal workload. So, you might write a CMakeLists.txt file like this:

# In /FancyOpt/CMakeLists.txt
add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp)

add_llvm_component_library(LLVMAggressiveFancyOpt
  AggressiveFancyOpt.cpp)

Unfortunately, this would generate error messages telling you something to the effect of Found unknown source AggressiveFancyOpt.cpp … when processing the first add_llvm_component_library statement.

LLVM's build system enforces a strict rule: all C/C++ source files in the same folder must be added to the same library, executable, or plugin. To fix this, move one of the files into a separate folder, like so:

/FancyOpt
  |___ FancyOpt.cpp
  |___ CMakeLists.txt
  |___ /AggressiveFancyOpt
       |___ AggressiveFancyOpt.cpp
       |___ CMakeLists.txt

In /FancyOpt/CMakeLists.txt, we have the following:

add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp)

add_subdirectory(AggressiveFancyOpt)

Finally, in /FancyOpt/AggressiveFancyOpt/CMakeLists.txt, we have the following:

add_llvm_component_library(LLVMAggressiveFancyOpt
  AggressiveFancyOpt.cpp)

These are the essentials of adding build targets for (component) libraries using LLVM's custom CMake directives. In the next two sections, we will show you how to add executable and Pass plugin build targets using a different set of LLVM-specific CMake directives.

Using the CMake function to add executables and tools

Similar to add_llvm_component_library, to add a new executable target, we can use add_llvm_executable or add_llvm_tool:

add_llvm_tool(myLittleTool
  MyLittleTool.cpp)

These two functions have the same syntax. However, only targets created by add_llvm_tool are included in the installation. There is also a global CMake variable, LLVM_BUILD_TOOLS, that enables or disables those LLVM tool targets.

Both functions also accept the DEPENDS argument to assign dependencies, similar to add_llvm_library, which we introduced earlier. Unlike the library functions, however, they take no LINK_COMPONENTS argument: the only way to designate components to link is the LLVM_LINK_COMPONENTS variable.
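To make the difference concrete, here is a minimal sketch of a tool that links two components and declares one target dependency. The choice of components and the dependency on intrinsics_gen are assumptions for illustration only.

```cmake
# Components are listed in this variable, which must be set before the call.
set(LLVM_LINK_COMPONENTS
  Support
  Analysis)

add_llvm_tool(myLittleTool
  MyLittleTool.cpp

  # Build the intrinsics_gen target before building this tool
  DEPENDS
  intrinsics_gen
  )
```

Replacing add_llvm_tool with add_llvm_executable leaves the rest of the snippet unchanged, since the two functions share the same syntax.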

Using the CMake function to add Pass plugins

We will cover Pass plugin development later in this book, but adding a build target for a Pass plugin is already straightforward. (Earlier LLVM versions required add_llvm_library with some special arguments.) We can simply use the following command:

add_llvm_pass_plugin(MyPass
  HelloWorldPass.cpp)

The LINK_COMPONENTS, LINK_LIBS, and DEPENDS arguments are also available here, with the same usage and behavior as in add_llvm_component_library.
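For instance, a plugin that should be built only after the intrinsics headers have been generated might be declared as follows. This is a sketch reusing the names from the preceding example; whether your plugin actually needs this dependency is an assumption.

```cmake
add_llvm_pass_plugin(MyPass
  HelloWorldPass.cpp

  # Build the intrinsics_gen target before building this plugin
  DEPENDS
  intrinsics_gen
  )
```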

These are some of the most common and important LLVM-specific CMake directives. Using these directives can not only make your CMake code more concise but also help synchronize it with LLVM's own build system, in case you want to do some in-tree development. In the next section, we will show you how to integrate LLVM into an out-of-tree CMake project, and leverage the knowledge we learned in this chapter.

In-tree versus out-of-tree development

In this book, in-tree development means contributing code directly to the LLVM project, such as fixing LLVM bugs or adding new features to the existing LLVM libraries. Out-of-tree development, on the other hand, either represents creating extensions for LLVM (writing an LLVM pass, for example) or using LLVM libraries in some other projects (using LLVM's code generation libraries to implement your own programming language, for example).

Understanding CMake integration for out-of-tree projects

Implementing your features in an in-tree project is good for prototyping, since most of the infrastructure is already there. However, there are many scenarios where pulling the entire LLVM source tree into your code base is not the best idea, compared to creating an out-of-tree project and linking it against the LLVM libraries. For example, suppose you want to create a small code refactoring tool using LLVM's features and open source it on GitHub. Telling developers to download a multi-gigabyte LLVM source tree along with your little tool would not be a pleasant experience.

There are at least two ways to configure out-of-tree projects to link against LLVM:

  • Using the llvm-config tool
  • Using LLVM's CMake modules

Both approaches help you sort out all the details, including header files and library paths. However, the latter creates more concise and readable CMake scripts, which is preferable for projects that are already using CMake. This section will show the essential steps of using LLVM's CMake modules to integrate it into an out-of-tree CMake project.

First, we need to prepare an out-of-tree (C/C++) CMake project. The core CMake functions/macros we discussed in the previous section will help us work our way through this. Let's look at our steps:

  1. We are assuming that you already have the following CMakeLists.txt skeleton for a project that needs to be linked against LLVM libraries:

    project(MagicCLITool)

    set(SOURCE_FILES
      main.cpp)

    add_executable(magic-cli
      ${SOURCE_FILES})

    Regardless of whether you're trying to create a project that generates an executable, like the one in the preceding code block, or other artifacts such as libraries or even LLVM Pass plugins, the biggest question now is how to obtain the include paths and library paths.

  2. To resolve the include paths and library paths, LLVM provides a standard CMake package interface, so you can use the find_package CMake directive to import various configurations, as follows:

    project(MagicCLITool)

    find_package(LLVM REQUIRED CONFIG)
    include_directories(${LLVM_INCLUDE_DIRS})
    link_directories(${LLVM_LIBRARY_DIRS})

    To make the find_package trick work, you need to supply the LLVM_DIR CMake variable while invoking the CMake command for this project:

    $ cmake -DLLVM_DIR=<LLVM install path>/lib/cmake/llvm …

    Make sure it points to the lib/cmake/llvm subdirectory under the LLVM install path.

  3. After resolving the include paths and library paths, it's time to link the main executable against LLVM's libraries. LLVM's custom CMake functions (for example, add_llvm_executable) will be really useful here. But first, CMake needs to be able to find those functions.

    The following snippet imports LLVM's CMake module (more specifically, the AddLLVM CMake module), which contains those LLVM-specific functions/macros that we introduced in the previous section:

    find_package(LLVM REQUIRED CONFIG)
    list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
    include(AddLLVM)

  4. The following snippet adds the executable build target using the CMake function we learned about in the previous section:

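    find_package(LLVM REQUIRED CONFIG)
    # Needed so that include(AddLLVM) can locate the module (see step 3)
    list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
    include(AddLLVM)

    set(LLVM_LINK_COMPONENTS
      Support
      Analysis)

    add_llvm_executable(magic-cli
      main.cpp)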

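  5. Adding a library target works the same way:

    find_package(LLVM REQUIRED CONFIG)
    # Needed so that include(AddLLVM) can locate the module (see step 3)
    list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
    include(AddLLVM)

    add_llvm_library(MyMagicLibrary
      lib.cpp
      LINK_COMPONENTS
      Support Analysis)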

  6. Finally, add the LLVM Pass plugin:

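    find_package(LLVM REQUIRED CONFIG)
    # Needed so that include(AddLLVM) can locate the module (see step 3)
    list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
    include(AddLLVM)

    add_llvm_pass_plugin(MyMagicPass
      ThePass.cpp)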

  7. In practice, you also need to be careful of LLVM-specific definitions and the RTTI setting:

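    find_package(LLVM REQUIRED CONFIG)

    add_definitions(${LLVM_DEFINITIONS})

    # Test the variable by name so the check stays valid even if it is unset
    if(NOT LLVM_ENABLE_RTTI)
      # For non-MSVC compilers
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-rtti")
    endif()

    # add_llvm_xxx stands for any of the add_llvm_* functions shown earlier
    add_llvm_xxx(source.cpp)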

    This is especially true for the RTTI part because, by default, LLVM is not built with RTTI support, but normal C++ applications are. If there is an RTTI mismatch between your code and LLVM's libraries, the build will fail, typically with link errors about missing typeinfo symbols.
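Putting the steps above together, a complete CMakeLists.txt for the executable might look like the following sketch. The minimum CMake version, the status messages, the MSVC branch (/GR- is MSVC's flag for disabling RTTI), and the separate_arguments call are additions that do not appear in the steps above. The last one follows the pattern used in LLVM's own CMake documentation, where LLVM_DEFINITIONS is split into a list before being passed to add_definitions.

```cmake
# Assumed minimum version; adjust it to what your LLVM release requires.
cmake_minimum_required(VERSION 3.13.4)
project(MagicCLITool CXX)

# Locate LLVMConfig.cmake; pass -DLLVM_DIR=... when configuring (step 2).
find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")

# Header and library search paths exported by LLVM.
include_directories(${LLVM_INCLUDE_DIRS})
link_directories(${LLVM_LIBRARY_DIRS})

# Split the definitions string into a list before adding it.
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})

# Match LLVM's RTTI setting to avoid a mismatch (step 7).
if(NOT LLVM_ENABLE_RTTI)
  if(MSVC)
    string(APPEND CMAKE_CXX_FLAGS " /GR-")
  else()
    string(APPEND CMAKE_CXX_FLAGS " -fno-rtti")
  endif()
endif()

# Make LLVM's custom functions available (step 3).
list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
include(AddLLVM)

# Declare the executable and the components it links against (step 4).
set(LLVM_LINK_COMPONENTS
  Support
  Analysis)

add_llvm_executable(magic-cli
  main.cpp)
```

Configure the project with the LLVM_DIR variable as described in step 2. A library or Pass plugin target would replace the final add_llvm_executable call with add_llvm_library or add_llvm_pass_plugin, as in steps 5 and 6.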

Despite the convenience of developing inside LLVM's source tree, sometimes, enclosing the entire LLVM source in your project might not be feasible. So, instead, we must create an out-of-tree project and integrate LLVM as a library. This section showed you how to integrate LLVM into your CMake-based out-of-tree projects and make good use of the LLVM-specific CMake directives we learned about in the Exploring a glossary of LLVM's important CMake directives section.

Summary

This chapter dug deeper into LLVM's CMake build system. We saw how to use LLVM's own CMake directives to write concise and effective build scripts, for both in-tree and out-of-tree development. Learning these CMake skills can make your LLVM development more efficient and provide you with more options to engage LLVM features with other existing code bases or custom logic.

In the next chapter, we will introduce another important piece of infrastructure in the LLVM project, LLVM LIT, which is an easy-to-use yet general framework for running various kinds of tests.
