LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries - Min-Yih Hsu - E-Book

Description

Every programmer or engineer, at some point in their career, works with compilers to optimize their applications. Compilers convert a high-level programming language into low-level machine-executable code. LLVM provides the infrastructure, reusable libraries, and tools needed for developers to build their own compilers. With LLVM’s extensive set of tooling, you can effectively generate code for different backends as well as optimize them.
In this book, you’ll explore the LLVM compiler infrastructure and understand how to use it to solve different problems. You’ll start by looking at the structure and design philosophy of important components of LLVM and gradually move on to using Clang libraries to build tools that help you analyze high-level source code. As you advance, the book will show you how to process LLVM IR – a powerful way to transform and optimize the source program for various purposes. Equipped with this knowledge, you’ll be able to leverage LLVM and Clang to create a wide range of useful programming language tools, including compilers, interpreters, IDEs, and source code analyzers.
By the end of this LLVM book, you’ll have developed the skills to create powerful tools using the LLVM framework to overcome different real-world challenges.

Page count: 434

Publication year: 2021

LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries

Design powerful and reliable compilers using the latest libraries and tools from LLVM

Min-Yih Hsu

BIRMINGHAM—MUMBAI

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Aaron Lazar

Publishing Product Manager: Shweta Bairoliya

Senior Editor: Nitee Shetty

Content Development Editor: Kinnari Chohan

Technical Editor: Pradeep Sahu

Copy Editor: Safis Editing

Project Coordinator: Deeksha Thakkar

Proofreader: Safis Editing

Indexer: Tejal Daruwale Soni

Production Designer: Jyoti Chauhan

First published: April 2021

Production reference: 1220421

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-83882-495-2

www.packt.com

To my parents.

– Min-Yih Hsu

Contributors

About the author

Min-Yih "Min" Hsu is a Ph.D. candidate in computer science at the University of California, Irvine. His research focuses on compiler engineering, code optimization, advanced hardware architectures, and system security. He has been an active member of the LLVM community since 2015 and has contributed numerous patches upstream. Min also dedicates his time to advocating LLVM and compiler engineering through various avenues, such as writing blog posts and giving talks at conferences. In his spare time, Min likes to explore a variety of different coffee beans and ways of brewing.

I want to thank the people who have supported me, especially my family and my academic advisors. I also want to thank the LLVM community for always being inclusive and kind to every member regardless of their background and origin.

About the reviewer

Suyog Sarda completed his B.Tech at the College of Engineering, Pune. His work so far has mostly been related to compilers. He is especially interested in the performance aspect of compilers. He has worked on a domain-specific language for image processing for DSPs.

LLVM's modularity makes it interesting to learn and implement quickly according to the compiler's requirement. However, the documentation of LLVM is scattered. He hopes this book provides a consolidated overview of the LLVM compiler infrastructure.

Table of Contents

Preface

Section 1: Build System and LLVM-Specific Tooling

Chapter 1: Saving Resources When Building LLVM

Technical requirements

Cutting down building resources with better tooling

Replacing GNU Make with Ninja

Avoiding the use of the BFD linker

Tweaking CMake arguments

Choosing the right build type

Avoiding building all targets

Building as shared libraries

Splitting the debug info

Building an optimized version of llvm-tblgen

Using the new PassManager and Clang

Using GN for a faster turnaround time

Summary

Further reading

Chapter 2: Exploring LLVM's Build System Features

Technical requirements

Exploring a glossary of LLVM's important CMake directives

Using the CMake function to add new libraries

Using the CMake function to add executables and tools

Using the CMake function to add Pass plugins

Understanding CMake integration for out-of-tree projects

Summary

Chapter 3: Testing with LLVM LIT

Technical requirements

Using LIT in out-of-tree projects

Preparing for our example project

Writing LIT configurations

LIT internals

Learning useful FileCheck tricks

Preparing for our example project

Writing FileCheck directives

Exploring the TestSuite framework

Preparing for our example project

Importing code into llvm-test-suite

Summary

Further reading

Chapter 4: TableGen Development

Technical requirements

Introduction to TableGen syntax

Layout and records

Bang operators

Multiclass

The DAG data type

Writing a donut recipe in TableGen

Printing a recipe via the TableGen backend

TableGen's high-level workflow

Writing the TableGen backend

Integrating the RecipePrinter TableGen backend

Summary

Further reading

Section 2: Frontend Development

Chapter 5: Exploring Clang's Architecture

Technical requirements

Learning Clang's subsystems and their roles

Driver

LLVM, assemblers, and linkers

Exploring Clang's tooling features and extension options

The FrontendAction class

Clang plugins

LibTooling and Clang Tools

Summary

Further reading

Chapter 6: Extending the Preprocessor

Technical requirements

Working with SourceLocation and SourceManager

Introducing SourceLocation

Introducing SourceManager

Learning preprocessor and lexer essentials

Understanding the role of the preprocessor and lexer in Clang

Understanding Token

Handling macros

Developing custom preprocessor plugins and callbacks

The project goal and preparation

Implementing a custom pragma handler

Implementing custom preprocessor callbacks

Summary

Exercises

Chapter 7: Handling AST

Technical requirements

Learning about AST in Clang

In-memory structure of Clang AST

Types in Clang AST

ASTMatcher

Writing AST plugins

Project overview

Printing diagnostic messages

Creating the AST plugin

Summary

Chapter 8: Working with Compiler Flags and Toolchains

Technical requirements

Understanding drivers and toolchains in Clang

Adding custom driver flags

Project overview

Declaring custom driver flags

Translating custom driver flags

Passing flags to the frontend

Adding a custom toolchain

Project overview

Creating the toolchain and adding a custom include path

Creating a custom assembling stage

Creating a custom linking stage

Verifying the custom toolchain

Summary

Exercises

Section 3: "Middle-End" Development

Chapter 9: Working with PassManager and AnalysisManager

Technical requirements

Writing an LLVM Pass for the new PassManager

Project overview

Writing the StrictOpt Pass

Working with the new AnalysisManager

Overview of the project

Writing the HaltAnalyzer Pass

Learning instrumentations in the new PassManager

Printing Pass pipeline details

Printing changes to the IR after each Pass

Bisecting the Pass pipeline

Summary

Questions

Chapter 10: Processing LLVM IR

Technical requirements

Learning LLVM IR basics

Iterating different IR units

Iterating instructions

Iterating basic blocks

Iterating the call graph

Learning about GraphTraits

Working with values and instructions

Understanding SSA

Working with values

Working with instructions

Working with loops

Learning about loop representation in LLVM

Learning about loop infrastructure in LLVM

Summary

Chapter 11: Gearing Up with Support Utilities

Technical requirements

Printing diagnostic messages

Collecting statistics

Using the Statistic class

Using an optimization remark

Adding time measurements

Using the Timer class

Collecting the time trace

Error-handling utilities in LLVM

Introducing the Error class

Learning about the Expected and ErrorOr classes

The Expected class

The ErrorOr class

Summary

Chapter 12: Learning LLVM IR Instrumentation

Technical requirements

Developing a sanitizer

An example of using an address sanitizer

Creating a loop counter sanitizer

Working with PGO

Introduction to instrumentation-based PGO

Introduction to sampling-based PGO

Using profiling data analyses

Summary

Assessments

Other Books You May Enjoy

Preface

A compiler is one of the most widely used tools among programmers. The majority of programmers have compilers, or some form of compilation technique, in their development flow. A modern compiler not only transforms high-level programming languages into low-level machine code, but also plays a key role in optimizing the speed, size, or even the memory footprint of the program it compiles. Given these responsibilities, building a production-ready compiler has always been a challenging task.

LLVM is a framework for compiler optimization and code generation. It provides building blocks that significantly reduce the efforts of developers to create high-quality optimizing compilers and programming language tools. One of its most famous products is Clang – the C-family language compiler that builds thousands of pieces of widely-used software including the Google Chrome browser and iOS apps. LLVM is also used in compilers for many different programming languages, such as the famous Swift programming language. It is not an exaggeration to say that LLVM is one of the hottest topics when it comes to creating a new programming language.

With hundreds of libraries and thousands of different APIs, LLVM provides a wide range of features, from key functionalities for optimizing a program to more general utilities. In this book, we provide a complete and thorough developer guide to two of the most important sub-systems in LLVM: Clang and the middle-end. We start with introductions to several components and development best practices that can benefit your general development experience with LLVM. Then, we will show you how to develop with Clang. More specifically, we will focus on the topics that help you augment and customize the functionalities in Clang. In the last part of this book, you will learn crucial knowledge about LLVM IR development. This includes how to write an LLVM Pass with the latest syntax and how to process different IR constructs. We also show you several utilities that can greatly improve your productivity in LLVM development. Last but not least, we don't assume any particular LLVM version in this book; we try to keep up to date and include the latest features from the LLVM source tree.

This book provides a handful of code snippets and sample projects in every chapter. You are encouraged to download them from the GitHub repository of this book and play around with your own customizations.

Who this book is for

This book is for people of all LLVM experience levels, with a basic understanding of compilers. If you are a compiler engineer who uses LLVM in your daily work, this book provides concise development guidelines and references. If you are an academic researcher, this book will help you learn useful LLVM skills and build your prototypes and projects in a short time. Programming language enthusiasts will also find this book useful when it comes to building a new programming language with the help of LLVM.

What this book covers

Chapter 1, Saving Resources When Building LLVM, gives a brief introduction to the LLVM project, before showing you how to build LLVM without draining your CPU, memory resources, and disk space. This paves the road to shorter development cycles and smoother experiences for later chapters.

Chapter 2, Exploring LLVM's Build System Features, shows you how to write CMake build scripts for both in-tree and out-of-tree LLVM development. You will learn crucial skills to leverage LLVM's custom build system features to write more expressive and robust build scripts.

Chapter 3, Testing with LLVM LIT, shows you how to run tests with LLVM's LIT infrastructure. The chapter not only gives you a better understanding of how testing works in LLVM's source tree but also enables you to integrate this intuitive, scalable testing infrastructure into any project.

Chapter 4, TableGen Development, shows you how to write TableGen – a special Domain Specific Language (DSL) invented by LLVM. We especially focus on using TableGen as a general tool for processing structural data, giving you flexible skills to use TableGen outside LLVM.

Chapter 5, Exploring Clang's Architecture, marks the start of our topics on Clang. This chapter gives you an overview of Clang, especially its compilation flow, and presents to you the role of individual components in Clang's compilation flow.

Chapter 6, Extending the Preprocessor, shows you the architecture of the preprocessor in Clang and, more importantly, shows you how to develop a plugin to extend its functionalities without modifying any code in the LLVM source tree.

Chapter 7, Handling AST, shows you how to develop with an Abstract Syntax Tree (AST) in Clang. It covers the key concepts needed to work with an AST's in-memory representation, followed by a tutorial on creating a plugin that inserts custom AST processing logic into the compilation flow.

Chapter 8, Working with Compiler Flags and Toolchains, covers the steps to add custom compiler flags and toolchains to Clang. Both skills are especially crucial if you want to support new features or new platforms in Clang.

Chapter 9, Working with PassManager and AnalysisManager, marks the start of our discussion on the LLVM middle-end. This chapter focuses on writing an LLVM pass – using the latest new PassManager syntax – and how to access program analysis data via AnalysisManager.

Chapter 10, Processing LLVM IR, is a big chapter containing a variety of core knowledge regarding LLVM IR, including the structure of LLVM IR's in-memory representation and useful skills to work with different IR units such as functions, instructions, and loops.

Chapter 11, Gearing Up with Support Utilities, introduces some utilities that can improve your productivity – such as having better debugging experiences – when working with LLVM IR.

Chapter 12, Learning LLVM IR Instrumentation, shows you how instrumentation works on LLVM IR. It covers two primary use cases: Sanitizer and Profile-Guided Optimization (PGO). For the former, you will learn how to create a custom sanitizer. For the latter, you will learn how to leverage PGO data in your LLVM Pass.

To get the most out of this book

This book is designed to bring you the latest features of LLVM, so we encourage you to use LLVM after version 12.0, or even the development branch – that is, the main branch – throughout this book.

We assume that you are working on Linux or Unix systems (including macOS). Tools and sample commands in this book are mostly run in the command-line interface, but you are free to use any code editors or IDEs to write your code.

In Chapter 1, Saving Resources When Building LLVM, we will provide details on how to build LLVM from source.

If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/LLVM-Techniques-Tips-and-Best-Practices-Clang-and-Middle-End-Libraries. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838824952_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "To include Clang in the build list, please edit the value assigned to the LLVM_ENABLE_PROJECTS CMake variable."

A block of code is set as follows:

TranslationUnitDecl 0x560f3929f5a8 <<invalid sloc>> <invalid sloc>

|…

`-FunctionDecl 0x560f392e1350 <./test.c:2:1, col:30> col:5 foo 'int (int)'

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

  |-ParmVarDecl 0x560f392e1280 <col:9, col:13> col:13 used c 'int'

  `-CompoundStmt 0x560f392e14c8 <col:16, col:30>

    `-ReturnStmt 0x560f392e14b8 <col:17, col:28>

      `-BinaryOperator 0x560f392e1498 <col:24, col:28> 'int' '+'

        |-ImplicitCastExpr 0x560f392e1480 <col:24> 'int' <LValueToRValue>

        | `-DeclRefExpr 0x560f392e1440 <col:24> 'int' lvalue ParmVar 0x560f392e1280 'c' 'int'

        `-IntegerLiteral 0x560f392e1460 <col:28> 'int' 1

Any command-line input or output is written as follows:

$ clang -fplugin=/path/to/MyPlugin.so … foo.cpp

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.

Section 1: Build System and LLVM-Specific Tooling

You will learn the advanced skills of developing LLVM's build system for both in-tree and out-of-tree scenarios. This section includes the following chapters:

Chapter 1, Saving Resources When Building LLVM

Chapter 2, Exploring LLVM's Build System Features

Chapter 3, Testing with LLVM LIT

Chapter 4, TableGen Development

Chapter 2: Exploring LLVM's Build System Features

In the previous chapter, we saw that LLVM's build system is a behemoth: it contains hundreds of build files with thousands of interleaving build dependencies, not to mention targets that require custom build instructions for heterogeneous source files. These complexities drove LLVM to adopt some advanced build system features and, more importantly, a more structured build system design. In this chapter, our goal is to learn about some important directives so that we can write more concise and expressive build files for both in-tree and out-of-tree LLVM development.

In this chapter, we will cover the following main topics:

Exploring a glossary of LLVM's important CMake directives

Integrating LLVM via CMake in out-of-tree projects

Technical requirements

Similar to Chapter 1, Saving Resources When Building LLVM, you might want to have a copy of LLVM built from its source. Optionally, since this chapter will touch on quite a lot of CMake build files, you may wish to prepare a syntax highlighting plugin for CMakeLists.txt (for example, VSCode's CMake Tools plugin). All major IDEs and editors should have it off-the-shelf. Also, familiarity with basic CMakeLists.txt syntax is preferable.

All the code examples in this chapter can be found in this book's GitHub repository: https://github.com/PacktPublishing/LLVM-Techniques-Tips-and-Best-Practices-Clang-and-Middle-End-Libraries/tree/main/Chapter02.

Exploring a glossary of LLVM's important CMake directives

LLVM switched from GNU Autoconf to CMake because CMake offers more flexibility in choosing the underlying build system. Since then, LLVM has come up with many custom CMake functions, macros, and rules to optimize its own usage. This section gives you an overview of the most important and frequently used ones, and shows how and when to use them.

Using the CMake function to add new libraries

Libraries are the building blocks of the LLVM framework. However, when writing CMakeLists.txt for a new library, you shouldn't use the normal add_library directive that appears in normal CMakeLists.txt files, as follows:

# In an in-tree CMakeLists.txt file…

add_library(MyLLVMPass SHARED

  MyPass.cpp) # Do NOT do this to add a new LLVM library

There are several drawbacks of using the vanilla add_library here, as follows:

As shown in Chapter 1, Saving Resources When Building LLVM, LLVM prefers to use a global CMake argument (that is, BUILD_SHARED_LIBS) to control whether all its component libraries should be built statically or dynamically. It's pretty hard to do that using the built-in directives.

Similar to the previous point, LLVM prefers to use global CMake arguments to control some compile flags, such as whether or not to enable Runtime Type Information (RTTI) and C++ exception handling in the code base.

By using custom CMake functions/macros, LLVM can create its own component system, which provides a higher level of abstraction for developers to designate build target dependencies in an easier way.

Therefore, you should always use the add_llvm_component_library CMake function shown here:

# In a CMakeLists.txt

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp)

Here, LLVMFancyOpt is the final library name and FancyOpt.cpp is the source file.

In regular CMake scripts, you can use target_link_libraries to designate a given target's library dependencies, and then use add_dependencies to assign dependencies among different build targets to create explicit build orderings. There is an easier way to do those tasks when you're using LLVM's custom CMake functions to create library targets.

By using the LINK_COMPONENTS argument in add_llvm_component_library (or add_llvm_library, which is the underlying implementation of the former one), you can designate the target's linked components:

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp

  LINK_COMPONENTS

  Analysis ScalarOpts)

Alternatively, you can do the same thing with the LLVM_LINK_COMPONENTS variable, which is defined before the function call:

set(LLVM_LINK_COMPONENTS

    Analysis ScalarOpts)

add_llvm_component_library(LLVMFancyOpt

   FancyOpt.cpp)

Component libraries are normal libraries that carry a special meaning within LLVM: they are the building blocks you can pick from when linking against LLVM. They're also included in the gigantic libLLVM library if you choose to build it. The component names are slightly different from the real library names. If you need the mapping from component names to library names, you can use the following CMake function:

llvm_map_components_to_libnames(output_lib_names

  <list of component names>)
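To see how the mapped names might be used, here is a minimal sketch that links a plain CMake target against LLVM components. The target name my-tool and the my_llvm_libs variable are hypothetical, and the sketch assumes that LLVM's CMake package has already been imported so that llvm_map_components_to_libnames is available:

```cmake
# Hypothetical plain CMake target (not created by an LLVM-specific function).
add_executable(my-tool main.cpp)

# Translate component names into the real library names
# (for example, the Support component maps to LLVMSupport).
llvm_map_components_to_libnames(my_llvm_libs Support Analysis)

# Link the target against the resolved library names.
target_link_libraries(my-tool PRIVATE ${my_llvm_libs})
```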

If you want to directly link against a normal library (the non-LLVM component one), you can use the LINK_LIBS argument:

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp

  LINK_LIBS

  ${BOOST_LIBRARY})

To assign general build target dependencies to a library target (equivalent to add_dependencies), you can use the DEPENDS argument:

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp

  DEPENDS

  intrinsics_gen)

intrinsics_gen is a common target representing the procedure of generating header files containing LLVM intrinsics definitions.
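The arguments shown above are not mutually exclusive. As a sketch, assuming the same hypothetical FancyOpt library and BOOST_LIBRARY variable used in the earlier snippets, a single call could combine all three:

```cmake
add_llvm_component_library(LLVMFancyOpt
  FancyOpt.cpp

  # LLVM components, referred to by component name
  LINK_COMPONENTS
  Analysis ScalarOpts

  # Ordinary (non-LLVM-component) libraries
  LINK_LIBS
  ${BOOST_LIBRARY}

  # Build targets that must be finished before this library is built
  DEPENDS
  intrinsics_gen)
```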

Adding one build target per folder

Many LLVM custom CMake functions have a pitfall that involves source file detection. Let's say you have a directory structure like this:

/FancyOpt

  |___ FancyOpt.cpp

  |___ AggressiveFancyOpt.cpp

  |___ CMakeLists.txt

Here, you have two source files, FancyOpt.cpp and AggressiveFancyOpt.cpp. As their names suggest, FancyOpt.cpp is the basic version of this optimization, while AggressiveFancyOpt.cpp is an alternative, more aggressive version of the same functionality. Naturally, you will want to split them into separate libraries so that users can choose if they wish to include the more aggressive one in their normal workload. So, you might write a CMakeLists.txt file like this:

# In /FancyOpt/CMakeLists.txt

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp)

add_llvm_component_library(LLVMAggressiveFancyOpt

  AggressiveFancyOpt.cpp)

Unfortunately, this would generate error messages telling you something to the effect of Found unknown source AggressiveFancyOpt.cpp … when processing the first add_llvm_component_library statement.

LLVM's build system enforces a stricter rule to make sure that all C/C++ source files in the same folder are added to the same library, executable, or plugin. To fix this, move one of the files into a separate folder, like so:

/FancyOpt

  |___ FancyOpt.cpp

  |___ CMakeLists.txt

  |___ /AggressiveFancyOpt

       |___ AggressiveFancyOpt.cpp

       |___ CMakeLists.txt

In /FancyOpt/CMakeLists.txt, we have the following:

add_llvm_component_library(LLVMFancyOpt

  FancyOpt.cpp)

add_subdirectory(AggressiveFancyOpt)

Finally, in /FancyOpt/AggressiveFancyOpt/CMakeLists.txt, we have the following:

add_llvm_component_library(LLVMAggressiveFancyOpt

  AggressiveFancyOpt.cpp)
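To build some intuition for the one-target-per-folder rule described above, the following is a simplified sketch of how such a check could be written in plain CMake. It is not LLVM's actual implementation, and the function name check_no_unlisted_sources is made up for this illustration:

```cmake
# Simplified illustration only; LLVM's real check differs in detail.
function(check_no_unlisted_sources)
  # Source files explicitly passed by the caller.
  set(listed_sources ${ARGN})

  # Every C/C++ source file sitting in the current folder,
  # expressed relative to that folder.
  file(GLOB found_sources
    RELATIVE ${CMAKE_CURRENT_SOURCE_DIR}
    ${CMAKE_CURRENT_SOURCE_DIR}/*.c
    ${CMAKE_CURRENT_SOURCE_DIR}/*.cpp)

  foreach(src ${found_sources})
    list(FIND listed_sources ${src} idx)
    if(idx LESS 0)
      # SEND_ERROR reports an error and continues processing,
      # but skips the generation step.
      message(SEND_ERROR "Found unknown source file ${src}")
    endif()
  endforeach()
endfunction()

# With FancyOpt.cpp and AggressiveFancyOpt.cpp in the same folder,
# this call would complain about AggressiveFancyOpt.cpp.
check_no_unlisted_sources(FancyOpt.cpp)
```

Seen this way, the error in the first layout is expected: the first target lists only one of the two source files found in its folder.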

These are the essentials of adding build targets for (component) libraries using LLVM's custom CMake directives. In the next two sections, we will show you how to add executable and Pass plugin build targets using a different set of LLVM-specific CMake directives.

Using the CMake function to add executables and tools

Similar to add_llvm_component_library, to add a new executable target, we can use add_llvm_executable or add_llvm_tool:

add_llvm_tool(myLittleTool

  MyLittleTool.cpp)

These two functions have the same syntax. However, only targets created by add_llvm_tool will be included in the installations. There is also a global CMake variable, LLVM_BUILD_TOOLS, that enables/disables those LLVM tool targets.

Both functions can also use the DEPENDS argument to assign dependencies, similar to add_llvm_library, which we introduced earlier. However, to designate the components to link, you can only use the LLVM_LINK_COMPONENTS variable.
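Putting these two points together, a tool's build file might look like the following sketch. The component list and the intrinsics_gen dependency are only illustrative choices for the hypothetical myLittleTool:

```cmake
# Components are designated through the variable, set before the call.
set(LLVM_LINK_COMPONENTS
  Support
  Analysis)

add_llvm_tool(myLittleTool
  MyLittleTool.cpp

  # General build target dependencies still go through DEPENDS.
  DEPENDS
  intrinsics_gen)
```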

Using the CMake function to add Pass plugins

While we will cover Pass plugin development later in this book, adding a build target for a Pass plugin is now easier than ever (earlier LLVM versions still used add_llvm_library with some special arguments). We can simply use the following command:

add_llvm_pass_plugin(MyPass

   HelloWorldPass.cpp)

The LINK_COMPONENTS, LINK_LIBS, and DEPENDS arguments are also available here, with the same usages and functionalities as in add_llvm_component_library.

These are some of the most common and important LLVM-specific CMake directives. Using these directives can not only make your CMake code more concise but also help synchronize it with LLVM's own build system, in case you want to do some in-tree development. In the next section, we will show you how to integrate LLVM into an out-of-tree CMake project, and leverage the knowledge we learned in this chapter.

In-tree versus out-of-tree development

In this book, in-tree development means contributing code directly to the LLVM project, such as fixing LLVM bugs or adding new features to the existing LLVM libraries. Out-of-tree development, on the other hand, either represents creating extensions for LLVM (writing an LLVM pass, for example) or using LLVM libraries in some other projects (using LLVM's code generation libraries to implement your own programming language, for example).

Understanding CMake integration for out-of-tree projects

Implementing your features in an in-tree project is good for prototyping, since most of the infrastructure is already there. However, there are many scenarios where pulling the entire LLVM source tree into your code base is not the best idea, compared to creating an out-of-tree project and linking it against the LLVM libraries. For example, suppose you only want to create a small code refactoring tool using LLVM's features and open source it on GitHub. Telling developers on GitHub to download a multi-gigabyte LLVM source tree along with your little tool might not be a pleasant experience.

There are at least two ways to configure out-of-tree projects to link against LLVM:

Using the llvm-config tool

Using LLVM's CMake modules

Both approaches help you sort out all the details, including header files and library paths. However, the latter creates more concise and readable CMake scripts, which is preferable for projects that are already using CMake. This section will show the essential steps of using LLVM's CMake modules to integrate it into an out-of-tree CMake project.
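For comparison, here is a rough sketch of what the llvm-config approach could look like when driven from CMake. It assumes llvm-config can be found on your PATH; the variable names are made up for illustration, and a real project would probably also need to query additional flags such as --system-libs:

```cmake
# Sketch only: locate the llvm-config executable.
find_program(LLVM_CONFIG_EXE llvm-config)
if(NOT LLVM_CONFIG_EXE)
  message(FATAL_ERROR "llvm-config not found")
endif()

# Ask llvm-config for the header directory.
execute_process(COMMAND ${LLVM_CONFIG_EXE} --includedir
  OUTPUT_VARIABLE llvm_include_dir
  OUTPUT_STRIP_TRAILING_WHITESPACE)

# Ask for the directory containing the LLVM libraries.
execute_process(COMMAND ${LLVM_CONFIG_EXE} --libdir
  OUTPUT_VARIABLE llvm_lib_dir
  OUTPUT_STRIP_TRAILING_WHITESPACE)

# Ask for the linker flags of the components we need.
execute_process(COMMAND ${LLVM_CONFIG_EXE} --libs support analysis
  OUTPUT_VARIABLE llvm_libs
  OUTPUT_STRIP_TRAILING_WHITESPACE)
# Turn the space-separated output into a CMake list.
separate_arguments(llvm_libs)

include_directories(${llvm_include_dir})
link_directories(${llvm_lib_dir})

add_executable(magic-cli main.cpp)
target_link_libraries(magic-cli PRIVATE ${llvm_libs})
```

Every piece of information has to be fetched and parsed by hand here, which is the main reason the CMake module approach tends to read better.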

First, we need to prepare an out-of-tree (C/C++) CMake project. The core CMake functions/macros we discussed in the previous section will help us work our way through this. Let's look at our steps:

We are assuming that you already have the following CMakeLists.txt skeleton for a project that needs to be linked against LLVM libraries:

project(MagicCLITool)
set(SOURCE_FILES
    main.cpp)
add_executable(magic-cli
  ${SOURCE_FILES})

Regardless of whether your project generates an executable, like the one in the preceding code block, or other artifacts such as libraries or even LLVM Pass plugins, the biggest question now is how to obtain LLVM's include path and library path.

To resolve the include path and the library path, LLVM provides a standard CMake package interface, so you can use the find_package CMake directive to import its configuration, as follows:

project(MagicCLITool)
find_package(LLVM REQUIRED CONFIG)
include_directories(${LLVM_INCLUDE_DIRS})
link_directories(${LLVM_LIBRARY_DIRS})

To make the find_package trick work, you need to supply the LLVM_DIR CMake variable while invoking the CMake command for this project:

$ cmake -DLLVM_DIR=<LLVM install path>/lib/cmake/llvm …

Make sure it points to the lib/cmake/llvm subdirectory under the LLVM install path.
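To confirm which LLVM installation was actually picked up, it can help to print a couple of the variables that the package exports. The following is a small sketch along the lines of the pattern in LLVM's own CMake documentation. LLVM_PACKAGE_VERSION and LLVM_DIR are set by find_package, and the message texts are arbitrary.

```cmake
find_package(LLVM REQUIRED CONFIG)

# Report the LLVM version and the location of the LLVMConfig.cmake in use.
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")
```

If the reported directory is not the installation you intended, check the value passed to LLVM_DIR.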

After resolving the include and library paths, it's time to link the main executable against LLVM's libraries. LLVM's custom CMake functions (for example, add_llvm_executable) are really useful here. But first, CMake needs to be able to find those functions.

The following snippet imports LLVM's CMake module (more specifically, the AddLLVM CMake module), which contains those LLVM-specific functions/macros that we introduced in the previous section:

find_package(LLVM REQUIRED CONFIG)
list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
include(AddLLVM)

The following snippet adds the executable build target using the CMake function we learned about in the previous section:

find_package(LLVM REQUIRED CONFIG)
include(AddLLVM)
set(LLVM_LINK_COMPONENTS
  Support
  Analysis)
add_llvm_executable(magic-cli
  main.cpp)
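If you would rather keep the plain add_executable directive, the same components can be linked without AddLLVM. The following sketch assumes the llvm_map_components_to_libnames function, which the LLVM CMake package makes available after find_package. The MAGIC_LLVM_LIBS variable name is hypothetical.

```cmake
find_package(LLVM REQUIRED CONFIG)
include_directories(${LLVM_INCLUDE_DIRS})

add_executable(magic-cli
  main.cpp)

# Translate LLVM component names into the library names to link against.
llvm_map_components_to_libnames(MAGIC_LLVM_LIBS support analysis)
target_link_libraries(magic-cli PRIVATE ${MAGIC_LLVM_LIBS})
```

This variant is slightly longer, but it avoids depending on LLVM's own target-creation functions.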

Adding the library target makes no difference:

find_package(LLVM REQUIRED CONFIG)
include(AddLLVM)
add_llvm_library(MyMagicLibrary
  lib.cpp
  LINK_COMPONENTS
  Support Analysis)

Finally, add the LLVM Pass plugin:

find_package(LLVM REQUIRED CONFIG)
include(AddLLVM)
add_llvm_pass_plugin(MyMagicPass
  ThePass.cpp)

In practice, you also need to pay attention to LLVM-specific definitions and the RTTI setting:

find_package(LLVM REQUIRED CONFIG)
add_definitions(${LLVM_DEFINITIONS})
# Test the variable by name so the check still parses if it is unset
if(NOT LLVM_ENABLE_RTTI)
  # For non-MSVC compilers
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-rtti")
endif()
# add_llvm_xxx stands for any of the add_llvm_* functions shown earlier
add_llvm_xxx(source.cpp)

This is especially true for the RTTI part because, by default, LLVM is built without RTTI support, whereas ordinary C++ applications have it enabled. If the RTTI settings of your code and LLVM's libraries do not match, the build will fail, typically at link time with undefined references to typeinfo symbols.
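Putting the preceding steps together, a complete CMakeLists.txt for the executable might look like the following sketch. It makes a few assumptions that are not from the original text: the minimum CMake version is arbitrary, the /GR- branch is added to cover MSVC, and the separate_arguments call follows the pattern in LLVM's CMake documentation for splitting LLVM_DEFINITIONS into a list (LLVM_DEFINITIONS_LIST is simply a variable name).

```cmake
cmake_minimum_required(VERSION 3.13)
project(MagicCLITool CXX)

# Import LLVM's package configuration (requires -DLLVM_DIR=... at configure time).
find_package(LLVM REQUIRED CONFIG)

# Make LLVM's CMake modules visible and load the add_llvm_* functions.
list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})
include(AddLLVM)

include_directories(${LLVM_INCLUDE_DIRS})
link_directories(${LLVM_LIBRARY_DIRS})

# LLVM_DEFINITIONS is a space-separated string; split it before use.
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})

# Match LLVM's RTTI setting to avoid mismatches between our code and LLVM.
if(NOT LLVM_ENABLE_RTTI)
  if(MSVC)
    add_compile_options(/GR-)
  else()
    add_compile_options(-fno-rtti)
  endif()
endif()

set(LLVM_LINK_COMPONENTS
  Support
  Analysis)
add_llvm_executable(magic-cli
  main.cpp)
```

The same layout applies to library and Pass plugin targets: only the final add_llvm_* call changes.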

Despite the convenience of developing inside LLVM's source tree, sometimes, enclosing the entire LLVM source in your project might not be feasible. So, instead, we must create an out-of-tree project and integrate LLVM as a library. This section showed you how to integrate LLVM into your CMake-based out-of-tree projects and make good use of the LLVM-specific CMake directives we learned about in the Exploring a glossary of LLVM's important CMake directives section.

Summary

This chapter dug deeper into LLVM's CMake build system. We saw how to use LLVM's own CMake directives to write concise and effective build scripts, for both in-tree and out-of-tree development. Learning these CMake skills can make your LLVM development more efficient and provide you with more options to engage LLVM features with other existing code bases or custom logic.

In the next chapter, we will introduce another important piece of infrastructure in the LLVM project, LLVM LIT, which is an easy-to-use yet general framework for running various kinds of tests.

Chapter 3: Testing with LLVM LIT

In the previous chapter, we learned how to take advantage of LLVM's own CMake utilities to improve our development experience. We also learned how to seamlessly integrate LLVM into other out-of-tree projects. In this chapter, we're going to talk about how to get hands-on with LLVM's own testing infrastructure, LIT.

LIT is a testing infrastructure that was originally developed for running LLVM's regression tests. Now, it's not only the harness for running all the tests in LLVM (both unit and regression tests)