LLVM Cookbook - Mayur Pandey - E-Book

LLVM Cookbook E-Book

Mayur Pandey

Description

The book is for compiler programmers who are familiar with concepts of compilers and want to indulge in understanding, exploring, and using LLVM infrastructure in a meaningful way in their work.
This book is also for programmers who are not directly involved in compiler projects but are often involved in development phases where they write thousands of lines of code. With knowledge of how compilers work, they will be able to code in an optimal way and improve performance with clean code.

You can read the e-book in the Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Number of pages: 273

Year of publication: 2015




Table of Contents

LLVM Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. LLVM Design and Use
Introduction
Understanding modular design
Getting ready
How to do it...
How it works...
There's more...
See also
Cross-compiling Clang/LLVM
Getting ready
How to do it...
How it works...
Converting a C source code to LLVM assembly
Getting ready
How to do it...
How it works...
See also
Converting IR to LLVM bitcode
Getting Ready
How to do it...
How it works...
There's more...
See also
Converting LLVM bitcode to target machine assembly
Getting ready
How to do it...
How it works...
There's more...
Converting LLVM bitcode back to LLVM assembly
Getting ready
How to do it...
How it works...
Transforming LLVM IR
Getting ready
How to do it...
How it works...
There's more...
Linking LLVM bitcode
Getting ready
How to do it...
How it works...
Executing LLVM bitcode
Getting ready
How to do it...
How it works...
See also
Using the C frontend Clang
Getting ready
How to do it…
How it works...
See also
Using the GO frontend
Getting ready
How to do it…
How it works…
See also
Using DragonEgg
Getting ready
How to do It…
See also
2. Steps in Writing a Frontend
Introduction
Defining a TOY language
How to do it…
Implementing a lexer
Getting ready
How to do it…
How it works…
See also
Defining Abstract Syntax Tree
Getting ready
How to do it…
How it works…
See also
Implementing a parser
Getting ready
How to do it…
How it works…
See also
Parsing simple expressions
Getting ready
How to do it…
How it works…
Parsing binary expressions
Getting ready
How to do it…
See also
Invoking a driver for parsing
How to do it…
How it works…
See also
Running lexer and parser on our TOY language
Getting ready
How to do it…
How it works…
See also
Defining IR code generation methods for each AST class
Getting ready
How to do it…
How it works…
Generating IR code for expressions
How to do it…
See also
Generating IR code for functions
How to do it…
How it works…
See also
Adding IR optimization support
How to do it…
See also
3. Extending the Frontend and Adding JIT Support
Introduction
Handling decision making paradigms – if/then/else constructs
Getting ready
How to do it...
How it works…
See also
Generating code for loops
Getting ready
How to do it...
How it works...
See also
Handling user-defined operators – binary operators
Getting ready
How to do it...
How it works...
See also
Handling user-defined operators – unary operators
Getting ready
How to do it...
How it works...
See also
Adding JIT support
How to do it...
How it works…
4. Preparing Optimizations
Introduction
Various levels of optimization
Getting ready...
How to do it…
How it works…
See Also
Writing your own LLVM pass
Getting ready
How to do it…
How it works
See also
Running your own pass with the opt tool
How to do it…
How it works…
See also
Using another pass in a new pass
Getting ready
How to do it…
How it works…
There's more…
Registering a pass with pass manager
Getting ready
How to do it…
How it works…
See Also
Writing an analysis pass
Getting ready
How to do it…
How it works…
Writing an alias analysis pass
Getting ready
How to do it...
How it works…
See also
Using other analysis passes
Getting ready…
How to do it…
How it works…
See also
5. Implementing Optimizations
Introduction
Writing a dead code elimination pass
Getting ready
How to do it…
How it works…
See also
Writing an inlining transformation pass
Getting ready
How to do it…
How it works...
Writing a pass for memory optimization
Getting ready
How to do it…
How it works…
See also
Combining LLVM IR
Getting started
How to do it…
How it works…
See also
Transforming and optimizing loops
Getting ready
How to do it…
How it works…
Reassociating expressions
Getting Ready
How to do it…
How it works …
Vectorizing IR
Getting ready
How to do it...
How it works…
See also…
Other optimization passes
Getting ready…
How to do it…
How it works…
See also
6. Target-independent Code Generator
Introduction
The life of an LLVM IR instruction
C Code to LLVM IR
IR optimization
LLVM IR to SelectionDAG
SelectionDAG legalization
Conversion from target-independent DAG to machine DAG
Scheduling instructions
Register allocation
Code emission
Visualizing LLVM IR CFG using GraphViz
Getting ready
How to do it…
See also
Describing targets using TableGen
Getting ready
How to do it
How it works
See also
Defining an instruction set
Getting ready
How to do it…
How it works…
See also
Adding a machine code descriptor
How it's done…
How it works…
Implementing the MachineInstrBuilder class
How to do it…
How it works…
Implementing the MachineBasicBlock class
How to do it…
How it works…
See also
Implementing the MachineFunction class
How to do it…
How it works…
See also
Writing an instruction selector
How to do it…
How it works…
Legalizing SelectionDAG
How to do it…
How it works…
Optimizing SelectionDAG
How to do it…
How it works…
See also
Selecting instruction from the DAG
How to do it…
How it works…
See also
Scheduling instructions in SelectionDAG
How to do it…
How it works…
See also
7. Optimizing the Machine Code
Introduction
Eliminating common subexpression from machine code
How to do it…
How it works…
See more
Analyzing live intervals
Getting ready
How to do it…
How it works…
See also
Allocating registers
Getting ready
How to do it…
How it works…
See also
Inserting the prologue-epilogue code
How to do it…
How it works…
Code emission
How to do it…
Tail call optimization
Getting ready
How to do it…
How it works…
Sibling call optimisation
Getting ready
How to do it…
How it works…
8. Writing an LLVM Backend
Introduction
A sample backend
Defining registers and registers sets
Getting ready
How to do it…
How it works…
See also
Defining the calling convention
How to do it…
How it works…
See also
Defining the instruction set
How to do it…
How it works…
See also
Implementing frame lowering
Getting ready
How to do it…
How it works…
See also
Printing an instruction
Getting ready
How to do it…
How it works…
Selecting an instruction
Getting ready
How to do it…
How it works…
See also
Adding instruction encoding
How to do it…
How it works…
See also
Supporting a subtarget
How to do it…
See also
Lowering to multiple instructions
How to do it…
How it works…
See also
Registering a target
How to do it…
How it works…
See also
9. Using LLVM for Various Useful Projects
Introduction
Exception handling in LLVM
Getting ready...
How to do it…
How it works…
See also
Using sanitizers
Getting ready
How to do it…
How it works…
See also…
Writing the garbage collector with LLVM
Getting ready
How to do it…
How it works…
See also
Converting LLVM IR to JavaScript
Getting ready
How to do it…
See more
Using the Clang Static Analyzer
Getting ready
How to do it…
How it works…
See also
Using bugpoint
Getting ready
How to do it…
How it works…
See also
Using LLDB
Getting ready
How to do it…
See also
Using LLVM utility passes
Getting ready
How to do it...
See also
Index

LLVM Cookbook

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: May 2015

Production reference: 1270515

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78528-598-1

www.packtpub.com

Credits

Authors

Mayur Pandey

Suyog Sarda

Reviewers

Logan Chien

Michael Haidl

Dave (Jing) Tian

Commissioning Editor

Nadeem N. Bagban

Acquisition Editor

Vivek Anantharaman

Content Development Editor

Shweta Pant

Technical Editors

Prajakta Mhatre

Rohith Rajan

Rupali Shrawane

Copy Editors

Vikrant Phadke

Sameen Siddiqui

Project Coordinator

Shipra Chawhan

Proofreader

Stephen Copestake

Safis Editing

Indexer

Tejal Soni

Graphics

Disha Haria

Production Coordinator

Melwyn D'sa

Cover Work

Melwyn D'sa

About the Authors

Mayur Pandey is a professional software engineer and an open source enthusiast. He focuses on compiler development and compiler tools. He is an active contributor to the LLVM open source community. He has been part of the compiler team for the Tizen project, and has hands-on experience with other proprietary compilers.

Mayur earned a bachelor's degree in information technology from Motilal Nehru National Institute of Technology Allahabad, India. Currently, he lives in Bengaluru, India.

I would like to thank my family and friends. They made it possible for me to complete the book by taking care of my other commitments and always encouraging me.

Suyog Sarda is a professional software engineer and an open source enthusiast. He focuses on compiler development and compiler tools. He is an active contributor to the LLVM open source community. He has been part of the compiler team for the Tizen project. Suyog was also involved in code performance improvements for the ARM and the x86 architecture. He has hands-on experience in other proprietary compilers. His interest in compiler development lies more in code optimization and vectorization.

Apart from compilers, Suyog is also interested in Linux kernel development. He has published a technical paper titled Secure Co-resident Virtualization in Multicore Systems by VM Pinning and Page Coloring at the IEEE Proceedings of the 2012 International Conference on Cloud Computing, Technologies, Applications, and Management at Birla Institute of Technology, Dubai. He earned a bachelor's degree in computer technology from College of Engineering, Pune, India. Currently, he lives in Bengaluru, India.

I would like to thank my family and friends. I would also like to thank the LLVM open-source community for always being helpful.

About the Reviewers

Logan Chien received his master's degree in computer science from National Taiwan University. His research interests include compiler design, compiler optimization, and virtual machines. He is a full-time software engineer. In his free time, he works on several open source projects, such as LLVM and Android. Logan has participated in the LLVM project since 2012.

Michael Haidl is a high performance computing engineer with focus on many core architectures that consist of Graphics Processing Units (GPUs) and Intel Xeon Phi accelerators. He has been a C++ developer for more than 14 years, and has gained many skills in parallel programming, exploiting various programming models (CUDA) over the years. He has a diploma in computer science and physics. Currently, Michael is employed as a research associate at the University of Münster, Germany, and is writing his PhD thesis with focus on compilation techniques for GPUs utilizing the LLVM infrastructure.

I would like to thank my wife for supporting me every day with her smiles and love. I would also like to thank the entire LLVM community for all the hard work they have put into LLVM/Clang and other LLVM projects. It is amazing to see how fast LLVM evolves.

Dave (Jing) Tian is a graduate research fellow and PhD student in the Department of Computer & Information Science & Engineering (CISE) at the University of Florida. He is a founding member of the SENSEI center. His research direction involves system security, embedded system security, trusted computing, static code analysis for security, and virtualization. He is interested in Linux kernel hacking and compiler hacking.

Dave spent a year on AI and machine learning, and taught Python and operating systems at the University of Oregon. Before that, he worked as a software developer in the LCP (Linux control platform) group in research and development at Alcatel-Lucent (formerly Lucent Technologies), for approximately 4 years. He holds a bachelor's degree in science and a master's degree in electronics engineering in China. You can reach him at <[email protected]> and visit his website http://davejingtian.org.

I would like to thank the author of this book, who has done a good job. Thanks to the editors of the book at Packt Publishing, who made this book perfect and offered me the opportunity to review such a nice book.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why Subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Preface

Most programmers come across compilers at some point or other. Simply speaking, a compiler converts a human-readable, high-level language into machine-executable code. But have you ever wondered what goes on under the hood? A compiler does a lot of processing before emitting optimized machine code, and many complex algorithms are involved in writing a good compiler.

This book travels through all the phases of compilation: frontend processing, code optimization, code emission, and so on. And to make this journey easy, LLVM is the simplest compiler infrastructure to study. It's a modular, layered compiler infrastructure where every phase is dished out as a separate recipe. Written in object-oriented C++, LLVM gives programmers a simple interface and lots of APIs to write their own compiler.

As authors, we maintain that simple solutions frequently work better than complex solutions; throughout this book, we'll look at a variety of recipes that will help develop your skills, make you consider all the compiling options, and understand that there is more to simply compiling code than meets the eye.

We also believe that programmers who are not involved in compiler development will benefit from this book, as knowledge of compiler implementation will help them code optimally next time they write code.

We hope you will find the recipes in this book delicious, and after tasting all the recipes, you will be able to prepare your own dish of compilers. Feeling hungry? Let's jump into the recipes!

What this book covers

Chapter 1, LLVM Design and Use, introduces the modular world of LLVM infrastructure, where you learn how to download and install LLVM and Clang. In this chapter, we play with some examples to get accustomed to the workings of LLVM. We also see some examples of various frontends.

Chapter 2, Steps in Writing a Frontend, explains the steps to write a frontend for a language. We will write a bare-metal toy compiler frontend for a basic toy language. We will also see how a frontend language can be converted into the LLVM intermediate representation (IR).

Chapter 3, Extending the Frontend and Adding JIT Support, explores the more advanced features of the toy language and the addition of JIT support to the frontend. We implement some powerful features of a language that are found in most modern programming languages.

Chapter 4, Preparing Optimizations, takes a look at the pass infrastructure of the LLVM IR. We explore the various optimization levels and the optimization techniques that kick in at each level. We also see a step-by-step approach to writing our own LLVM pass.

Chapter 5, Implementing Optimizations, demonstrates how we can implement various common optimization passes on LLVM IR. We also explore some vectorization techniques that are not yet present in the LLVM open source code.

Chapter 6, Target-independent Code Generator, takes us on a journey through the abstract infrastructure of a target-independent code generator. We explore how LLVM IR is converted to Selection DAGs, which are further processed to emit target machine code.

Chapter 7, Optimizing the Machine Code, examines how Selection DAGs are optimized and how target registers are allocated to variables. This chapter also describes various optimization techniques on Selection DAGs as well as various register allocation techniques.

Chapter 8, Writing an LLVM Backend, takes us on a journey of describing a target architecture. This chapter covers how to describe registers, instruction sets, calling conventions, encoding, subtarget features, and so on.

Chapter 9, Using LLVM for Various Useful Projects, explores various other projects where LLVM IR infrastructure can be used. Remember that LLVM is not just a compiler; it is a compiler infrastructure. This chapter explores various projects that can be applied to a code snippet to get useful information from it.

What you need for this book

All you need to work through most of the examples covered in this book is a Linux machine, preferably Ubuntu. You will also need a simple text or code editor, Internet access, and a browser. We recommend installing the meld tool for comparison of two files; it works well on the Linux platform.

Who this book is for

The book is for compiler programmers who are familiar with concepts of compilers and want to indulge in understanding, exploring, and using LLVM infrastructure in a meaningful way in their work.

This book is also for programmers who are not directly involved in compiler projects but are often involved in development phases where they write thousands of lines of code. With knowledge of how compilers work, they will be able to code in an optimal way and improve performance with clean code.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also).

To give clear instructions on how to complete a recipe, we use these sections.

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make you more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "We can include other contexts through the use of the include directive."

A block of code is set as follows:

primary := identifier_expr
:=numeric_expr
:=paran_expr

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

primary := identifier_expr
:=numeric_expr
:=paran_expr

Any command-line input or output is written as follows:

$ cat testfile.ll

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Clicking on the Next button moves you to the next screen."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/5981OS_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. LLVM Design and Use

In this chapter, we will cover the following topics:

Understanding modular design
Cross-compiling Clang/LLVM
Converting a C source code to LLVM assembly
Converting IR to LLVM bitcode
Converting LLVM bitcode to target machine assembly
Converting LLVM bitcode back to LLVM assembly
Transforming LLVM IR
Linking LLVM bitcode
Executing LLVM bitcode
Using C frontend Clang
Using the GO frontend
Using DragonEgg

Introduction

In this chapter, you get to know LLVM, its design, and how to make use of the various tools it provides. You will see how to transform simple C code into the LLVM intermediate representation and how to convert that representation into various other forms. You will also learn how the code is organized within the LLVM source tree and how you can use it to write a compiler of your own later.

Cross-compiling Clang/LLVM

By cross-compiling we mean building a binary on one platform (for example, x86) that will be run on another platform (for example, ARM). The machine on which we build the binary is called the host, and the machine on which the generated binary will run is called the target. A compiler that builds code for the same platform on which it is running (the host and target platforms are the same) is called a native compiler, whereas a compiler that builds code for a target platform different from the host platform is called a cross-compiler.
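
The host/target distinction can be sketched as a small shell helper. This is only an illustration: the function name build_kind and the platform strings passed to it are assumptions made for this example, not something LLVM provides.

```shell
#!/bin/sh
# Classify a build as native or cross by comparing host and target platforms.
# build_kind is a hypothetical helper written for this illustration.
build_kind() {
    host="$1"     # platform on which the compiler runs
    target="$2"   # platform on which the generated binary will run
    if [ "$host" = "$target" ]; then
        echo "native"
    else
        echo "cross"
    fi
}

build_kind x86_64-linux-gnu x86_64-linux-gnu      # prints "native"
build_kind x86_64-linux-gnu arm-linux-gnueabihf   # prints "cross"
```

The second call corresponds to the situation in this recipe: an x86_64 host producing binaries for an ARM target.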

This recipe shows how to cross-compile LLVM for a platform different from the host platform, so that you can use the built binaries on the required target platform. The example cross-compiles from an x86_64 host platform for an ARM target platform. The binaries thus generated can be used on a platform with the ARM architecture.

Getting ready

The following packages need to be installed on your system (host platform):

cmake
ninja-build (from backports in Ubuntu)
gcc-4.x-arm-linux-gnueabihf
gcc-4.x-multilib-arm-linux-gnueabihf
binutils-arm-linux-gnueabihf
libgcc1-armhf-cross
libsfgcc1-armhf-cross
libstdc++6-armhf-cross
libstdc++6-4.x-dev-armhf-cross
Install LLVM on your host platform

How to do it...

To compile for the ARM target from the host architecture (x86_64 here), perform the following steps:

Add the following cmake flags to the normal cmake build for LLVM:
-DCMAKE_CROSSCOMPILING=True
-DCMAKE_INSTALL_PREFIX=<path-where-you-want-the-toolchain> (optional)
-DLLVM_TABLEGEN=<path-to-host-installed-llvm-toolchain-bin>/llvm-tblgen
-DCLANG_TABLEGEN=<path-to-host-installed-llvm-toolchain-bin>/clang-tblgen
-DLLVM_DEFAULT_TARGET_TRIPLE=arm-linux-gnueabihf
-DLLVM_TARGET_ARCH=ARM
-DLLVM_TARGETS_TO_BUILD=ARM
-DCMAKE_CXX_FLAGS='-target armv7a-linux-gnueabihf -mcpu=cortex-a9 -I/usr/arm-linux-gnueabihf/include/c++/4.x.x/arm-linux-gnueabihf/ -I/usr/arm-linux-gnueabihf/include/ -mfloat-abi=hard -ccc-gcc-name arm-linux-gnueabihf-gcc'
If using your platform compiler, run:
$ cmake -G Ninja <llvm-source-dir> <options above>

If using Clang as the cross-compiler, run:

$ CC='clang' CXX='clang++' cmake -G Ninja <source-dir> <options above>

If you have clang/clang++ on the path, it should work fine.

To build LLVM, simply type:
$ ninja
After the LLVM/Clang has built successfully, install it with the following command:
$ ninja install

This will create a sysroot at the install-dir location if you have specified the -DCMAKE_INSTALL_PREFIX option.
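
The configure step can be sketched as a small dry-run script that assembles the argument list and prints it instead of invoking cmake. The function name cross_cmake_args, the variables HOST_LLVM_BIN and LLVM_SRC, and their default paths are assumptions made for this illustration; point them at your own host toolchain and source tree. The -DCMAKE_CXX_FLAGS and -DCMAKE_INSTALL_PREFIX options are left out here to keep the sketch short.

```shell
#!/bin/sh
# Hypothetical locations; adjust them to your machine.
HOST_LLVM_BIN="${HOST_LLVM_BIN:-/usr/local/bin}"   # host-installed llvm-tblgen and clang-tblgen
LLVM_SRC="${LLVM_SRC:-../llvm}"                    # LLVM source directory

# Print the cmake arguments for an ARM cross build, one per line.
cross_cmake_args() {
    set -- \
        -G Ninja "$LLVM_SRC" \
        -DCMAKE_CROSSCOMPILING=True \
        "-DLLVM_TABLEGEN=$HOST_LLVM_BIN/llvm-tblgen" \
        "-DCLANG_TABLEGEN=$HOST_LLVM_BIN/clang-tblgen" \
        -DLLVM_DEFAULT_TARGET_TRIPLE=arm-linux-gnueabihf \
        -DLLVM_TARGET_ARCH=ARM \
        -DLLVM_TARGETS_TO_BUILD=ARM
    printf '%s\n' "$@"
}

# Dry run: show what would be passed to cmake.
cross_cmake_args
```

To configure for real, replace the printf line with cmake "$@" (prefixed with CC='clang' CXX='clang++' when Clang is the cross-compiler), then run ninja and ninja install as described in the steps above.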

How it works...

The cmake package builds the toolchain for the required platform by making use of the option flags passed to cmake, and the tblgen