LLVM Essentials - Mayur Pandey - E-Book

LLVM Essentials E-Book

Mayur Pandey

0,0
20,39 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.

Mehr erfahren.
Beschreibung

LLVM is currently the point of interest for many firms, and has a very active open source community. It provides us with a compiler infrastructure that can be used to write a compiler for a language. It provides us with a set of reusable libraries that can be used to optimize code, and a target-independent code generator to generate code for different backends. It also provides us with a lot of other utility tools that can be easily integrated into compiler projects.

This book details how you can use the LLVM compiler infrastructure libraries effectively, and will enable you to design your own custom compiler with LLVM in a snap.
We start with the basics, where you’ll get to know all about LLVM. We then cover how you can use LLVM library calls to emit intermediate representation (IR) of simple and complex high-level language paradigms. Moving on, we show you how to implement optimizations at different levels, write an optimization pass, generate code that is independent of a target, and then map the code generated to a backend. The book also walks you through CLANG, IR to IR transformations, advanced IR block transformations, and target machines.
By the end of this book, you’ll be able to easily utilize the LLVM libraries in your own projects.

Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:

EPUB
MOBI

Seitenzahl: 166

Veröffentlichungsjahr: 2015

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

LLVM Essentials
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Playing with LLVM
Modular design and collection of libraries
Getting familiar with LLVM IR
LLVM tools and using them in the command line
Summary
2. Building LLVM IR
Creating an LLVM module
Emitting a function in a module
Adding a block to a function
Emitting a global variable
Emitting a return statement
Emitting function arguments
Emitting a simple arithmetic statement in a basic block
Emitting if-else condition IR
Emitting LLVM IR for loop
Summary
3. Advanced LLVM IR
Memory access operations
Getting the address of an element
Reading from the memory
Writing into a memory location
Inserting a scalar into a vector
Extracting a scalar from a vector
Summary
4. Basic IR Transformations
Opt Tool
Pass and Pass Manager
Using other Pass info in current Pass
AnalysisUsage::addRequired<> method
AnalysisUsage:addRequiredTransitive<> method
AnalysisUsage::addPreserved<> method
Instruction simplification example
Instruction Combining
Summary
5. Advanced IR Block Transformations
Loop processing
Scalar evolution
LLVM intrinsics
Vectorization
Summary
6. IR to Selection DAG phase
Converting IR to selectionDAG
Legalizing SelectionDAG
Optimizing SelectionDAG
Instruction Selection
Scheduling and emitting machine instructions
Register allocation
Code Emission
Summary
7. Generating Code for Target Architecture
Sample backend
Defining registers and register sets
Defining the calling convention
Defining the instruction set
Implementing frame lowering
Lowering instructions
Printing an instruction
Summary
Index

LLVM Essentials

LLVM Essentials

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: December 2015

Production reference: 1021215

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78528-080-1

www.packtpub.com

Credits

Authors

Suyog Sarda

Mayur Pandey

Reviewer

Renato Golin

Commissioning Editor

Nadeem Bagban

Acquisition Editor

Harsha Bharwani

Content Development Editor

Priyanka Mehta

Technical Editor

Ryan Kochery

Copy Editor

Imon Biswas

Project Coordinator

Izzat Contractor

Proofreader

Safis Editing

Indexer

Tejal Daruwale Soni

Production Coordinator

Aparna Bhagat

Cover Work

Aparna Bhagat

About the Authors

Suyog Sarda is a professional software engineer and an open source enthusiast. He focuses on compiler development and compiler tools. He is an active contributor to the LLVM open source community. Suyog was also involved in code performance improvements for the ARM and X86 architectures. He has been a part of the compiler team for the Tizen project. His interest in compiler development lies more in code optimization and vectorization.

Previously, he has authored a book on LLVM, titled LLVM Cookbook by Packt Publishing.

Apart from compilers, Suyog is also interested in Linux Kernel Development. He has published a technical paper titled Secure Co-resident Virtualization in Multicore Systems by VM Pinning and Page Coloring at the IEEE Proceedings of the 2012 International Conference on Cloud Computing, Technologies, Applications, and Management at the Birla Institute of Technology, Dubai. He has earned a bachelor's degree in computer technology from the College of Engineering, Pune, India.

I would like to thank my family and friends for encouraging me to write this book. I am thankful to my co-author and reviewers who did a tremendous job of refining the contents. I would also like to thank the LLVM open source community for always being helpful. It has been a great experience to interact with the community. It is amazing to see how fast LLVM has evolved.

Mayur Pandey is a professional software engineer and open source enthusiast focused on compiler development and tools. He is an active contributor to the LLVM open source community. He has been a part of the compiler team for the Tizen project and has hands-on experience of other proprietary compilers.

He has earned a bachelor's degree in Information Technology from Motilal Nehru National Institute of Technology, Allahabad, India. Currently, he lives in Bengaluru, India.

I would like to thank my family and friends who made it possible for me to complete the book by taking care of my other commitments, and who have always being encouraging.

About the Reviewer

Renato Golin has worked with toolchains since 2008, developing debuggers and compilers for multiple languages and platforms, and has also been LLVM Tech Lead at ARM and Linaro, focusing on code generation, correctness, performance, and providing a complete toolchain solution based on LLVM for the diverse ARM platforms.

Before that, he spent a decade moving between web back-ends, databases, distributed systems, big data and bioinformatics, always working on and with open source projects.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by PacktCopy and paste, print, and bookmark contentOn demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Preface

LLVM is one of the very hot topics in recent times. It is an open source project with an ever-increasing number of contributors. Every programmer comes across a compiler at some point or the other while programming. Simply speaking, a compiler converts a high-level language to machine-executable code. However, what goes on under the hood is a lot of complex algorithms at work. So, to get started with compiler, LLVM will be the simplest infrastructure to study. Written in object-oriented C++, modular in design, and with concepts that are very easy to map to theory, LLVM proves to be attractive for experienced compiler programmers and for novice students who are willing to learn.

As authors, we maintain that simple solutions frequently work better and are easier to grasp than complex solutions. Throughout the book we will look at various topics that will help you enhance your skills and drive you to learn more.

We also believe that this book will be helpful for people not directly involved in compiler development as knowledge of compiler development will help them write code optimally.

What this book covers

Chapter 1, Playing with LLVM, introduces you to the modular design of LLVM and LLVM Intermediate Representation. In this chapter, we also look into some of the tools that LLVM provides.

Chapter 2, Building LLVM IR, introduces you to some basic function calls provided by the LLVM infrastructure to build LLVM IR. This chapter demonstrates building of modules, functions, basic blocks, condition statements, and loops using LLVM APIs.

Chapter 3, Advanced LLVM IR, introduces you to some advanced IR paradigms. This chapter explains advanced IR to the readers and shows how LLVM function calls can be used to emit them in the IR.

Chapter 4, Basic IR Transformations, deals with basic transformation optimizations at the IR level using the LLVM optimizer tool opt and the LLVM Pass infrastructure. You will learn how to use the information of one pass in another and then look into Instruction Simplification and Instruction Combining Passes.

Chapter 5, Advanced IR Block Transformations, deals with optimizations at block level on IR. We will discuss various optimizations such as Loop Optimizations, Scalar Evolution, Vectorization, and so on, followed by the summary of this chapter.

Chapter 6, IR to Selection DAG phase, takes you on a journey through the abstract infrastructure of a target-independent code generator. We explore how LLVM IR is converted to Selection DAG and various phases thereafter. It also introduces you to instruction selection, scheduling, register allocation, and so on.

Chapter 7, Generating Code for Target Architecture, introduces the readers to the tablegen concept. It shows how target architecture specifications such as register sets, instruction sets, calling conventions, and so on can be represented using tablegen, and how the output of tablegen can be used to emit code for a given architecture. This chapter can be used by readers as a reference for bootstrapping a target machine code generator.

What you need for this book

All you need to work through most of the examples covered in this book is a Linux machine, preferably Ubuntu. You will also need a simple text or code editor, Internet access, and a browser. We recommend installing the meld tool to compare two files; it works well on the Linux platform.

Who this book is for

This book is intended for those who already know some of the concepts concerning compilers and want to quickly become familiar with LLVM's infrastructure and the rich set of libraries that it provides. Compiler programmers, who are familiar with concepts of compilers and want to indulge in understanding, exploring, and using the LLVM infrastructure in a meaningful way in their work, will find this book useful.

This book is also for programmers who are not directly involved in compiler projects but are often involved in development phases where they write thousands of lines of code. With knowledge of how compilers work, they will be able to code in an optimal way and improve performance with clean code.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Playing with LLVM

The LLVM Compiler infrastructure project, started in 2000 in University of Illinois, was originally a research project to provide modern, SSA based compilation technique for arbitrary static and dynamic programming languages. Now it has grown to be an umbrella project with many sub projects within it, providing a set of reusable libraries having well defined interfaces.

LLVM is implemented in C++ and the main crux of it is the LLVM core libraries it provides. These libraries provide us with opt tool, the target independent optimizer, and code generation support for various target architectures. There are other tools which make use of core libraries, but our main focus in the book will be related to the three mentioned above. These are built around LLVM Intermediate Representation (LLVM IR), which can almost map all the high-level languages. So basically, to use LLVM's optimizer and code generation technique for code written in a certain programming language, all we need to do is write a frontend for a language that takes the high level language and generates LLVM IR. There are already many frontends available for languages such as C, C++, Go, Python, and so on. We will cover the following topics in this chapter:

Modular design and collection of librariesGetting familiar with LLVM IRLLVM Tools and using them at command line

Modular design and collection of libraries

The most important thing about LLVM is that it is designed as a collection of libraries. Let's understand these by taking the example of LLVM optimizer opt. There are many different optimization passes that the optimizer can run. Each of these passes is written as a C++ class derived from the Pass class of LLVM. Each of the written passes can be compiled into a .o file and subsequently they are archived into a .a library. This library will contain all the passes for opt tool. All the passes in this library are loosely coupled, that is they mention explicitly the dependencies on other passes.

When the optimizer is ran, the LLVM PassManager uses the explicitly mentioned dependency information and runs the passes in optimal way. The library based design allows the implementer to choose the order in which passes will execute and also choose which passes are to be executed based on the requirements. Only the passes that are required are linked to the final application, not the entire optimizer.

The following figure demonstrates how each pass can be linked to a specific object file within a specific library. In the following figure, the PassA references LLVMPasses.a for PassA.o, whereas the custom pass refers to a different library MyPasses.a for the MyPass.o object file.

The code generator also makes use of this modular design like the Optimizer, for splitting the code generation into individual passes, namely, instruction selection, register allocation, scheduling, code layout optimization, and assembly emission.

In each of the following phases mentioned there are some common things for almost every target, such as an algorithm for assigning physical registers available to virtual registers even though the set of registers for different targets vary. So, the compiler writer can modify each of the passes mentioned above and create custom target-specific passes. The use of the tablegen tool helps in achieving this using table description .td files for specific architectures. We will discuss how this happens later in the book.

Another capability that arises out of this is the ability to easily pinpoint a bug to a particular pass in the optimizer. A tool name Bugpointmakes use of this capability to automatically reduce the test case and pinpoint the pass that is causing the bug.

Getting familiar with LLVM IR

LLVM Intermediate Representation (IR) is the heart of the LLVM project. In general every compiler produces an intermediate representation on which it runs most of its optimizations. For a compiler targeting multiple-source languages and different architectures the important decision while selecting an IR is that it should neither be of very high-level, as in very closely attached to the source language, nor it should be very low-level, as in close to the target machine instructions. LLVM IR aims to be a universal IR of a kind, by being at a low enough level that high-level ideas may be cleanly mapped to it. Ideally the LLVM IR should have been target-independent, but it is not so because of the inherent target dependence in some of the programming languages itself. For example, when using standard C headers in a Linux system, the header files itself are target dependent, which may specify a particular type to an entity so that it matches the system calls of the particular target architecture.

Most of the LLVM tools revolve around this Intermediate Representation. The frontends of different languages generate this IR from the high-level source language. The optimizer tool of LLVM runs on this generated IR to optimize the code for better performance and the code generator makes use of this IR for target specific code generation. This IR has three equivalent forms:

An in-memory compiler IRAn on-disk bitcode representationA Human readable form (LLVM Assembly)