Learning Linux Binary Analysis - Ryan "elfmaster" O'Neill - E-Book

Learning Linux Binary Analysis E-Book

Ryan "elfmaster" O'Neill

Description

Uncover the secrets of Linux binary analysis with this handy guide

About This Book

  • Grasp the intricacies of the ELF binary format of UNIX and Linux
  • Design tools for reverse engineering and binary forensic analysis
  • Insights into UNIX and Linux memory infections, ELF viruses, and binary protection schemes

Who This Book Is For

If you are a software engineer or reverse engineer and want to learn more about Linux binary analysis, this book will provide you with all you need to implement solutions for binary analysis in areas of security, forensics, and antivirus. This book is great for both security enthusiasts and system level engineers. Some experience with the C programming language and the Linux command line is assumed.

What You Will Learn

  • Explore the internal workings of the ELF binary format
  • Discover techniques for UNIX Virus infection and analysis
  • Work with binary hardening and software anti-tamper methods
  • Patch executables and process memory
  • Bypass anti-debugging measures used in malware
  • Perform advanced forensic analysis of binaries
  • Design ELF-related tools in the C language
  • Learn to operate on memory with ptrace

In Detail

Learning Linux Binary Analysis is packed with knowledge and code that will teach you the inner workings of the ELF format, and the methods used by hackers and security analysts for virus analysis, binary patching, software protection and more.

This book will start by taking you through UNIX/Linux object utilities, and will move on to teaching you all about the ELF specimen. You will learn about process tracing, and will explore the different types of Linux and UNIX viruses, and how you can make use of ELF Virus Technology to deal with them.

The latter half of the book discusses the usage of Kprobe instrumentation for kernel hacking, code patching, and debugging. You will discover how to detect and disinfect kernel-mode rootkits, and move on to analyze static code. Finally, you will be walked through complex userspace memory infection analysis.

This book will lead you into territory that is uncharted even by some experts; right into the world of the computer hacker.

Style and approach

The material in this book provides detailed insight into the arcane arts of hacking, coding, reverse engineering Linux executables, and dissecting process memory. In the computer security industry these skills are priceless, and scarce. The tutorials are filled with knowledge gained through first hand experience, and are complemented with frequent examples including source code.

Page count: 368

Year of publication: 2016

Table of Contents

Learning Linux Binary Analysis
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. The Linux Environment and Its Tools
Linux tools
GDB
Objdump from GNU binutils
Objcopy from GNU binutils
strace
ltrace
Basic ltrace command
ftrace
readelf
ERESI – The ELF reverse engineering system interface
Useful devices and files
/proc/<pid>/maps
/proc/kcore
/boot/System.map
/proc/kallsyms
/proc/iomem
ECFS
Linker-related environment points
The LD_PRELOAD environment variable
The LD_SHOW_AUXV environment variable
Linker scripts
Summary
2. The ELF Binary Format
ELF file types
ELF program headers
PT_LOAD
PT_DYNAMIC – Phdr for the dynamic segment
PT_NOTE
PT_INTERP
PT_PHDR
ELF section headers
The .text section
The .rodata section
The .plt section
The .data section
The .bss section
The .got.plt section
The .dynsym section
The .dynstr section
The .rel.* section
The .hash section
The .symtab section
The .strtab section
The .shstrtab section
The .ctors and .dtors sections
ELF symbols
st_name
st_value
st_size
st_other
st_shndx
st_info
Symbol types
Symbol bindings
ELF relocations
Relocatable code injection-based binary patching
ELF dynamic linking
The auxiliary vector
Learning about the PLT/GOT
The dynamic segment revisited
DT_NEEDED
DT_SYMTAB
DT_HASH
DT_STRTAB
DT_PLTGOT
Coding an ELF Parser
Summary
3. Linux Process Tracing
The importance of ptrace
ptrace requests
ptrace request types
The process register state and flags
A simple ptrace-based debugger
Using the tracer program
A simple ptrace debugger with process attach capabilities
Advanced function-tracing software
ptrace and forensic analysis
What to look for in the memory
Process image reconstruction – from the memory to the executable
Challenges for process-executable reconstruction
Challenges for executable reconstruction
PLT/GOT integrity
Adding a section header table
The algorithm for the process
Process reconstruction with Quenya on a 32-bit test environment
Code injection with ptrace
Simple examples aren't always so trivial
Demonstrating the code_inject tool
A ptrace anti-debugging trick
Is your program being traced?
Summary
4. ELF Virus Technology – Linux/Unix Viruses
ELF virus technology
ELF virus engineering challenges
Parasite code must be self-contained
Solution
Complications with string storage
Solution
Finding legitimate space to store parasite code
Solution
Passing the execution control flow to the parasite
Solution
ELF virus parasite infection methods
The Silvio padding infection method
Algorithm for the Silvio .text infection method
An example of text segment padding infection
Adjusting the ELF headers
Inserting the parasite code
Example of using the functions above
The LPV virus
Use cases for the Silvio padding infection
The reverse text infection
Algorithm for reverse text infection
Data segment infections
Algorithm for data segment infection
The PT_NOTE to PT_LOAD conversion infection method
Algorithm for PT_NOTE to PT_LOAD conversion infections
Infecting control flow
Direct PLT infection
Function trampolines
Overwriting the .ctors/.dtors function pointers
GOT – global offset table poisoning or PLT/GOT redirection
Infecting data structures
Function pointer overwrites
Process memory viruses and rootkits – remote code injection techniques
Shared library injection – .so injection/ET_DYN injection
.so injection with LD_PRELOAD
Illustration 4.7 – using LD_PRELOAD to inject wicked.so.1
.so injection with open()/mmap() shellcode
.so injection with dlopen() shellcode
Illustration 4.8 – C code invoking __libc_dlopen_mode()
.so injection with VDSO manipulation
Text segment code injections
Executable injections
Relocatable code injection – the ET_REL injection
ELF anti-debugging and packing techniques
The PTRACE_TRACEME technique
Illustration 4.9 – an anti-debug with PTRACE_TRACEME example
The SIGTRAP handler technique
The /proc/self/status technique
The code obfuscation technique
The string table transformation technique
ELF virus detection and disinfection
Summary
5. Linux Binary Protection
ELF binary packers – dumb protectors
Stub mechanics and the userland exec
An example of a protector
Other jobs performed by protector stubs
Existing ELF binary protectors
DacryFile by the Grugq – 2001
Burneye by Scut – 2002
Shiva by Neil Mehta and Shawn Clowes – 2003
Maya's Veil by Ryan O'Neill – 2014
Maya's protection layers
Layer 1
Layer 2
Layer 3
Maya's nanomites
Maya's anti-exploitation
Source code of vuln.c
Example of exploiting vuln.c
Downloading Maya-protected binaries
Anti-debugging for binary protection
Resistance to emulation
Detecting emulation through syscall testing
Detecting emulated CPU inconsistencies
Checking timing delays between certain instructions
Obfuscation methods
Protecting control flow integrity
Attacks based on ptrace
Security vulnerability-based attacks
Other resources
Summary
6. ELF Binary Forensics in Linux
The science of detecting entry point modification
Detecting other forms of control flow hijacking
Patching the .ctors/.init_array section
Detecting PLT/GOT hooks
Truncated output from readelf -S command
Detecting function trampolines
Identifying parasite code characteristics
Checking the dynamic segment for DLL injection traces
Identifying reverse text padding infections
Identifying text segment padding infections
Identifying protected binaries
Analyzing a protected binary
IDA Pro
Summary
7. Process Memory Forensics
What does a process look like?
Executable memory mappings
The program heap
Shared library mappings
The stack, vdso, and vsyscall
Process memory infection
Process infection tools
Process infection techniques
Injection methods
Techniques for hijacking execution
Detecting the ET_DYN injection
Azazel userland rootkit detection
Mapping out the process address space
Finding LD_PRELOAD on the stack
Detecting PLT/GOT hooks
Identifying incorrect GOT addresses
ET_DYN injection internals
Example – finding the symbol for __libc_dlopen_mode
Code example – the __libc_dlopen_mode shellcode
Code example – libc symbol resolution
Code example – the x86_32 shellcode to mmap() an ET_DYN object
Manipulating VDSO to perform dirty work
Shared object loading – legitimate or not?
Legitimate shared object loading
Illegitimate shared object loading
Heuristics for .so injection detection
Tools for detecting PLT/GOT hooks
Linux ELF core files
Analysis of the core file – the Azazel rootkit
Starting up an Azazel infected process and getting a core dump
Core file program headers
The PT_NOTE segment
PT_LOAD segments and the downfalls of core files for forensics purposes
Using a core file with GDB for forensics
Summary
8. ECFS – Extended Core File Snapshot Technology
History
The ECFS philosophy
Getting started with ECFS
Plugging ECFS into the core handler
ECFS snapshots without killing the process
libecfs – a library for parsing ECFS files
readecfs
Examining an infected process using ECFS
Infecting the host process
Capturing and analyzing an ECFS snapshot
The symbol table analysis
The section header analysis
Extracting parasite code with readecfs
Analyzing the Azazel userland rootkit
The symbol table of the host2 process reconstructed
The section header table of the host2 process reconstructed
Validating the PLT/GOT with ECFS
The readecfs output for PLT/GOT validation
The ECFS reference guide
ECFS symbol table reconstruction
ECFS section headers
Using an ECFS file as a regular core file
The libecfs API and how to use it
Process necromancy with ECFS
Learning more about ECFS
Summary
9. Linux /proc/kcore Analysis
Linux kernel forensics and rootkits
stock vmlinux has no symbols
Building a proper vmlinux with kdress
/proc/kcore and GDB exploration
An example of navigating sys_call_table
Direct sys_call_table modifications
Detecting sys_call_table modifications
An example of validating the integrity of a syscall
Kernel function trampolines
Example of function trampolines
An example code for hijacking sys_write on a 32-bit kernel
Detecting function trampolines
An example with the ret instruction
An example with indirect jmp
An example with relative jmp
Interrupt handler patching – int 0x80, syscall
Detecting interrupt handler patching
Kprobe rootkits
Detecting kprobe rootkits
Debug register rootkits – DRR
Detecting DRR
VFS layer rootkits
Detecting VFS layer rootkits
An example of validating a VFS function pointer
Other kernel infection techniques
vmlinux and .altinstructions patching
.altinstructions and .altinstr_replace
From arch/x86/include/asm/alternative.h
Using textify to verify kernel code integrity
An example of using textify to check sys_call_table
Using taskverse to see hidden processes
Taskverse techniques
Infected LKMs – kernel drivers
Method 1 for infecting LKM files – symbol hijacking
Method 2 for infecting LKM files (function hijacking)
Detecting infected LKMs
Notes on /dev/kmem and /dev/mem
/dev/mem
FreeBSD /dev/kmem
K-ecfs – kernel ECFS
A sneak peek of the kernel-ecfs file
Kernel hacking goodies
General reverse engineering and debugging
Advanced kernel hacking/debugging interfaces
Papers mentioned in this chapter
Summary
Index

Learning Linux Binary Analysis

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: February 2016

Production reference: 1250216

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78216-710-5

www.packtpub.com

Cover image by Lorne Schell (<[email protected]>)

Credits

Author

Ryan "elfmaster" O'Neill

Reviewers

Lubomir Rintel

Kumar Sumeet

Heron Yang

Content Development Editor

Sanjeet Rao

Technical Editor

Mohita Vyas

Copy Editor

Vikrant Phadke

Project Coordinator

Judie Jose

Proofreader

Safis Editing

Indexer

Tejal Daruwale Soni

Graphics

Jason Monteiro

Production Coordinator

Aparna Bhagat

Cover Work

Aparna Bhagat

About the Author

Ryan "elfmaster" O'Neill is a computer security researcher and software engineer with a background in reverse engineering, software exploitation, security defense, and forensics technologies. He grew up in the computer hacker subculture, the world of EFnet, BBS systems, and remote buffer overflows on systems with an executable stack. He was introduced to system security, exploitation, and virus writing at a young age. His great passion for computer hacking has evolved into a love for software development and professional security research. Ryan has spoken at various computer security conferences, including DEFCON and RuxCon, and also conducts a 2-day ELF binary hacking workshop.

He has an extremely fulfilling career and has worked at great companies such as Pikewerks, Leviathan Security Group, and more recently Backtrace as a software engineer.

Ryan has not published any other books, but he is well known for some of his papers published in online journals such as Phrack and VXHeaven. Many of his other publications can be found on his website at http://www.bitlackeys.org.

Acknowledgments

First and foremost, I would like to present a very genuine thank you to my mother, Michelle, to whom I have dedicated this book. It all started with her buying me my first computer, followed by a plethora of books, ranging from Unix programming to kernel internals and network security. At one point in my life, I thought I was done with computers forever, but about 5 years later, when I wanted to reignite my passion, I realized that I had thrown my books away! I then found that my mother had secretly saved them for me, waiting for the day I would return to them. Thank you mom, you are wonderful, and I love you.

I would also be very remiss not to acknowledge the most important woman in my life today, who is my twin flame and mother of two of my children. There is no doubt that I would not be where I am in my life and career without you. They say that behind every great man is an even greater woman. This old adage is very true. Thank you Marilyn for bringing immense joy and adventure into my life. I love you.

My father, Brian O'Neill, is a huge inspiration in my life and has taught me so many things about being a man, a father, and a friend. I love you Dad and I will always cherish our philosophical and spiritual connection.

Michael and Jade, thank you both for being such unique and wonderful souls. I love you both.

Lastly, I thank all three of my children: Mick, Jayden, and Jolene. One day, perhaps, you will read this book and know that your old man knows a thing or two about computers, but also that I will always put you guys first in my life. You are all three amazing beings and have imbued my life with such deep meaning and love.

Silvio Cesare is a legendary name in the computer security industry due to his highly innovative and groundbreaking research into many areas, beginning with ELF viruses, and breakthroughs in kernel vulnerability analysis. Thank you Silvio for your mentoring and friendship. I have learned more from you than from any other person in our industry.

Baron Oldenburg was an instrumental part of this book. On several occasions, I nearly gave up due to the time and energy drained, but Baron offered to help with the initial editing and putting the text into the proper format. This took a huge burden off the development process and made this book possible. Thank you Baron! You are a true friend.

Lorne Schell is a true Renaissance man—software engineer, musician, and artist. He was the brilliant hand behind the artwork on the cover of this book. How amazingly well does a Vitruvian Elf fit the description of this book artistically? Thank you Lorne. I am very grateful for your talent and the time you spent on this.

Chad Thunberg, my boss at Leviathan Security Group, was instrumental in making sure that I got the resources and the encouragement necessary to complete this book. Thank you.

All the guys at #bitlackeys on EFnet have my gratitude for their friendship and support.

About the Reviewers

Lubomir Rintel is a systems programmer based in Brno, Czech Republic. He's a full-time software developer currently working on Linux networking tools. Other than this, he has a history of contributions to many projects, including the Linux kernel and Fedora distribution. After years of being active in the free software community, he can appreciate a good book that covers the subject in a context wider than a manual would. He believes that this is such a book and hopes you enjoy it as much as he did. Also, he likes anteaters.

As of November 2015, Kumar Sumeet has over 4 years of research experience in IT security, during which he has produced a frontier of hacking and spy tools. He holds an MSc in information security from Royal Holloway, University of London. His recent focus area is machine learning techniques for detecting cyber anomalies and countering threats.

Sumeet currently works as a security consultant for Riversafe, which is a London-based network security and IT data management consultancy firm. Riversafe specializes in some cutting-edge security technologies and is also a Splunk Professional Services partner of the year 2015 in the EMEA region. They have completed many large-scale projects and engagements in multiple sectors, including telecommunications, banking and financial markets, energy, and airport authorities.

Sumeet is also a technical reviewer of the book Penetration Testing Using Raspberry Pi, Packt Publishing.

For more information or details about his projects and research, you can visit his website at https://krsumeet.com.

Sumeet can also be contacted via e-mail at <[email protected]>.

Heron Yang has always been working on creating something people really want. This firm belief of his was first established in high school. Then he continued his journey at National Chiao Tung University and Carnegie Mellon University, where he focused on Computer Science studies. As he cares about building connections between people and fulfilling user needs, he devoted himself to developing prototypes of start-up ideas, new applications or websites, study notes, books, and blogs in the past few years.

Thanks Packt for offering me this opportunity to get involved in the book publishing process, and thanks Judie Jose for helping a lot throughout the period. Moreover, thanks to all the challenges I've gone through to become a better person. This book goes into the details of binary reversing and will be great material for those who care about underlying mechanisms. Feel free to contact me for a discussion or just say "Hi" at <[email protected]> or http://heron.me.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Preface

Software engineering is the act of creating an invention that exists, lives, and breathes on a microprocessor. We call it a program. Reverse engineering is the act of discovering how exactly that program lives and breathes, and furthermore it is how we can understand, dissect, or modify the behavior of that program using a combination of disassemblers and reversing tools and relying on our hacker instincts to master the target program which we are reverse engineering. We must understand the intricacies of binary formats, memory layout, and the instruction set of the given processor. We therefore become masters of the very life given to a program on a microprocessor. A reverse engineer is skilled in the art of binary mastery. This book is going to give you the proper lessons, insight, and tasks required to become a Linux binary hacker. When someone can call themselves a reverse engineer, they elevate themselves beyond the level of just engineering. A true hacker can not only write code but also dissect code, disassembling the binaries and memory segments in pursuit of modifying the inner workings of a software program; now that is power…

On both a professional and a hobbyist level, I use my reverse engineering skills in the computer security field, whether it is vulnerability analysis, malware analysis, antivirus software, rootkit detection, or virus design. Much of this book will be focused towards computer security. We will analyze memory dumps, reconstruct process images, and explore some of the more esoteric regions of binary analysis, including Linux virus infection and binary forensics. We will dissect malware-infected executables and infect running processes. This book is aimed at explaining the necessary components for reverse engineering in Linux, so we will be going deep into learning ELF (executable and linking format), which is the binary format used in Linux for executables, shared libraries, core dumps, and object files. One of the most significant aspects of this book is the deep insight it gives into the structural complexities of the ELF binary format. The ELF sections, segments, and dynamic linking concepts are vital and exciting chunks of knowledge. We will explore the depths of hacking ELF binaries and see how these skills can be applied to a broad spectrum of work.

The goal of this book is to teach you to be one of the few people with a strong foundation in Linux binary hacking, which will be revealed as a vast topic that opens the door to innovative research and puts you on the cutting edge of low-level hacking in the Linux operating system. You will walk away with valuable knowledge of Linux binary (and memory) patching, virus engineering/analysis, kernel forensics, and the ELF binary format as a whole. You will also gain more insights into program execution and dynamic linking and achieve a higher understanding of binary protection and debugging internals.

I am a computer security researcher, software engineer, and hacker. This book is merely an organized observation and documentation of the research I have done and the foundational knowledge that has manifested as a result.

This knowledge covers a wide span of information that can't be found in any one place on the Internet. This book tries to bring many interrelated topics together into one piece so that it may serve as an introductory manual and reference to the subject of Linux binary and memory hacking. It is by no means a complete reference but does contain a lot of core information to get started with.

What this book covers

Chapter 1, The Linux Environment and Its Tools, gives a brief description of the Linux environment and its tools, which we will be using throughout the book.

Chapter 2, The ELF Binary Format, helps you learn about every major component of the ELF binary format that is used across Linux and most Unix-flavored operating systems.

Chapter 3, Linux Process Tracing, teaches you to use the ptrace system call to read and write to process memory and inject code.

Chapter 4, ELF Virus Technology – Linux/Unix Viruses, is where you discover the past, present, and future of Linux viruses, how they are engineered, and all of the amazing research that surrounds them.

Chapter 5, Linux Binary Protection, explains the basic internals of ELF binary protection.

Chapter 6, ELF Binary Forensics in Linux, is where you learn to dissect ELF objects in search of viruses, backdoors, and suspicious code injection.

Chapter 7, Process Memory Forensics, shows you how to dissect a process address space in search of malware, backdoors, and suspicious code injection that live in the memory.

Chapter 8, ECFS – Extended Core File Snapshot Technology, is an introduction to ECFS, a new open source product for deep process memory forensics.

Chapter 9, Linux /proc/kcore Analysis, shows how to detect Linux kernel malware through memory analysis with /proc/kcore.

What you need for this book

The prerequisites for this book are as follows: we will assume that you have a working knowledge of the Linux command line, comprehensive C programming skills, and a very basic grasp of the x86 assembly language (this is helpful but not necessary). There is a saying, "If you can read assembly language then everything is open source."

Who this book is for

If you are a software engineer or reverse engineer and want to learn more about Linux binary analysis, this book will provide you with all that you need to implement solutions for binary analysis in areas of security, forensics, and antivirus. This book is great for both security enthusiasts and system-level engineers. Some experience with the C programming language and the Linux command line is assumed.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. The Linux Environment and Its Tools

In this chapter, we will be focusing on the Linux environment as it pertains to our focus throughout this book. Since this book is focused on Linux binary analysis, it makes sense to utilize the native environment tools that come with Linux and to which everyone has access. Linux comes with the ubiquitous binutils already installed, but they can also be found at http://www.gnu.org/software/binutils/. They contain a huge selection of tools that are handy for binary analysis and hacking. This is not another book on using IDA Pro. IDA is hands-down the best universal software for reverse engineering of binaries, and I would encourage its use as needed, but we will not be using it in this book. Instead, you will acquire the skills to hop onto virtually any Linux system and have an idea of how to begin hacking binaries with an environment that is already accessible. You can therefore learn to appreciate the beauty of Linux as a true hackers' environment for which there are many free tools available. Throughout the book, we will demonstrate the use of various tools and give a recap on how to use them as we progress through each chapter. Meanwhile, however, let this chapter serve as a primer or reference to these tools and tips within the Linux environment. If you are already very familiar with the Linux environment and its tools for disassembling, debugging, and parsing of ELF files, then you may simply skip this chapter.

Linux tools

Throughout this book, we will be using a variety of free tools that are accessible by anyone. This section will give a brief synopsis of some of these tools for you.

GDB

GNU Debugger (GDB) is good for more than debugging buggy applications. It can also be used to learn about a program's control flow, change that control flow, and modify code, registers, and data structures. These tasks are common for a hacker who is working to exploit a software vulnerability or is unraveling the inner workings of a sophisticated virus. GDB works on ELF binaries and Linux processes. It is an essential tool for Linux hackers and will be used in various examples throughout this book.

Objdump from GNU binutils

Object dump (objdump) is a simple and clean solution for a quick disassembly of code. It is great for disassembling simple and untampered binaries, but it shows its limitations quickly when used for any truly challenging reverse engineering task, especially against hostile software. Its primary weakness is that it relies on the ELF section headers and doesn't perform control flow analysis; both limitations greatly reduce its robustness. As a result, it may fail to correctly disassemble the code within a binary, or even to open the binary at all if there are no section headers. For many conventional tasks, however, it should suffice, such as when disassembling common binaries that are not fortified, stripped, or obfuscated in any way. It can read all common ELF types. Here are some common examples of how to use objdump:

View all data/code in every section of an ELF file:
objdump -D <elf_object>
View only program code in an ELF file:
objdump -d <elf_object>
View all symbols:
objdump -tT <elf_object>

We will be exploring objdump and other tools in great depth during our introduction to the ELF format in Chapter 2, The ELF Binary Format.

Objcopy from GNU binutils

Object copy (objcopy) is an incredibly powerful little tool that we cannot summarize with a simple synopsis. I recommend that you read the manual pages for a complete description. objcopy can be used to analyze and modify ELF objects of any kind, although some of its features are specific to certain types of ELF objects. It is often used to modify an ELF section, or to copy one to or from an ELF binary.

To copy the .data section from an ELF object to a file, use this line:

objcopy --only-section=.data <infile> <outfile>

The objcopy tool will be demonstrated as needed throughout the rest of this book. Just remember that it exists and can be a very useful tool for the Linux binary hacker.

strace

System call trace (strace) is a tool that is based on the ptrace(2) system call, and it utilizes the PTRACE_SYSCALL request in a loop to show information about the system call (syscall) activity in a running program as well as signals that are caught during execution. This program can be highly useful for debugging, or just to collect information about what syscalls are being called during runtime.

This is the strace command used to trace a basic program. Note that strace options must come before the program name; anything after the program name is passed to the traced program as its own arguments:

strace -o ls.out /bin/ls

The strace command used to attach to an existing process is as follows:

strace -p <pid> -o daemon.out

The initial output will show you the file descriptor number of each system call that takes a file descriptor as an argument, such as this:

SYS_read(3, buf, sizeof(buf));

If you want to see all of the data that was read from file descriptor 3, you can run the following command:

strace -e read=3 /bin/ls

You may also use -e write=fd to see written data. The strace tool is a great little tool, and you will undoubtedly find many reasons to use it.
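Once a trace has been saved with -o, it is often useful to get a quick overview of which syscalls dominate. The following is a minimal sketch in Python (the book's own tools are written in C; Python is used here only for brevity). The function name count_syscalls and the sample trace are illustrative assumptions, and the regular expression only handles the common line shapes of strace output, not every variant. strace can produce a similar summary itself with its -c flag.

```python
import re
from collections import Counter

# Matches the start of a syscall line such as: read(3, "\177ELF"..., 832) = 832
# With -f, strace prefixes each line with a PID; the optional group skips it.
SYSCALL_LINE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")

def count_syscalls(trace_text):
    """Count how many times each syscall name appears in strace output."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = SYSCALL_LINE.match(line)
        # Signal lines ("--- SIGCHLD ... ---") and exit lines ("+++ ... +++")
        # do not match and are skipped.
        if match:
            counts[match.group(1)] += 1
    return counts

# Made-up sample in the shape of real strace output (values are illustrative).
SAMPLE = r'''execve("/bin/ls", ["/bin/ls"], 0x7ffc0a1b2c30 /* 24 vars */) = 0
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0"..., 832) = 832
read(3, "", 4096) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED} ---
+++ exited with 0 +++
'''

for name, total in count_syscalls(SAMPLE).most_common():
    print(f"{total:4d} {name}")
```

To use it on a real trace, pass the contents of the file written by strace -o to count_syscalls instead of SAMPLE.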

ltrace

Library trace (ltrace) is another neat little tool, and it is very similar to strace. It works similarly, but it actually parses the shared library-linking information of a program and prints the library functions being used.

The basic ltrace command is as follows (as with strace, options go before the program name):

ltrace -o program.out <program>

You may see system calls in addition to library function calls with the -S flag. The ltrace command is designed to give more granular information, since it parses the dynamic segment of the executable and prints actual symbols/functions from shared libraries.

ftrace

Function trace (ftrace) is a tool designed by me. It is similar to ltrace, but it also shows calls to functions within the binary itself. There was no other tool I could find publicly available that could do this in Linux, so I decided to code one. This tool can be found at https://github.com/elfmaster/ftrace. A demonstration of this tool is given in the next chapter.

readelf

The readelf command is one of the most useful tools around for dissecting ELF binaries. It provides every bit of ELF-specific data needed to gather information about an object before reverse engineering it. This tool will be used often throughout the book to gather information about symbols, segments, sections, relocation entries, dynamic linking data, and more. The readelf command is the Swiss Army knife of ELF. We will be covering it in depth as needed, during Chapter 2, The ELF Binary Format, but here are a few of its most commonly used flags:

To retrieve a section header table:
readelf -S <object>
To retrieve a program header table:
readelf -l <object>
To retrieve a symbol table:
readelf -s <object>
To retrieve the ELF file header data together with the program and section headers (use -h for the file header alone):
readelf -e <object>
To retrieve relocation entries:
readelf -r <object>
To retrieve a dynamic segment:
readelf -d <object>
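To give a feel for what readelf is decoding, here is a minimal sketch in Python (used for brevity; the book's own tools are in C) that reads a handful of fields from the ELF file header, roughly the information shown by readelf -h. The function name parse_elf_header and the returned dictionary keys are assumptions made for this example. The field order and sizes follow the standard Elf32_Ehdr and Elf64_Ehdr layouts.

```python
import os
import struct

ELF_TYPES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def parse_elf_header(data):
    """Decode a few ELF file header fields from the first bytes of a file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = data[4] == 2                 # e_ident[EI_CLASS]: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # e_ident[EI_DATA]: 1 = little, 2 = big
    # Fields after the 16-byte e_ident array. The address-sized fields
    # (e_entry, e_phoff, e_shoff) are 8 bytes in ELF64 and 4 bytes in ELF32.
    layout = endian + ("HHIQQQIHHHHHH" if is_64bit else "HHIIIIIHHHHHH")
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum, _shstrndx) = \
        struct.unpack_from(layout, data, 16)
    return {
        "class": "ELF64" if is_64bit else "ELF32",
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": e_machine,
        "entry": e_entry,
        "phoff": e_phoff, "phnum": e_phnum,
        "shoff": e_shoff, "shnum": e_shnum,
    }

# On Linux, /proc/self/exe refers to the running interpreter, itself an ELF file.
if os.path.exists("/proc/self/exe"):
    with open("/proc/self/exe", "rb") as elf_file:
        print(parse_elf_header(elf_file.read(64)))
```

A result with shoff and shnum both equal to 0 suggests that the section header table is absent, which is the situation in which section-based tools such as objdump struggle.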

ERESI – The ELF reverse engineering system interface

The ERESI project (http://www.eresi-project.org) contains a suite of many tools that are a Linux binary hacker's dream. Unfortunately, many of them are not kept up to date and aren't fully compatible with 64-bit Linux. They do exist for a variety of architectures, however, and are undoubtedly the most innovative single collection of tools for the purpose of hacking ELF binaries that exist today. Because I personally am not really familiar with using the ERESI project's tools, and because they are no longer kept up to date, I will not be exploring their capabilities within this book. However, be aware that there are two Phrack articles that demonstrate the innovation and powerful features of the ERESI tools:

Cerberus ELF interface (http://www.phrack.org/archives/issues/61/8.txt)
Embedded ELF debugging (http://www.phrack.org/archives/issues/63/9.txt)

Useful devices and files

Linux has many files, devices, and /proc entries that are very helpful for the avid hacker and reverse engineer. Throughout this book, we will be demonstrating the usefulness of many of these files. Here is a description of some of the commonly used ones throughout the book.

/proc/<pid>/maps

The /proc/<pid>/maps file contains the layout of a process image by showing each memory mapping. This includes the executable, shared libraries, stack, heap, VDSO, and more. This file is critical for being able to quickly parse the layout of a process address space and is used more than once throughout this book.
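Each line of the maps file has the form "start-end perms offset dev inode pathname", where the pathname may be empty for anonymous mappings. As a minimal sketch in Python (used for brevity; the function name parse_maps_line and the dictionary keys are assumptions for this example), one line can be split into its fields like this:

```python
import os

def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its fields."""
    # At most 5 splits, so a pathname containing spaces stays in one piece.
    fields = line.split(None, 5)
    start, end = (int(value, 16) for value in fields[0].split("-"))
    return {
        "start": start,
        "end": end,                       # end address is exclusive
        "perms": fields[1],               # for example r-xp or rw-p
        "offset": int(fields[2], 16),
        "dev": fields[3],
        "inode": int(fields[4]),
        "path": fields[5].strip() if len(fields) > 5 else "",
    }

# Print the executable mappings of the current process (Linux only).
if os.path.exists("/proc/self/maps"):
    with open("/proc/self/maps") as maps_file:
        for line in maps_file:
            mapping = parse_maps_line(line)
            if "x" in mapping["perms"]:
                print(f'{mapping["start"]:x}-{mapping["end"]:x} '
                      f'{mapping["perms"]} {mapping["path"]}')
```

Filtering on the permission string is a quick way to find code regions, and filtering on the path (for example "[stack]", "[heap]", or "[vdso]") locates the special mappings mentioned above.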

/proc/kcore

/proc/kcore is an entry in the proc filesystem that acts as a dynamic core file of the Linux kernel. That is, it is a raw dump of memory that is presented in the form of an ELF core file that can be used by GDB to debug and analyze the kernel. We will explore /proc/kcore in depth in Chapter 9, Linux /proc/kcore Analysis.

/boot/System.map

This file is available on almost all Linux distributions and is very useful for kernel hackers. It contains every symbol for the entire kernel.

/proc/kallsyms

/proc/kallsyms is very similar to System.map, except that it is a /proc entry, which means that it is maintained by the kernel and is dynamically updated. Therefore, if any new loadable kernel modules (LKMs) are installed, their symbols will be added to /proc/kallsyms on the fly. /proc/kallsyms contains most of the symbols in the kernel, and will contain all of them if the kernel is built with the CONFIG_KALLSYMS_ALL config option.
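Both System.map and /proc/kallsyms list one symbol per line as "address type name", and kallsyms appends the module name in square brackets for symbols that belong to a loaded module. Here is a minimal sketch in Python (used for brevity) for turning that text into a lookup table. The function names and the sample addresses are made up for illustration. Also note that on many systems, unprivileged reads of /proc/kallsyms return zeroed addresses, so real use typically requires root.

```python
def parse_symbol_line(line):
    """Parse one System.map or /proc/kallsyms line."""
    fields = line.split()
    address = int(fields[0], 16)
    sym_type = fields[1]          # same letter codes as nm, for example T, t, D, B
    name = fields[2]
    # kallsyms adds "[module]" for symbols that come from a kernel module.
    module = fields[3].strip("[]") if len(fields) > 3 else None
    return address, sym_type, name, module

def load_symbols(text):
    """Map symbol name -> address for every non-empty line."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            address, _sym_type, name, _module = parse_symbol_line(line)
            table[name] = address
    return table

# Illustrative sample; the addresses are invented, not from a real kernel.
SAMPLE = """ffffffff81000000 T _text
ffffffff81a001c0 D sys_call_table
ffffffffc0123000 t my_func\t[my_module]
"""

symbols = load_symbols(SAMPLE)
print(hex(symbols["sys_call_table"]))
```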

/proc/iomem

/proc/iomem is a useful proc entry as it is very similar to /proc/<pid>/maps, but for all of the system memory. If, for instance, you want to know where the kernel's text segment is mapped in the physical memory, you can search for the Kernel string and you will see the code/text segment, the data segment, and the bss segment:

$ grep Kernel /proc/iomem
01000000-016d9b27 : Kernel code
016d9b28-01ceeebf : Kernel data
01df0000-01f26fff : Kernel bss
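The ranges in /proc/iomem are inclusive on both ends, so the size of a region is end - start + 1. The following minimal Python sketch (used for brevity; the function name parse_iomem_line is an assumption for this example) parses the three kernel lines from the output above and computes their sizes. On many modern systems, the addresses are shown as zeros unless the file is read as root.

```python
def parse_iomem_line(line):
    """Parse one /proc/iomem line into (start, end, name)."""
    address_range, _, name = line.partition(" : ")
    # Nested entries are indented, so strip whitespace before splitting.
    start, end = (int(value, 16) for value in address_range.strip().split("-"))
    return start, end, name.strip()

# The three lines from the grep output shown in the text.
SAMPLE = """01000000-016d9b27 : Kernel code
016d9b28-01ceeebf : Kernel data
01df0000-01f26fff : Kernel bss
"""

for line in SAMPLE.splitlines():
    start, end, name = parse_iomem_line(line)
    size = end - start + 1  # both bounds are inclusive
    print(f"{name}: {size} bytes at physical address {start:#x}")
```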

ECFS

Extended core file snapshot (ECFS) is a special core dump technology that was specifically designed for advanced forensic analysis of a process image. The code for this software can be found at https://github.com/elfmaster/ecfs. Also, Chapter 8, ECFS – Extended Core File Snapshot Technology, is solely devoted to explaining what ECFS is and how to use it. For those of you who are into advanced memory forensics, you will want to pay close attention to this.

Linker-related environment points

The dynamic loader/linker and linking concepts are inescapable components involved in the process of program linking and execution. Throughout this book, you will learn a lot about these topics. In Linux, there are quite a few ways to alter the dynamic linker's behavior that can serve the binary hacker in many ways. As we move through the book, you will begin to understand the process of linking, relocations, and dynamic loading (program interpreter). Here are a few linker-related attributes that are useful and will be used throughout the book.

The LD_PRELOAD environment variable

The LD_PRELOAD environment variable can be set to specify a library path that should be dynamically linked before any other libraries. This has the effect of allowing functions and symbols from the preloaded library to override the ones from the other libraries that are linked afterwards. This essentially allows you to perform runtime patching by redirecting shared library functions. As we will see in later chapters, this technique can be used to bypass anti-debugging code and for userland rootkits.

The LD_SHOW_AUXV environment variable

This environment variable tells the program loader to display the program's auxiliary vector during runtime. The auxiliary vector is placed on the program's stack by the kernel's ELF loading routine, and it passes certain information about the program to the dynamic linker. We will examine this much more closely in Chapter 3, Linux Process Tracing, but the information might be useful for reversing and debugging. If, for instance, you want to get the memory address of the VDSO page in the process image (which can also be obtained from the maps file, as shown earlier), you have to look for AT_SYSINFO_EHDR; AT_SYSINFO holds the address of the system call entry point within that page.

Here is an example of the auxiliary vector with LD_SHOW_AUXV:

$ LD_SHOW_AUXV=1 whoami
AT_SYSINFO: 0xb7779414
AT_SYSINFO_EHDR: 0xb7779000
AT_HWCAP: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2
AT_PAGESZ: 4096
AT_CLKTCK: 100
AT_PHDR: 0x8048034
AT_PHENT: 32
AT_PHNUM: 9
AT_BASE: 0xb777a000
AT_FLAGS: 0x0
AT_ENTRY: 0x8048eb8
AT_UID: 1000
AT_EUID: 1000
AT_GID: 1000
AT_EGID: 1000
AT_SECURE: 0
AT_RANDOM: 0xbfb4ca2b
AT_EXECFN: /usr/bin/whoami
AT_PLATFORM: i686
elfmaster

The auxiliary vector will be covered in more depth in Chapter 2, The ELF Binary Format.
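Because the LD_SHOW_AUXV output is plain "NAME: value" text, it is easy to post-process. Here is a minimal sketch in Python (used for brevity; the function name parse_auxv_output is an assumption for this example) that collects the AT_ entries into a dictionary and ignores the program's own output. The sample is an abbreviated copy of the whoami output shown above.

```python
def parse_auxv_output(text):
    """Collect the AT_* lines printed under LD_SHOW_AUXV=1 into a dict."""
    entries = {}
    for line in text.splitlines():
        # Only auxiliary vector lines start with AT_; anything else is
        # the traced program's own output (for example "elfmaster").
        if line.startswith("AT_"):
            key, _, value = line.partition(":")
            entries[key] = value.strip()
    return entries

# Abbreviated from the example output in the text.
SAMPLE = """AT_SYSINFO: 0xb7779414
AT_SYSINFO_EHDR: 0xb7779000
AT_PAGESZ: 4096
AT_BASE: 0xb777a000
AT_ENTRY: 0x8048eb8
AT_EXECFN: /usr/bin/whoami
AT_PLATFORM: i686
elfmaster
"""

auxv = parse_auxv_output(SAMPLE)
vdso_base = int(auxv["AT_SYSINFO_EHDR"], 16)
print(f"VDSO page at {vdso_base:#x}, entry point at {auxv['AT_ENTRY']}")
```

In this sample, AT_SYSINFO (0xb7779414) falls inside the page that starts at AT_SYSINFO_EHDR (0xb7779000), and AT_BASE gives the address at which the dynamic linker itself was loaded.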

Linker scripts

Linker scripts are a point of interest to us because they are interpreted by the linker and help shape a program's layout with regard to sections, memory, and symbols. The default linker script can be viewed with ld -verbose.

The ld linker program has a complete language that it interprets when it is taking input files (such as relocatable object files, shared libraries, and static archives), and it uses this language to determine how the output file, such as an executable program, will be organized. For instance, if the output is an ELF executable, the linker script will help determine what the layout will be and what sections will exist in which segments. Here is another instance: the .bss section is always at the end of the data segment; this is determined by the linker script.

You might be wondering how this is interesting to us. Well! For one, it is important to have some insight into the linking process that takes place at compile time. gcc relies on the linker and other programs to perform this task, and in some instances, it is important to be able to have control over the layout of the executable file. The ld command language is quite an in-depth language and is beyond the scope of this book, but it is worth checking out. While reverse engineering executables, remember that common segment addresses may sometimes be modified, and so can other portions of the layout; this may indicate that a custom linker script is involved. A linker script can be specified with gcc using the -T flag. We will look at a specific example of using a linker script in Chapter 5, Linux Binary Protection.

Summary

We just touched upon some fundamental aspects of the Linux environment and the tools that will be used most commonly in the demonstrations from each chapter. Binary analysis is largely about knowing the tools and resources that are available to you and how they all fit together. We only briefly covered the tools, but we will get an opportunity to emphasize the capabilities of each one as we explore the vast world of Linux binary hacking in the following chapters. In the next chapter, we will delve into the internals of the ELF binary format and cover many interesting topics, such as dynamic linking, relocations, symbols, sections, and more.

Chapter 2. The ELF Binary Format

In order to reverse-engineer Linux binaries, you must understand the binary format itself. ELF has become the standard binary format for Unix and Unix-flavor OSes. In Linux, BSD variants, and other OSes, the ELF format is used for executables, shared libraries, object files, coredump files, and even the kernel boot image. This makes ELF very important to learn for those who want to better understand reverse engineering, binary hacking, and program execution. Binary formats such as ELF are not generally a quick study, and to learn ELF requires some degree of application of the different components that you learn as you go. Real, hands-on experience is necessary to achieve proficiency. The ELF format is complicated and dry, but can be learned with some enjoyment when applying your developing knowledge of it in reverse engineering and programming tasks. ELF is really quite an incredible composition of computer science at work, with program loading, dynamic linking, symbol table lookups, and many other tightly orchestrated components.

I believe that this chapter is perhaps the most important in this entire book because it will give the reader a much greater insight into how a program is actually laid out on disk and loaded into memory. The inner workings of program execution are complicated, and understanding them is valuable knowledge to the aspiring binary hacker, reverse engineer, or low-level programmer. In Linux, program execution implies the ELF binary format.

My approach to learning ELF is through investigation of the ELF specifications as any Linux reverse engineer should, and then applying each aspect of what we learn in a creative way. Throughout this book, you will visit many facets of ELF and see how knowledge of it is pertinent to viruses, process-memory forensics, binary protection, rootkits, and more.

In this chapter, you will cover the following ELF topics:

ELF file types
Program headers
Section headers
Symbols
Relocations
Dynamic linking
Coding an ELF parser

ELF file types

An ELF file may be marked as one of the following types:

ET_NONE: This