Provides readers with a solid foundation in Arm assembly internals and reverse-engineering fundamentals as the basis for analyzing and securing billions of Arm devices. Finding and mitigating security vulnerabilities in Arm devices is the next critical internet security frontier: Arm processors are already in use by more than 90% of all mobile devices, billions of Internet of Things (IoT) devices, and a growing number of current laptops from companies including Microsoft, Lenovo, and Apple. Written by a leading expert on Arm security, Blue Fox: Arm Assembly Internals and Reverse Engineering introduces readers to modern Armv8-A instruction sets and the process of reverse-engineering Arm binaries for security research and defensive purposes. Divided into two sections, the book first provides an overview of the ELF file format and OS internals, followed by Arm architecture fundamentals and a deep dive into the A32 and A64 instruction sets. Section Two delves into the process of reverse-engineering itself: setting up an Arm environment, an introduction to static and dynamic analysis tools, and the process of extracting and emulating firmware for analysis. The last chapter gives the reader a glimpse into macOS malware analysis of binaries compiled for the Arm-based M1 SoC. Throughout the book, the reader gains an extensive understanding of Arm instructions and control-flow patterns essential for reverse engineering software compiled for the Arm architecture.
Providing an in-depth introduction to reverse-engineering for engineers and security researchers alike, this book:
* Offers an introduction to the Arm architecture, covering both AArch32 and AArch64 instruction set states, as well as ELF file format internals
* Presents in-depth information on Arm assembly internals for reverse engineers analyzing malware and auditing software for security vulnerabilities, as well as for developers seeking detailed knowledge of the Arm assembly language
* Covers the A32/T32 and A64 instruction sets supported by the Armv8-A architecture with a detailed overview of the most common instructions and control flow patterns
* Introduces known reverse engineering tools used for static and dynamic binary analysis
* Describes the process of disassembling and debugging Arm binaries on Linux, and using common disassembly and debugging tools
Blue Fox: Arm Assembly Internals and Reverse Engineering is a vital resource for security researchers and reverse engineers who analyze software applications for Arm-based devices at the assembly level.
Page count: 656
Publication year: 2023
Cover
Title Page
Introduction
Notes
Part I: Arm Assembly Internals
Chapter 1: Introduction to Reverse Engineering
Introduction to Assembly
High‐Level Languages
Disassembling
Decompilation
Notes
Chapter 2: ELF File Format Internals
Program Structure
High‐Level vs. Low‐Level Languages
The Compilation Process
The ELF File Overview
The ELF File Header
ELF Program Headers
ELF Section Headers
The Dynamic Section and Dynamic Loading
Thread‐Local Storage
Notes
Chapter 3: OS Fundamentals
OS Architecture Overview
Process Memory Management
Notes
Chapter 4: The Arm Architecture
Architectures and Profiles
The Armv8‐A Architecture
The AArch64 Execution State
The AArch32 Execution State
Notes
Chapter 5: Data Processing Instructions
Shift and Rotate Operations
Logical Operations
Arithmetic Operations
Multiplication Operations
Division Operations
Move Operations
Notes
Chapter 6: Memory Access Instructions
Instructions Overview
Addressing Modes and Offset Forms
Load and Store Instructions
Notes
Chapter 7: Conditional Execution
Conditional Execution Overview
Conditional Codes
Conditional Instructions
Flag‐Setting Instructions
Conditional Select Instructions
Conditional Comparison Instructions
Notes
Chapter 8: Control Flow
Branch Instructions
Functions and Subroutines
Notes
Part II: Reverse Engineering
Chapter 9: Arm Environments
Arm Boards
Emulation with QEMU
Notes
Chapter 10: Static Analysis
Static Analysis Tools
Call‐By‐Reference Example
Control Flow Analysis
Analyzing an Algorithm
Notes
Chapter 11: Dynamic Analysis
Command‐Line Debugging
Remote Debugging
Debugging a Memory Corruption
Debugging a Process with GDB
Notes
Chapter 12: Reversing arm64 macOS Malware
Background
Hunting for Malicious arm64 Binaries
Analyzing arm64 Malware
Conclusion
Notes
Index
Copyright
Dedication
About the Authors
Acknowledgments
End User License Agreement
Chapter 1
Table 1.1: Addition and Subtraction Opcodes
Table 1.2: Mnemonics
Table 1.3: Manually Assigning the Machine Codes
Table 1.4: Programming the Machine Codes
Chapter 2
Table 2.1: Programming in Assembly Use Cases
Table 2.2: GCC Cross‐Compilers
Table 2.3: Needed Object Files and Their Purpose
Table 2.4: Arm 32‐Bit e_flags Values
Table 2.5: RELRO Options
Table 2.6: Symbol Types
Table 2.7: Mapping Symbols
Table 2.8: TLS Models
Table 2.9: Basic TLS Relocation Types for Arm ELF Files
Chapter 3
Table 3.1: Memory Protection Permissions
Table 3.2: Permission Attributes
Table 3.3: Entropy Comparison of ASLR Implementations
Chapter 4
Table 4.1: Vector Offsets from Vector Table Base Address
Table 4.2: A64 Special Registers
Table 4.3: J and T Bit Instruction Modes for the A32 and T32 States
Table 4.4: AArch32 Register Aliases
Table 4.5: J and T Bit Instruction Modes for the A32 and T32 States
Table 4.6: AArch32 Mode Bit Encodings
Chapter 5
Table 5.1: Syntax Symbols
Table 5.2: Shift and Rotate Instructions: Immediate Form
Table 5.3: Shift and Rotate Instructions: Register Form
Table 5.4: Syntax Symbols
Table 5.5: A64 Bitfield Move Instructions
Table 5.6: A64 Bitfield Move Instruction Aliases
Table 5.7: A64 Extract Register Instruction Aliases
Table 5.8: A64 Extend Instructions
Table 5.9: A32 Bitfield Extend Forms
Table 5.10: A32 Sign‐ and Zero‐Extend Instructions
Table 5.11: Bitfield Extract and Insert Instructions
Table 5.12: A64 Bitfield Move Instructions
Table 5.13: Truth Table of AND Operations
Table 5.14: Bitwise AND Operations
Table 5.15: A64 Bitwise AND Instruction Aliases
Table 5.16: Bitwise Bit Clear Instruction Syntax
Table 5.17: Truth Table of OR Operations
Table 5.18: Bitwise OR Instruction Syntax
Table 5.19: Truth Table of NOT OR Operations
Table 5.20: Bitwise OR NOT Instruction Syntax
Table 5.21: Shifted Register Form of the Bitwise OR NOT Instruction
Table 5.22: Truth Table of Exclusive OR Operations
Table 5.23: Bitwise Exclusive OR Instruction Syntax
Table 5.24: Bitwise Exclusive OR NOT Instruction Syntax
Table 5.25: Syntax Symbols
Table 5.26: ADD and SUB Instruction Forms
Table 5.27: A32 RSB Instruction Forms
Table 5.28: Compare (CMP) Instruction Forms
Table 5.29: A64 Compare Negative (CMN) Instruction Forms and Aliases
Table 5.30: General Integer Multiply Instructions
Table 5.31: A64 Signed and Unsigned Multiply Instructions
Table 5.32: A32 Multiply Instructions
Table 5.33: A32 Least Significant Word Multiplications
Table 5.34: A32 Most Significant Word Multiplications
Table 5.35: A32 Halfword Multiplications
Table 5.36: A32 Signed Multiply Halfword Instructions
Table 5.37: A32 Signed Multiply Accumulate Halfword Instructions
Table 5.38: A32 Signed Multiply Accumulate Word by Halfword Instructions
Table 5.39: A32 Multiply Accumulate Word by Halfword Instructions
Table 5.40: A32 Signed Dual Multiply Add Instructions
Table 5.41: A32 Signed Dual Multiply Subtract Instructions
Table 5.42: A32 Signed Multiply Accumulate Dual Instructions
Table 5.43: A32 Signed Multiply Subtract Dual Instructions
Table 5.44: A32 Multiply Long Overview
Table 5.45: A32 Signed Multiply Long Instructions
Table 5.46: A32 Multiply Accumulate Long Instructions
Table 5.47: A32 Unsigned Multiply Accumulate Accumulate Long Instruction
Table 5.48: A32 Signed Multiply Accumulate Long Halfwords Instructions
Table 5.49: A32 Signed Multiply Accumulate Long Dual Instructions
Table 5.50: A32 Multiply Subtract Long Dual Instructions
Table 5.51: Divide Instructions Overview
Table 5.52: Syntax Symbols
Table 5.53: A32 Move Immediate Instructions
Table 5.54: A64 Move Immediate Instructions
Table 5.55: A32 and A64 Move Register Instructions
Table 5.56: Move with NOT Instruction Syntax
Chapter 6
Table 6.1: Syntax Symbols
Table 6.2: Addressing Mode Summary
Table 6.3: A32 Single Register Addressing Modes and Offset Forms
Table 6.4: A64 Single Register Addressing Modes and Offset Forms
Table 6.5: Offset Addressing Mode with Offset Forms
Table 6.6: A32 Immediate Offset Ranges
Table 6.7: A64 Scaled Immediate Offset Ranges
Table 6.8: A64 Unscaled Immediate Offset Ranges
Table 6.9: A64 Scaled and Unscaled Offset Instructions
Table 6.10: Register Offset Forms
Table 6.11: Pre‐Indexed Mode Syntax
Table 6.12: Examples of Pre‐Indexed Addressing
Table 6.13: Post‐Indexed Mode Syntax
Table 6.14: Examples of Post‐Indexed Addressing
Table 6.15: LDR Literal Pool Locality Requirements
Table 6.16: A32 Load/Store Word or Doubleword
Table 6.17: A64 Load/Store Word or Doubleword
Table 6.18: A64 Load Signed Word
Table 6.19: A32 and A64 Load/Store Halfword Examples
Table 6.20: A32 and A64 Load/Store Byte Examples
Table 6.21: A32 LDM/STM Syntax
Table 6.22: A32 Load/Store Multiple Syntax
Table 6.23: A32 Equivalents
Table 6.24: A32 PUSH and POP Syntax
Table 6.25: A64 Load/Store Instruction Types and Their Offset Forms
Table 6.26: A64 Load/Store Pair Addressing and Offset
Table 6.27: A64 LDP/STP Instruction Syntax
Table 6.28: A64 LDPSW Instruction Syntax
Chapter 7
Table 7.1: Condition Codes
Table 7.2: Condition Codes and Their Inverse
Table 7.3: Test and Comparison Instructions
Table 7.4: CMP Instruction Forms
Table 7.5: TST Instruction Forms
Table 7.6: TST Instruction Forms
Table 7.7: Conditional Select Group Instruction Behavior
Chapter 8
Table 8.1: Immediate Branches
Table 8.2: Register Branches
Table 8.3: Conditional Branch Instructions
Table 8.4: Conditional Branches for Signed and Unsigned Numbers
Table 8.5: If‐Else Assembly Examples
Table 8.6: While Loop Assembly Examples
Table 8.7: Do‐While Loop Assembly Examples
Table 8.8: For Loop Assembly Examples
Table 8.9: Compare and Branch Instructions
Table 8.10: A64 Test and Branch Instructions
Table 8.11: T32‐Only Conditional Branches
Table 8.12: Encoding Table for Data‐Processing Instruction Groups
Table 8.13: A32 Branch and Exchange Instructions
Table 8.14: Subroutine Call Instructions
Table 8.15: ABI Standards
Table 8.16: A64 General‐Purpose Registers and AAPCS64 Usage
Table 8.17: A32 General‐Purpose Registers and AAPCS32 Usage
Table 8.18: Volatile and Nonvolatile Registers
Table 8.19: Byte Size of Integral Data Types
Chapter 10
Table 10.1: ASCII Table
Table 10.2: ASCII Table
Table 10.3: ASCII Table
Table 10.4: ASCII Table
Chapter 11
Table 11.1: Essential GDB Commands
Table 11.2: Useful GEF Commands
Table 11.3: Radare2 Command‐Line Utilities
Table 11.4: Radare2 Shortcuts for Visual Mode
Chapter 12
Table 12.1: Search Modifiers
Chapter 1
Figure 1.1: Letters A, R, and M and their hexadecimal values
Figure 1.2: Hexadecimal ASCII values and their 8‐bit binary equivalents
Figure 1.3: 16‐bit Thumb encoding of ADD and SUB immediate instruction
Figure 1.4: LDR instruction loading a value from the address in R2 to regist...
Figure 1.5: Program flow of an example assembly program
Figure 1.6: Illustration of ADR and LDR instruction logic
Figure 1.7: Source code of file_record function in the ihex2fw.c source file...
Figure 1.8: IDA 7.6 decompilation output of the compiled file_record functio...
Figure 1.9: Ghidra 10.0.4. decompilation output of the compiled file_record ...
Chapter 2
Figure 2.1: Overview of compilation
Figure 2.2: Thread‐local versus global variables
Figure 2.3: Runtime mechanism for thread‐local storage
Chapter 3
Figure 3.1: The command htop
Figure 3.2: atop output
Figure 3.3: Calling a function inside the libc library
Figure 3.4: The command htop -u root
Figure 3.5: The ps command
Figure 3.6: Resolving handles
Figure 3.7: Threads running
Figure 3.8: Three user‐mode threads in progress
Figure 3.9: Stack implementations
Chapter 4
Figure 4.1: Exception levels illustrated with “secure” state and “non‐secure...
Figure 4.2: Illustration of exception level components on a TrustZone enable...
Figure 4.3: Illustration of SVC, HVC, and SMC calls in their respective exce...
Figure 4.4: Illustration of an SMC exception entry and return
Figure 4.5: Example illustration of 32‐bit and 64‐bit applications running o...
Figure 4.6: Xn and Wn register width
Figure 4.7: Vn register widths
Figure 4.8: PSTATE register components
Figure 4.9: Abstract view of instruction set state switches
Figure 4.10: Overview of AArch32 registers in their respective modes
Figure 4.11: Abstract overview of CPSR bits and their meaning
Figure 4.12: The APSR components of the CPSR
Figure 4.13: Instruction set state bits of the CPSR
Figure 4.14: IT bit locations in the CPSR
Figure 4.15: Endianness bit location in the CPSR
Figure 4.16: Mode bits in the CPSR
Figure 4.17: Exception mask bits in the CPSR
Chapter 5
Figure 5.1: ADD instruction
Figure 5.2: Logical shift left operation
Figure 5.3: Logical shift right operation
Figure 5.4: Arithmetic shift right operation
Figure 5.5: Rotate right operation
Figure 5.6: Rotate right with extend
Figure 5.7: An SBFM instruction
Figure 5.8: Shifting the value by 3 bits
Figure 5.9: Unsigned bitfield move (UBFM) instruction
Figure 5.10: LSR operation
Figure 5.11: Example extract operation
Figure 5.12: Illustration of EXTR instruction in line 7
Figure 5.13: ROR instruction
Figure 5.14: 8‐to‐32‐bit sign extension via SXTB instruction
Figure 5.15: Top 16 bits of the result cleared
Figure 5.16: Difference between UXTW and SXTW
Figure 5.17: ADD instruction with UXTB operand
Figure 5.18: Bitfield insert (BFI)
Figure 5.19: Bitfield insert and extract instructions
Figure 5.20: The MUL instruction
Figure 5.21: The multiply and accumulate (MLA) instruction
Figure 5.22: The multiply and subtract (MLS) instruction
Figure 5.23: The SMMUL instruction
Figure 5.24: The SMMLA instruction
Figure 5.25: The SMMLS instruction
Figure 5.26: Signed multiply halfword group of instructions SMULBB, SMULBT, ...
Figure 5.27: Signed multiply accumulate halfword group of instructions SMLAB...
Figure 5.28: Signed multiply word by halfword instruction
Figure 5.29: Signed multiply accumulate word by halfword instruction group
Figure 5.30: The SMUAD instruction
Figure 5.31: The SMUSD instruction
Figure 5.32: The signed multiply accumulate dual (SMLAD) instruction
Figure 5.33: The signed multiply subtract dual (SMLSD) instruction
Figure 5.34: The signed multiply long (SMULL) instruction
Figure 5.35: Result split between r5 and r6
Figure 5.36: A32/T32 multiply accumulate long instruction
Figure 5.37: UMAAL instruction
Figure 5.38: The signed multiply accumulate long halfwords instruction group...
Figure 5.39: SMLALD
Figure 5.40: The signed multiply subtract long dual instruction
Figure 5.41: Move instructions
Chapter 6
Figure 6.1: LDR instruction
Figure 6.2: STR instruction
Figure 6.3: A32 LDR immediate instruction encoding
Figure 6.4: A32 LDRH immediate instruction encoding
Figure 6.5: A64 LDR immediate instruction encoding
Figure 6.6: A64 LDUR immediate instruction encoding
Figure 6.7: A32 LDR pre‐indexed addressing illustration
Figure 6.8: A32 post‐indexed addressing illustration
Figure 6.9: MOV encoding with #384 immediate
Figure 6.10: MOV encoding with #370 immediate
Figure 6.11: PC‐relative offset illustration
Figure 6.12: Available addressing modes and offset forms for A32/T32 load an...
Figure 6.13: Available addressing modes and offset forms available for speci...
Figure 6.14: Assembly string illustration
Figure 6.15: Replacing X with zero using STRB instruction
Figure 6.16: Illustration of previous STR example
Figure 6.17: STR and STM instruction logic
Figure 6.18: STM instruction example
Figure 6.19: STM instruction example with SP update
Figure 6.20: LDMIA and LDMIB instruction example
Figure 6.21: LDMDA and LDMDB instruction example
Figure 6.22: LDM and STM equivalent forms of PUSH and POP
Figure 6.23: A64 STP base and base with offset example
Figure 6.24: A64 STP with post‐ and pre‐indexed addressing
Figure 6.25: A64 LDP 32‐bit variant
Figure 6.26: A64 LDPSW illustration
Chapter 7
Figure 7.1: Condition flag bits in PSTATE
Figure 7.2: Carry over illustration
Figure 7.3: Signed overflow illustration with truth table
Figure 7.4: ITSTATE bits in PSTATE
Figure 7.5: Instructions with S suffix
Figure 7.6: How flags are updated based on ADDS example
Figure 7.7: Signed overflow illustration
Figure 7.8: PSTATE flags set based on an LSLS instruction example
Figure 7.9: CMP logic with SUBS equivalent
Figure 7.10: CMN logic with ADDS equivalent
Figure 7.11: TST logic with ANDS equivalent
Figure 7.12: Illustration of TST and MOVNE instruction behavior
Figure 7.13: TST instruction components
Figure 7.14: NE condition code in the context of TST
Figure 7.15: TEQ instruction logic
Figure 7.16: Semantic meaning of CMP instruction
Figure 7.17: Semantic meaning of EQ after CMP
Figure 7.18: CSEL meaning
Figure 7.19: Final result of CMP and CSEL instruction
Figure 7.20: Decision tree of Boolean statement
Figure 7.21: EQ and NE condition
Figure 7.22: Illustration of instruction logic
Figure 7.23: Instruction logic based on LT and GE conditions
Figure 7.24: Decision tree using Boolean‐or connector
Figure 7.25: CMP decision tree
Figure 7.26: CCMP condition
Figure 7.27: CCMP instruction with nzcv value
Chapter 8
Figure 8.1: Conditional branch example
Figure 8.2: Before and after
Figure 8.3: Instruction encoding
Figure 8.4: Instruction encoding component
Figure 8.5: T32 vs. A32 instruction encoding translation
Figure 8.6: Switch to Thumb
Figure 8.7: Subroutine call via BL instruction (A32)
Figure 8.8: Subroutine call via BLX instruction (A32)
Figure 8.9: A64 subroutine branch
Figure 8.10: Subroutine call with arguments
Figure 8.11: Argument registers for two 64‐bit integers
Figure 8.12: Argument registers for three 32‐bit integers and one 64‐bit int...
Figure 8.13: Setting up arguments in assembly using MOV and LDRD instruction...
Figure 8.14: Storing doubleword from registers r3 and r4 using the STRD inst...
Figure 8.15: Leaf function return via branch to LR
Figure 8.16: Nonleaf function call preserving LR value
Figure 8.17: Function prologue illustration
Figure 8.18: Stack frame adjustment
Figure 8.19: Function epilogue
Figure 8.20: Screencap of Godbolt.org
Chapter 9
Figure 9.1: Admin interface
Chapter 10
Figure 10.1: Main function view in Binary Ninja
Figure 10.2: Binary Ninja display options
Figure 10.3: Displaying strings
Figure 10.4: Triage feature
Figure 10.5: Low Level IL
Figure 10.6: Medium‐Level IL
Figure 10.7: High Level IL
Figure 10.8: Conditional flow graph view
Figure 10.9: Medium IL graph view
Figure 10.10: Initialization of variables a and b
Figure 10.11: Argument preparation for the swap function
Figure 10.12: Swap function being called and arguments stored at their dedic...
Figure 10.13: Variable stack locations
Figure 10.14: Dereferencing pointers and setting *pa value to *pb value
Figure 10.15: Memory addresses for variable a and b now contain the same val...
Figure 10.16: Set the value pointed to by pb to the value of t.
Figure 10.17: Changes to contents of the variable locations
Figure 10.18: Argument preparation for the printf call
Figure 10.19: Disassembly output of the Main function
Figure 10.20: Renaming local variables in IDA Pro
Figure 10.21: Stack layout
Figure 10.22: Start of the decimal2Hexadecimal function
Figure 10.23: Calculating remainder value via loaded quotient value
Figure 10.24: Instructions and register values
Figure 10.25: Conditional branch based on If‐Else statement
Figure 10.26: if statement
Figure 10.27: Stack
Figure 10.28: If statement in disassembly
Figure 10.29: Storing the first result into the array
Figure 10.30: Dividing the quotient
Figure 10.31: Dividing the quotient; disassembly breakdown
Figure 10.32: Checking the condition for the while loop
Figure 10.33: Current stack layout
Figure 10.34: for loop
Figure 10.35: Printing the character of the hexadecimalnum array
Figure 10.36: Disassembly breakdown
Figure 10.37: Order of elements
Figure 10.38: Print line
Figure 10.39: Disassembly view of the main function
Figure 10.40: Conditional branch based on the return value of algoFunc
Figure 10.41: Control flow graph of the algoFunc function
Figure 10.42: Local variable labels assigned by IDA Pro
Figure 10.43: loc_8A0
Figure 10.44: Branching to the loc_8A8 instruction block
Figure 10.45: SCVTF instruction
Figure 10.46: Instruction block to compute the sqrt
Figure 10.47: Other instances where the var_4 value is being accessed
Figure 10.48: Logic
Figure 10.49: loc_8C4 instruction block with surrounding context
Figure 10.50: Logic
Chapter 11
Figure 11.1: GDB multiarch split display view
Figure 11.2: GEF view when breakpoint hits
Figure 11.3: Examine memory command breakdown
Figure 11.4: Examine two giant words in hexadecimal.
Figure 11.5: Examine 10 bytes in hexadecimal.
Figure 11.6: GEF memory watch command
Figure 11.7: Memory watch of the GOT region
Figure 11.8: Radare2 interactive view
Figure 11.9: Radare2 debugging session view
Figure 11.10: Radare2 control flow view
Figure 11.11: Selecting debugger type in IDA Pro
Figure 11.12: IDA Pro debugging options
Figure 11.13: IDA Pro debugging view
Figure 11.14: Stack view of buffer and return values
Figure 11.15: Buffer overflown and return value overwritten with address of
Chapter 12
Figure 12.1: Anti‐Virus Detections drop for an arm64‐version of a malicious ...
Figure 12.2: Previous search modifiers include benign results.
Figure 12.3: A modified search query returns results with more than two posi...
Figure 12.4: Finding arm64 macOS malware “GoSearch22” on VirusTotal
Figure 12.5: VirusTotal results for Bundlore adware
Figure 12.6: A “required” update seeks to trick users into infecting themsel...
Figure 12.7: The malware contains no calls to the ptrace user‐mode API.
Figure 12.8: Searching for the svc instruction
Figure 12.9: Objc_msgSend arguments and descriptions from Apple's documentat...
Maria Markstedter
Let's address the elephant in the room: why “Blue Fox”?
This book was originally supposed to contain an overview of the Arm instruction set, chapters on reverse engineering, and chapters on exploit mitigation internals and bypass techniques. The publisher and I soon realized that covering these topics to a satisfactory extent would make this book about 1,000 pages long. For this reason, we decided to split it into two books: Blue Fox and Red Fox.
The Blue Fox edition covers the analyst view, teaching you everything you need to know to get started in reverse engineering. Without a solid understanding of the fundamentals, you can't move to more advanced topics such as vulnerability analysis and exploit development. The Red Fox edition will cover the offensive security view: understanding exploit mitigation internals, bypass techniques, and common vulnerability patterns.
As of this writing, the Arm architecture reference manual for the Armv8‐A architecture (and Armv9‐A extensions) contains 11,952 pages¹ and continues to expand. This reference manual was around 8,000 pages² long when I started writing this book two years ago.
Security researchers who are used to reverse engineering x86/64 binaries but want to adapt to the new era of Arm‐powered devices are having a hard time finding digestible resources on the Arm instruction set, especially in the context of reverse engineering or binary analysis. Arm's architecture reference manual can be both overwhelming and discouraging. In this day and age, nobody has time to read a 12,000‐page deeply technical document, let alone identify the most relevant or most commonly used instructions and memorize them. The truth is that you don't need to know every single Arm instruction to be able to reverse engineer an Arm binary. Many instructions have very specific use cases that you may or may not ever encounter during your analysis.
The purpose of this book is to make it easier for people to get familiar with the Arm instruction set and gain enough knowledge to apply it in their professional lives. I spent countless hours dissecting the Arm reference manual and categorizing the most common instruction types and their syntax patterns so you don't have to. But this book isn't a list of the most common Arm instructions. It contains explanations you won't find anywhere else, not even in the Arm manual itself. The basic descriptions of a given instruction in the Arm manual are rather brief. That is fine for trivial instructions like MOV or ADD. However, many common instructions perform complex operations that are difficult to understand from their descriptions alone. For this reason, many of the instructions you will encounter in this book are accompanied by graphical illustrations explaining what is actually happening under the hood.
If you're a beginner in reverse engineering, it is important to understand the binary's file format, its sections, how it compiles from source code into machine code, and the environment it depends on. Because of limited space and time, this book cannot cover every file format and operating system. It instead focuses on Linux environments and the ELF file format. The good news is, regardless of platform or file format, Arm instructions are Arm instructions. Even if you reverse engineer an Arm binary compiled for macOS or Windows, the meaning of the instructions themselves remains the same.
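To make the ELF focus a little more concrete, here is a minimal Python sketch (not taken from the book) that reads the first bytes of an ELF header and reports whether the file targets 32‐bit or 64‐bit Arm. The function name `describe_elf_header` and the hand‐built sample header are assumptions for illustration only; the field offsets and the `e_machine` values 40 (EM_ARM) and 183 (EM_AARCH64) follow the ELF specification.

```python
import struct

# e_machine values defined by the ELF specification
EM_ARM = 40        # 32-bit Arm (AArch32)
EM_AARCH64 = 183   # 64-bit Arm (AArch64)


def describe_elf_header(data: bytes) -> dict:
    """Decode the few ELF header fields needed to recognize an Arm binary.

    Hypothetical helper for illustration; it only inspects the first 20 bytes.
    """
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")

    ei_class = data[4]  # 1 = 32-bit object, 2 = 64-bit object
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or data encoding")

    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are 16-bit fields directly after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)

    machine_names = {EM_ARM: "Arm (AArch32)", EM_AARCH64: "AArch64"}
    return {
        "bits": 32 if ei_class == 1 else 64,
        "endianness": "little" if ei_data == 1 else "big",
        "type": e_type,
        "machine": machine_names.get(e_machine, f"other ({e_machine})"),
    }


# Hand-built header prefix: 64-bit, little-endian, e_type=3 (ET_DYN), AArch64.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, EM_AARCH64)
info = describe_elf_header(sample)
print(info["bits"], info["endianness"], info["machine"])
```

On a real system you could pass the first 20 bytes of a binary read with `open(path, "rb")` instead of the synthetic `sample`.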
This book begins with an introduction explaining what instructions are and where they come from. In the second chapter, you will learn about the ELF file format and its sections, along with a basic overview of the compilation process. Since binary analysis would be incomplete without understanding the context binaries are executed in, the third chapter provides an overview of operating system fundamentals.
With this background knowledge, you are well prepared to delve into the Arm architecture in Chapter 4. You can find the most common data processing instructions in Chapter 5, followed by an overview of memory access instructions in Chapter 6. These instructions are a significant part of the Arm architecture, which is also referred to as a Load/Store architecture. Chapters 7 and 8 discuss conditional execution and control flow, which are crucial components of reverse engineering.
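As a rough illustration of what "Load/Store architecture" means, the following toy Python model (an assumption‐laden sketch, not real Arm semantics and not from the book) mimics the sequence LDR, ADD, STR: arithmetic touches only registers, so a value in memory has to be loaded into a register, modified there, and stored back. The helper names `ldr`, `add_imm`, and `str_`, the register names, and the address 0x1000 are all hypothetical.

```python
# Toy model: a register file and a sparse, word-addressed memory.
registers = {"r0": 0, "r2": 0x1000}
memory = {0x1000: 41}


def ldr(rt: str, rn: str) -> None:
    """Model of LDR rt, [rn]: load the word at the address held in rn into rt."""
    registers[rt] = memory[registers[rn]]


def add_imm(rd: str, rn: str, imm: int) -> None:
    """Model of ADD rd, rn, #imm: operates on registers only, wrapping at 32 bits."""
    registers[rd] = (registers[rn] + imm) & 0xFFFFFFFF


def str_(rt: str, rn: str) -> None:
    """Model of STR rt, [rn]: store the value of rt at the address held in rn."""
    memory[registers[rn]] = registers[rt]


# Incrementing a value in memory takes three steps: load, modify, store.
ldr("r0", "r2")
add_imm("r0", "r0", 1)
str_("r0", "r2")
print(hex(registers["r2"]), memory[0x1000])
```

The point of the sketch is only the shape of the sequence: there is no single step that adds to memory directly.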
Chapter 9 is where it starts to get particularly interesting for reverse engineers. Knowing the different types of Arm environments is crucial, especially when you perform dynamic analysis and need to analyze binaries during execution.
With the information provided so far, you are already well equipped for your next reverse engineering adventure. To get you started, Chapter 10 includes an overview of the most common static analysis tools, followed by small practical static analysis examples you can follow step‐by‐step.
Reverse engineering would be boring without dynamic analysis to observe how a program behaves during execution. In Chapter 11, you will learn about the most common dynamic analysis tools as well as examples of useful commands you can use during your analysis. This chapter concludes with two practical debugging examples: debugging a memory corruption vulnerability and debugging a process in GDB.
Reverse engineering has a variety of use cases. You can use your knowledge of the Arm instruction set and reverse engineering techniques to expand your skill set into different areas, such as vulnerability analysis or malware analysis.
Reverse engineering is an invaluable skill for malware analysts, but they also need to be familiar with the environment a given malware sample was compiled for. To get you started in this area, this book includes a chapter on analyzing arm64 macOS malware (Chapter 12) written by Patrick Wardle, who is also the author of The Art of Mac Malware.³ Unlike previous chapters, this chapter does not focus on Arm assembly. Instead, it introduces you to common anti‐analysis techniques that macOS malware uses to avoid being analyzed. The purpose of this chapter is to provide an introduction to macOS malware compatible with Apple Silicon (M1/M2) so that anyone interested in hunting and analyzing Arm‐based macOS malware can get a head start.
This book took a little over two years to write. I began writing in March 2020, when the pandemic hit and put us all in quarantine. Two years and a lot of sweat and tears later, I'm happy to finally see it come to life. Thank you for putting your faith in me. I hope that this book will serve as a useful guide as you embark on your reverse engineering journey and that it will make the process smoother and less intimidating.
1. (version I.a.) https://developer.arm.com/documentation/ddi0487/latest
2. (version F.a.) https://developer.arm.com/documentation/ddi0487/latest
3. https://taomm.org
If you've just picked up this book from the shelf, you're probably interested in learning how to reverse engineer compiled Arm binaries because major tech vendors are now embracing the Arm architecture. Perhaps you're a seasoned veteran of x86‐64 reverse engineering but want to stay ahead of the curve and learn more about the architecture that is starting to take over the processor market. Perhaps you're looking to get started on security analysis to find vulnerabilities in Arm‐based software or analyze Arm‐based malware. Or perhaps you're just getting started in reverse engineering and have hit a point where a deeper level of detail is required to achieve your goal.
Wherever you are on your journey into the Arm‐based universe of reverse engineering, this book is about preparing you, the reader, to understand the language of Arm binaries, showing you how to analyze them, and, more importantly, preparing you for the future of Arm devices.
Learning assembly language and how to analyze compiled software is useful in a wide variety of applications. As with every skill, learning the syntax can seem difficult and complicated at first, but it eventually becomes easier with practice.
In the first part of this book, we'll look at the fundamentals of Arm's main Cortex‐A architecture, specifically the Armv8‐A, and the main instructions you'll encounter when reverse engineering software compiled for this platform. In the second part of the book, we'll look at some common tools and techniques for reverse engineering. To give you inspiration for different applications of Arm‐based reverse engineering, we will look at practical examples, including how to analyze malware compiled for Apple's M1 chip.