TLS is the most widely used cryptographic protocol today, enabling e-commerce, online banking, and secure online communication. Written by Dr. Paul Duplys, Security, Privacy & Safety Research Lead at Bosch, and Dr. Roland Schmitz, Internet Security Professor at Stuttgart Media University, this book will help you gain a deep understanding of how and why TLS works, how past attacks on TLS were possible, and how vulnerabilities that enabled them were addressed in the latest TLS version 1.3. By exploring the inner workings of TLS, you’ll be able to configure it and use it more securely.
Starting with the basic concepts, you’ll be led step by step through the world of modern cryptography, guided by the TLS protocol. As you advance, you’ll learn the necessary mathematical concepts from scratch. Topics such as public-key cryptography based on elliptic curves will be explained with a view to their real-world application in TLS. With easy-to-understand concepts, you’ll find out how secret keys are generated and exchanged in TLS, and how they are used to create a secure channel between a client and a server.
By the end of this book, you’ll have the knowledge to configure TLS servers securely. Moreover, you’ll have gained a deep understanding of the cryptographic primitives that make up TLS.
TLS Cryptography In-Depth
Contributors
About the authors
About the reviewers
Preface
Who is this book for
What this book covers
To get the most out of this book
Conventions used
Get in touch
Share Your Thoughts
Download a free PDF copy of this book
Part I Getting Started
Chapter 1: The Role of Cryptography in the Connected World
1.1 Evolution of cryptography
1.2 The advent of TLS and the internet
1.3 Increasing connectivity
1.4 Increasing complexity
1.5 Example attacks
1.6 Summary
Chapter 2: Secure Channel and the CIA Triad
2.1 Technical requirements
2.2 Preliminaries
2.3 Confidentiality
2.4 Integrity
2.5 Authentication
2.6 Secure channels and the CIA triad
2.7 Summary
Chapter 3: A Secret to Share
3.1 Secret keys and Kerckhoffs’s principle
3.2 Cryptographic keys
3.3 Key space
3.4 Key length
3.5 Crypto-agility and information half-life
3.6 Key establishment
3.7 Randomness and entropy
3.8 Summary
Chapter 4: Encryption and Decryption
4.1 Preliminaries
4.2 Symmetric cryptosystems
4.3 Information-theoretical security (perfect secrecy)
4.4 Computational security
4.5 Pseudorandomness
4.6 Summary
Chapter 5: Entity Authentication
5.1 The identity concept
5.2 Authorization and authenticated key establishment
5.3 Message authentication versus entity authentication
5.4 Password-based authentication
5.5 Challenge-response protocols
5.6 Summary
Chapter 6: Transport Layer Security at a Glance
6.1 Birth of the World Wide Web
6.2 Early web browsers
6.3 From SSL to TLS
6.4 TLS overview
6.5 TLS version 1.2
6.6 TLS version 1.3
6.7 Major differences between TLS versions 1.3 and 1.2
6.8 Summary
Part II Shaking Hands
Chapter 7: Public-Key Cryptography
7.1 Preliminaries
7.2 Groups
7.3 The Diffie-Hellman key-exchange protocol
7.4 Security of Diffie-Hellman key exchange
7.5 The ElGamal encryption scheme
7.6 Finite fields
7.7 The RSA algorithm
7.8 Security of the RSA algorithm
7.9 Authenticated key agreement
7.10 Public-key cryptography in TLS 1.3
7.11 Hybrid cryptosystems
7.12 Summary
Chapter 8: Elliptic Curves
8.1 What are elliptic curves?
8.2 Elliptic curves as abelian groups
8.3 Elliptic curves over finite fields
8.4 Security of elliptic curves
8.5 Elliptic curves in TLS 1.3
8.6 Summary
Chapter 9: Digital Signatures
9.1 General considerations
9.2 RSA-based signatures
9.3 Digital signatures based on discrete logarithms
9.4 Digital signatures in TLS 1.3
9.5 Summary
Chapter 10: Digital Certificates and Certification Authorities
10.1 What is a digital certificate?
10.2 X.509 certificates
10.3 Main components of a public-key infrastructure
10.4 Rogue CAs
10.5 Digital certificates in TLS
10.6 Summary
Chapter 11: Hash Functions and Message Authentication Codes
11.1 The need for authenticity and integrity
11.2 What cryptographic guarantees does encryption provide?
11.3 One-way functions
11.4 Hash functions
11.5 Message authentication codes
11.6 MAC versus CRC
11.7 Hash functions in TLS 1.3
11.8 Summary
Chapter 12: Secrets and Keys in TLS 1.3
12.1 Key establishment in TLS 1.3
12.2 TLS secrets
12.3 KDFs in TLS
12.4 Updating TLS secrets
12.5 TLS keys
12.6 TLS key exchange messages
12.7 Summary
Chapter 13: TLS Handshake Protocol Revisited
13.1 TLS client state machine
13.2 TLS server state machine
13.3 Finished message
13.4 Early data
13.5 Post-handshake messages
13.6 OpenSSL s_client
13.7 Summary
Part III Off the Record
Chapter 14: Block Ciphers and Their Modes of Operation
14.1 The big picture
14.2 General principles
14.3 The AES block cipher
14.4 Modes of operation
14.5 Block ciphers in TLS 1.3
14.6 Summary
Chapter 15: Authenticated Encryption
15.1 Preliminaries
15.2 Authenticated encryption – generic composition
15.3 Security of generic composition
15.4 Authenticated ciphers
15.5 Counter with cipher block chaining message authentication code (CCM)
15.6 AEAD in TLS 1.3
15.7 Summary
Chapter 16: The Galois Counter Mode
16.1 Preliminaries
16.2 GCM security
16.3 GCM performance
16.4 Summary
Chapter 17: TLS Record Protocol Revisited
17.1 TLS Record protocol
17.2 TLS record layer
17.3 TLS record payload protection
17.4 Per-record nonce
17.5 Record padding
17.6 Limits on key usage
17.7 An experiment with the OpenSSL s_client
17.8 Summary
Chapter 18: TLS Cipher Suites
18.1 Symmetric cipher suites in TLS 1.3
18.2 Long-term security
18.3 ChaCha20
18.4 Poly1305
18.5 ChaCha20-Poly1305 AEAD construction
18.6 Mandatory-to-implement cipher suites
18.7 Summary
Part IV Bleeding Hearts and Biting Poodles
Chapter 19: Attacks on Cryptography
19.1 Preliminary remarks
19.2 Passive versus active attacks
19.3 Local versus remote attacks
19.4 Interactive versus non-interactive attacks
19.5 Attacks on cryptographic protocols
19.6 Attacks on encryption schemes
19.7 Attacks on hash functions
19.8 Summary
Chapter 20: Attacks on the TLS Handshake Protocol
20.1 Downgrade attacks
20.2 Logjam
20.3 SLOTH
20.4 Padding oracle attacks on TLS handshake
20.5 Bleichenbacher attack
20.6 Improvements of Bleichenbacher’s attack
20.7 Insecure renegotiation
20.8 Triple Handshake attack
20.9 Summary
Chapter 21: Attacks on the TLS Record Protocol
21.1 Lucky 13
21.2 POODLE
21.3 BEAST
21.4 Sweet32
21.5 Compression-based attacks
21.6 Summary
Chapter 22: Attacks on TLS Implementations
22.1 SMACK
22.2 FREAK
22.3 Truncation attacks
22.4 Heartbleed
22.5 Insecure encryption activation
22.6 Random number generation
22.7 BERserk attack
22.8 Cloudbleed
22.9 Timing attacks
22.10 Summary
Bibliography
Index
Other Books You Might Enjoy
Packt is searching for authors like you
Share Your Thoughts
Download a free PDF copy of this book
TLS Cryptography In-Depth
Explore the intricacies of modern cryptography and the inner workings of TLS
Dr. Paul Duplys
Dr. Roland Schmitz
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Pavan Ramchandani
Publishing Product Manager: Neha Sharma
Senior Editor: Arun Nadar
Technical Editor: Arjun Varma
Copy Editor: Safis Editing
Project Coordinator: Ashwini Gowda
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Vijay Kamble
Marketing Coordinator: Marylou De Mello
First published: December 2023
Production reference: 203012024
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square, Birmingham, B3 1RB, UK
ISBN 978-1-80461-195-1
www.packtpub.com
Dr. Paul Duplys is chief expert for cybersecurity at the department for technical strategies and enabling within the Mobility sector of Robert Bosch GmbH, a Tier-1 automotive supplier and manufacturer of industrial, residential, and consumer goods. Prior to this position, he spent over 12 years with Bosch Corporate Research, where he led the security and privacy research program and conducted applied research in various fields of information security. Paul’s research interests include security automation, software security, security economics, software engineering, and AI. Paul holds a PhD degree in computer science from the University of Tübingen, Germany.
Dr. Roland Schmitz has been a professor of internet security at the Stuttgart Media University (HdM) since 2001. Prior to joining HdM, from 1995 to 2001, he worked as a research engineer at Deutsche Telekom, with a focus on mobile security and digital signature standardization. At HdM, Roland teaches courses on internet security, system security, security engineering, digital rights management, theoretical computer science, discrete mathematics, and game physics. He has published numerous scientific papers in the fields of internet and multimedia security. Moreover, he has authored and co-authored several books. Roland holds a PhD degree in mathematics from the Technical University of Braunschweig, Germany.
Writing this book has been an amazing journey, and it is our great pleasure to thank the people who have accompanied and supported us during this time: our project coordinators at Packt Publishing, Ashwini Gowda and Arun Nadar, and our meticulous technical reviewers, Andreas Bartelt, Simon Greiner, and Christos Grecos.
Andreas Bartelt holds a diploma in computer science (bioinformatics) and has specialized in cybersecurity topics for more than 20 years. He works at Robert Bosch GmbH as an expert in the field of cryptography and secures the deployment of cryptographic protocols such as TLS, IPsec, and SSH. Andreas is also a BSD enthusiast and has acquired extensive practical experience in securing POSIX-based operating systems (e.g., Linux) as well as their secure integration with hypervisors (e.g., Xen).
Dr. Christos Grecos (SM IEEE 2006, FSPIE 2023) is the chair and professor of the CS department at Arkansas State University. He was vice dean of PG research at NCI Ireland, chair of the CS department at Central Washington University (US), and dean of FCIT at Sohar University (Oman). He also has 13 years of experience in the UK as a professor, head of school, and associate dean for research. His research interests include image/video compression standards, processing and analysis, networking, and computer vision. He is on the editorial board of many international journals and has been invited to give talks at various international conferences. He has obtained significant funding for his research from several agencies, such as the UK EPSRC, UK TSB, the EU, and the Irish HEC.
Simon Greiner has been working as an automotive security expert for Robert Bosch GmbH, one of the largest automotive suppliers, for more than five years. He works in the lead engineering team for security and supports projects on different topics regarding security engineering, such as threat and risk analysis, security concepts, testing, and implementation security. He also supports pre-development projects on security topics, mainly in the context of autonomous driving. Before joining Robert Bosch GmbH, Simon obtained his PhD in computer science with a specialization in information security from the Karlsruhe Institute of Technology, Germany.
In this part, we set the scene for the Transport Layer Security (TLS) protocol. After discussing the history of the internet and TLS, we introduce the three basic security services provided by TLS, namely, confidentiality, integrity and authenticity, and give a first, high-level overview of TLS.
More specifically, we look at the role of cryptography in the modern connected world and highlight the reasons why Secure Sockets Layer (SSL), a predecessor of TLS, was invented in the early 1990s. Next, we explain why connectivity and complexity are the main drivers of cybersecurity and, in turn, cryptography in the modern connected world. We then introduce two cryptographic concepts: the secure channel and the CIA triad. Next, we show what cryptographic keys are, how the confidentiality of information transmitted between two parties can be protected using encryption and decryption, and how these parties can ensure that they are actually talking to each other rather than to an attacker. Finally, we give a high-level overview of the TLS protocol to illustrate how the theoretical concepts of cryptography are applied in TLS.
This part contains the following chapters:
Chapter 1, The Role of Cryptography in the Connected World
Chapter 2, Secure Channel and the CIA Triad
Chapter 3, A Secret to Share
Chapter 4, Encryption and Decryption
Chapter 5, Entity Authentication
Chapter 6, Transport Layer Security at a Glance
In this introductory chapter, we try to provide some answers to the following questions:
Why are there so many insecure IT systems?
How can cryptography help to mitigate our security problems?
Our core argument is that the simultaneous growth of connectivity and complexity of IT systems has led to an explosion of the attack surface, and that modern cryptography plays an important role in reducing that attack surface.
After a brief discussion of how the field of cryptography evolved from an exotic field appreciated by a select few to an absolutely critical skill for the design and operation of nearly every modern IT product, we will look at some recent real-world security incidents and attacks in order to illustrate our claim. This will allow you to understand why cryptography matters from a higher strategic perspective.
Over the past four decades or so, cryptography has evolved from an exotic field known to a select few into a fundamental skill for the design and operation of modern IT systems. Today, nearly every modern product, from the bank card in your pocket to the server farm running your favorite cloud services, requires some form of cryptography to protect it and its users against cyberattacks. Consequently, it has found its way into mainstream computer science and software engineering.
Figure 1.1: Number of publications at IACR conferences on cryptology over the years
Cryptography and its counterpart cryptanalysis were basically unknown outside of military and intelligence services until the mid-1970s. According to [172], Cryptography is the practice and study of techniques for secure communication in the presence of adversaries; it deals with the development and application of cryptographic mechanisms. Cryptanalysis is the study of cryptographic mechanisms’ weaknesses, aimed at finding mathematical ways to render these mechanisms ineffective. Taken together, cryptography and cryptanalysis form what’s called cryptology.
In 1967, David Kahn, an American historian, journalist, and writer, published a book titled The Codebreakers – The Story of Secret Writing, which is considered to be the first extensive treatment and a comprehensive report of the history of cryptography and military intelligence from ancient Egypt to modern times [93]. Kahn’s book introduced cryptology to a broader audience. Its content was, however, necessarily restricted to symmetric cryptography. In symmetric cryptography, the sender and receiver of a message share a common secret key and use it for both encrypting and decrypting. The problem of how sender and receiver should exchange the secret in a secure way was considered out of scope.
This changed in 1976, when the seminal paper New Directions in Cryptography by Whitfield Diffie and Martin Hellman appeared in volume IT-22 of IEEE Transactions on Information Theory [49]. In that publication, Diffie and Hellman described a novel method for securely agreeing on a secret key over a public channel, based on the so-called discrete logarithm problem. Moreover, they suggested for the first time that the sender and receiver might use different keys for encrypting (the public key) and decrypting (the private key), thereby inventing the field of asymmetric cryptography.
Figure 1.2: From left to right: Ralph Merkle, Martin Hellman, Whitfield Diffie [69]
While there were scientific works on cryptography dating back to the early 1970s, the publication by Diffie and Hellman is the first publicly available paper in which the use of a private key and a corresponding public key is proposed. This paper is considered to be the start of cryptography in the public domain. In 2002, Diffie and Hellman suggested their algorithm should be called Diffie-Hellman-Merkle key exchange because of Ralph Merkle’s significant contribution to the invention of asymmetric cryptography [185].
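To make the idea concrete, here is a toy sketch of the key exchange described above, using textbook-sized numbers (p = 23, g = 5) that are far too small to be secure; real deployments, including TLS, use groups that are hundreds or thousands of bits long, or elliptic curves.

```python
# Toy Diffie-Hellman key exchange, for illustration only (insecure parameters).
import secrets

p = 23   # small public prime modulus (textbook example)
g = 5    # public generator of the multiplicative group modulo p

a = secrets.randbelow(p - 2) + 1   # Alice's private key
b = secrets.randbelow(p - 2) + 1   # Bob's private key

A = pow(g, a, p)   # Alice sends A = g^a mod p over the public channel
B = pow(g, b, p)   # Bob sends B = g^b mod p over the public channel

# Both parties compute the same shared secret without ever transmitting it;
# an eavesdropper would have to solve a discrete logarithm to recover it.
assert pow(B, a, p) == pow(A, b, p)
```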
In 1977, the three MIT mathematicians Ron Rivest, Adi Shamir, and Len Adleman took up the suggestion by Diffie and Hellman and published the first asymmetric encryption algorithm, the RSA algorithm [151], which is based on yet another well-known mathematical problem, the factoring problem for large integers.
Figure 1.3: From left to right: Adi Shamir, Ron Rivest, Len Adleman [152]
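As with the key exchange above, the RSA idea can be sketched with textbook-sized numbers. The primes below are purely illustrative; real RSA moduli are 2048 bits or more, and unpadded "textbook" RSA as shown here is not secure.

```python
# Textbook RSA with tiny primes, for illustration only (insecure parameters).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # Euler's totient of n: 3120
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (2753)

message = 65
ciphertext = pow(message, e, n)    # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)  # decrypt with the private key (d, n)
assert recovered == message

# Recovering d from (e, n) alone would require factoring n into p and q.
```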
The invention of asymmetric cryptography did not make symmetric cryptography obsolete. On the contrary, both fields have complementary strengths and weaknesses and can be efficiently combined in what is today called hybrid cryptosystems. The Transport Layer Security (TLS) protocol is a very good example of a hybrid cryptosystem.
Today, cryptography is a well-known (albeit rarely understood in depth) topic in the IT community and an integral part of software development. As an example, as of July 2022, the OpenSSL library repository on GitHub contains over 31,500 commits by 686 contributors. Cryptography is also an integral part of numerous computer science and information security curricula, and universities all over the world offer degrees in information security.
Why did this happen, and which factors led to this development and popularized cryptography within a comparatively short period of time? To a large extent, this paradigm shift is a result of three (arguably still ongoing) developments in information technology that radically changed the role of cryptography in the modern connected world:
The advent of the internet and the ever increasing need to transfer large amounts of data over untrusted channels, which also fostered the development of TLS
The introduction of connectivity into nearly every new product, from toothbrushes to automobiles
The ever increasing complexity of IT systems, specifically increasing hardware and software complexity
We will now discuss each of these factors in turn.
We’ll now turn to the main subject of this book, TLS and the cryptographic tools it is made of. TLS is a protocol designed to protect data sent over the internet, so we’ll start with a brief look into the early history of the internet.
Despite its origins as a research project financed by the Defense Advanced Research Projects Agency (DARPA), the research agency of the Department of Defense of the United States, most of the main physical components of the internet, such as cables, routers, gateways, and so on, can be (and are) accessed by untrusted third parties. In the early days of the internet, this was not considered a problem, and very few (if any) security measures were introduced into TCP and IP, the internet’s main protocol workhorses; none of them involved cryptography. However, with more and more people using the internet, and ever increasing bandwidth becoming available, more and more services kept appearing online. It was quickly realized that to do real business over the internet, a certain amount of trust was needed that sensitive data such as credit card numbers or passwords would not fall into the wrong hands. Cryptography provides the answer to this problem, because it can guarantee confidentiality (i.e., no one can read the data in transit) and authenticity (i.e., you can verify that you are talking to the right party). TLS and its predecessor SSL are the protocols that implement cryptography on the internet in a secure, usable way.
Starting in 1995, SSL was shipped together with Netscape Navigator to clients. While server-side adoption of SSL was slow at first, by the end of 2021, according to the Internet Security Research Group (ISRG), 83% of web pages loaded by Firefox globally used HTTPS, that is, HTTP secured via TLS [87].
Figure 1.4: Percentage of web pages loaded by Firefox using HTTPS [88]
This is a huge success for TLS and the field of cryptography in general, but with it also comes a huge responsibility: we need to constantly monitor whether the algorithms, key lengths, modes of operation, and so on used within TLS are still secure. Moreover, we need to understand how secure algorithms work and how they can interact with each other in a secure way so that we can design secure alternatives if needed.
We should stress at this early stage that TLS is not a remedy for all the problems mentioned here. TLS provides channel-based security, meaning that it can only protect data in transit between a client and a server. TLS is very successful in doing so, and how exactly TLS uses cryptography to achieve this goal is the main theme of this book. However, once the data leaves the secure channel, it is up to the endpoints (i.e., client and server) to protect it.
Moreover, cryptography is useless in isolation. To have any practical effect, it has to be integrated into a much larger system. And to ensure that cryptography is effectively protecting that system, there must be no security holes left that would allow an attacker to bypass the cryptography altogether.
There is a well-known saying among cybersecurity professionals that the security of a system is only as strong as its weakest link. Because there are so many ways to circumvent security – especially in complex systems – cryptography, or rather the cryptographic primitives a system uses, is rarely the weakest link in the chain.
There is, however, one important reason why cryptography is fundamental for the security of information systems, even if there are other security flaws and vulnerabilities. An attacker who is able to break cryptography cannot be detected because a cryptanalytic attack, that is, the breaking of a cryptographic protocol, mechanism or primitive, in most cases leaves no traces of the attack.
If the attacker’s goal is to read the communication, they can simply passively listen to the communication, record the messages and decrypt them later. If the attacker’s goal is to manipulate the target system, they can simply forge arbitrary messages and the system will never be able to distinguish these messages from benign ones sent by legitimate users.
While there are many other sources of insecurity (e.g., software bugs, hardware bugs, and social engineering), the first line of defense is arguably secure communication, which in itself requires a secure channel. And cryptography as a scientific discipline provides the building blocks, methods, protocols, and mechanisms needed to realize secure communication.
Connectivity allows designers to add novel, unique features to their products and enables new business models with huge revenue potential that simply would not exist without it.
At the same time, connectivity makes it much harder to build secure systems. Similar to Ferguson and Schneier’s argument on the security implications of complexity, one can say that there are no connected systems that are secure. Why? Because connecting systems to large, open networks like the internet exposes them to remote attacks.
While connectivity enables a multitude of desired features, it also exposes products to remote attacks carried out via the internet. Attacks that require physical access to the target device can only be executed by a limited number of attackers who actually have access to that device, for example, employees of a company in the case of devices in a corporate network. In addition, the need for physical access generally limits the attacker’s window of opportunity.
Connectivity, in contrast, exposes electronic devices and IT systems to remote attacks, leading to a much higher number of potential attackers and threat actors. Moreover, remote attacks – unlike attacks that require physical access to the target – are much more compelling from the attacker’s perspective because they scale.
Another aspect that makes remote attacks practical (and, to a certain extent, rather easy) is the fact that the initial targets are almost always the network-facing interfaces of the devices, which are implemented in software. As we have seen, complex software is almost guaranteed to contain numerous implementation bugs, a number of which can typically be exploited to attack the system. Thus, the trend of increasing software and system complexity inadvertently facilitates remote attacks.
Remote attacks are easy to launch – and hard to defend against – because their marginal cost is essentially zero. Once a newly discovered security vulnerability has been turned into a reliably working exploit, the cost of replicating the attack on an additional 10, 100, or 100,000 devices is essentially the same, namely close to zero.
This is because remote attacks are implemented purely in software, and reproducing software as well as accessing devices over public networks effectively costs close to nothing. So, while businesses need to operate large – and costly – internal security organizations to protect their infrastructure, services, and products against cybersecurity attacks, any script kiddie can try to launch a remote attack on a connected product, online service, or corporate infrastructure essentially for free.
To summarize, connectivity exposes devices and IT systems to remote attacks that target network-facing software (and, thus, directly benefit from the continuously increasing software complexity), are very cheap to launch, can be launched by a large number of threat actors, and have zero marginal cost.
In addition, there exists a market for zero-day exploits [190] that allows even script kiddies to launch highly sophisticated remote attacks that infect target systems with advanced malware able to open a remote shell and completely take over the infected device.
As a result, connectivity creates an attack surface that facilitates cybersecurity attacks that scale.
While it can be argued that the problem of increasing complexity is not directly mitigated by modern cryptography (in fact, many crypto-related products and standards suffer from this problem themselves), there is no doubt that increasing complexity is in fact a major cause of security problems. We included the complexity problem in our list of crucial factors for the development of cryptography because cryptography can help limit the damage caused by attacks that were in turn enabled by excessive complexity.
Following Moore’s law [191], a prediction made in 1965 by Gordon Moore, the co-founder of Fairchild Semiconductor and Intel, the number of transistors in an integrated circuit, particularly in a microprocessor, kept doubling roughly every 2 years (see Figure 1.5).
Figure 1.5: Increasing complexity of hardware: Transistors. Data is taken from https://github.com/barentsen/tech-progress-data
Semiconductor manufacturers were able to build ever bigger and ever more complex hardware with ever more features. This went so far that in the late 1990s, the Semiconductor Industry Association set off an alarm in the industry when it warned that productivity gains in Integrated Circuit (IC) manufacturing were growing faster than the capabilities of Electronic Design Automation (EDA) tools used for IC design. Entire companies in the EDA area were successfully built on this premise.
Continuously growing hardware resources paved the way for ever more complex software with ever more functionality. Operating systems became ever more powerful and feature-rich, the number of layers in software stacks kept increasing, and software libraries and frameworks used by programmers became ever more comprehensive. As predicted by a series of software evolution laws formulated by early-day computer scientists Manny Lehman and Les Belady, software exhibited continuing growth and increasing complexity [181] (see also Figure 1.6).
Figure 1.6: Increasing complexity of software: Source Lines of Code (SLOC) in operating systems. Data is taken from https://github.com/barentsen/tech-progress-data
Why should increasing complexity be a problem? According to leading cybersecurity experts Bruce Schneier and Niels Ferguson [65], “Complexity is the worst enemy of security, and it almost always comes in the form of features or options.”
While it might be argued whether complexity really is the worst enemy of security, it is certainly true that complex systems, whether realized in hardware or software, tend to be error-prone. Schneier and Ferguson even claim that there are no complex systems that are secure.
Complexity negatively affects security in several ways, including the following:
Insufficient testability due to a combinatorial explosion given a large number of features
Unanticipated—and unwanted—behavior that emerges from a complex interplay of individual features
A high number of implementation bugs and, potentially, architectural flaws due to the sheer size of a system
The following thought experiment illustrates why complexity arising from the number of features or options is a major security risk. Imagine an IT system, say a small web server, whose configuration consists of 30 binary parameters (that is, each parameter has only two possible values, such as on or off). Such a system has more than a billion possible configurations. To guarantee that the system is secure under all configurations, its developers would need to write and run several billion tests: one test for each relevant type of attack (e.g., Denial-of-Service, cross-site scripting, and directory traversal) and each configuration. This is impossible in practice, especially because software changes over time, with new features being added and existing features being refactored. Moreover, real-world IT systems have significantly more than 30 binary parameters. As an example, the NGINX web server has nearly 800 directives for configuring how the NGINX worker processes handle connections.
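As a quick back-of-the-envelope check of this thought experiment, the short sketch below computes the number of configurations and tests involved; the parameter count and the list of attack classes are simply the illustrative assumptions from the text.

```python
# Back-of-the-envelope check of the configuration explosion described above.
num_binary_params = 30
attack_classes = ["denial-of-service", "cross-site scripting", "directory traversal"]

configurations = 2 ** num_binary_params              # 1,073,741,824 configurations
tests_needed = configurations * len(attack_classes)  # one test per attack class and configuration

print(f"{configurations:,} possible configurations")     # more than a billion
print(f"{tests_needed:,} tests for exhaustive coverage")  # more than three billion
```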
A related phenomenon that creates security risks in complex systems is unanticipated emergent behavior. Complex systems tend to have properties that their parts do not have on their own, that is, properties or behaviors that emerge only when the parts interact [186]. Prime examples of security vulnerabilities arising from emergent behavior are time-of-check-to-time-of-use (TOCTOU) attacks exploiting concurrency failures, replay attacks on cryptographic protocols where an attacker reuses an out-of-date message, and side-channel attacks exploiting unintended interplay between micro-architectural features for speculative execution.
Currently available software engineering processes, methods, and tools do not guarantee error-free software. Various studies on software quality indicate that, on average, 1,000 lines of source code contain 30-80 bugs [174]. In rare cases, examples of extensively tested software were reported that contain 0.5-3 bugs per 1,000 lines of code [125].
However, even a rate of 0.5-3 bugs per 1,000 lines of code is far from sufficient for most practical software systems. As an example, the Linux kernel 5.11, released in 2021, has around 30 million lines of code, roughly 14% of which are considered the ”core” part (arch, kernel, and mm directories). Consequently, even with extensive testing and validation, the Linux 5.11 core code alone would contain approximately 2,100-12,600 bugs.
And this is only the operating system core without any applications. As of July 2022, the popular Apache HTTP server consists of about 1.5 million lines of code. So, even assuming the low rate of 0.5-3 bugs per 1,000 lines of code, adding a web server to the core system would account for another 750-4,500 bugs.
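The arithmetic behind these estimates is simple; the following sketch reproduces it using the line counts quoted above and the optimistic rate of 0.5-3 bugs per 1,000 lines of code.

```python
# Rough bug estimates at an (optimistic) rate of 0.5-3 bugs per 1,000 lines of code.
def bug_estimate(lines_of_code, low=0.5, high=3.0):
    """Return the estimated (low, high) bug range for a given code size."""
    return lines_of_code / 1000 * low, lines_of_code / 1000 * high

linux_core = 30_000_000 * 0.14   # ~14% of the ~30 million lines of Linux 5.11
apache_httpd = 1_500_000         # approximate size of the Apache HTTP server

print(bug_estimate(linux_core))    # ~(2100.0, 12600.0) bugs
print(bug_estimate(apache_httpd))  # ~(750.0, 4500.0) bugs
```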
Figure 1.7: Increase of Linux kernel size over the years
What is even more concerning is that the bug rate doesn’t seem to improve quickly enough over time to cope with the increasing software size. The figure of 0.5-3 bugs per 1,000 lines of code for extensively tested software mentioned above was reported by Myers in 1986 [125]. On the other hand, a study performed by Carnegie Mellon University’s CyLab institute in 2004 identified 0.17 bugs per 1,000 lines of code in the Linux 2.6 kernel, a total of 985 bugs, of which 627 were in critical parts of the kernel. This amounts to slightly more than halving the bug rate at best – over almost 20 years.
Meanwhile, in that same period from 1986 to 2004, the size of typical software grew by far more than a factor of two. As an example, Linux version 1.0, released in 1994, had about 170,000 lines of code. In comparison, Linux kernel 2.6, which was released in 2003, already had 8.1 million lines of code. This is approximately a 47-fold increase in size within less than a decade.
Figure 1.8: Reported security vulnerabilities per year
The combination of these two trends – increase in complexity and increase in connectivity – results in an attack surface explosion. The following examples shall serve to illustrate this point.
In late 2016, the internet was hit by a series of massive Distributed Denial-of-Service (DDoS) attacks originating from the Mirai botnet, a large collection of infected devices (so-called bots) remote-controlled by attackers.
The early history of the Mirai botnet can be found in [9]: the first bootstrap scan on August 1 lasted about two hours and infected 834 devices. This initial population continued to scan for new members and within 20 hours, another 64,500 devices were added to the botnet. The infection campaign continued in September, when about 300,000 devices were infected, and reached its peak of 600,000 bots by the end of November. This corresponds to a rate of 2.2-3.4 infected devices per minute or 17.6-27.2 seconds to infect a single device.
Now contrast this with a side-channel or fault attack. Even if we assume that the actual attack – that is, the measurement and processing of the side-channel traces or the injection of a fault – can be carried out in zero time, an attacker would still need time to gain physical access to each target. Now suppose that, on average, the attacker needs one hour to physically access a target (actually, this is a very optimistic assumption from the attacker’s perspective, given that the targets are distributed throughout the globe). In that case, attacking 200,000-300,000 devices would take approximately 22-33 years or 270 to 400 months (as opposed to 2 months in the case of Mirai).
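To put these numbers side by side, the following sketch compares the two scenarios using the figures quoted above (27.2 seconds per remote infection versus an assumed one hour of physical access per device).

```python
# Attack scaling: Mirai-style remote infection vs. one hour of physical access
# per device; both figures are taken from the text above.
devices = 200_000

remote_seconds_per_device = 27.2   # upper end of the Mirai infection rate
physical_hours_per_device = 1.0    # optimistic assumption for a physical attack

remote_days = devices * remote_seconds_per_device / (3600 * 24)
physical_years = devices * physical_hours_per_device / (24 * 365)

print(f"remote:   ~{remote_days:.0f} days")     # ~63 days, roughly two months
print(f"physical: ~{physical_years:.0f} years") # ~23 years
```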
Moreover, any remote attack starts at a network interface of the target system. So the first (and, oftentimes, the only) thing the attacker interacts with is software. But software is complex by nature.
In mid-December 2009, Google discovered a highly sophisticated, targeted attack on their corporate infrastructure that resulted in intellectual property theft [73]. During their investigation, Google discovered that at least 20 other large companies from a wide range of businesses had been targeted in a similar way [193].
This series of cyberattacks came to be known as Operation Aurora [193] and was attributed to APT groups based in China. The name was coined by McAfee Labs security researchers after they discovered the word Aurora in a file path on the attacker’s machine that was later embedded in malicious binaries used in the attack. Typically, such a file path is inserted by the compiler into the binary to indicate where debug symbols and source code can be found on the developer’s machine. McAfee Labs therefore hypothesized that Aurora could be the name the attackers used for the operation [179].
According to McAfee, the main target of the attack was source code repositories at high-tech, security, and defense contractor companies. If these repositories were modified in a malicious way, the attack could be spread further to their client companies. Operation Aurora can therefore be considered the first major attack on software supply chains [193].
In response to Aurora, Google shut down its operations in China four months after the incident and migrated away from a purely perimeter-based defense principle. This means devices are not trusted by default anymore, even if they are located within a corporate LAN [198].
At the BlackHat 2015 conference, security researchers Charlie Miller and Chris Valasek demonstrated the first remote attack on an unaltered factory passenger car [120]. In what later became known as the Jeep hack, the researchers demonstrated how the vehicle’s infotainment system, Uconnect, which has both remote connectivity and the capability to communicate with other electronic control units within the vehicle, can be used for remote attacks.
Specifically, while systematically examining the vehicle’s attack surface, the researchers discovered an open D-Bus over IP port on Uconnect; D-Bus is essentially an inter-process communication and remote procedure call mechanism. The D-Bus service accessible via the open port allows anyone connected to the infotainment system to execute arbitrary code in an unauthenticated manner.
Miller and Valasek also discovered that the D-Bus port was bound to all network interfaces on the vulnerable Uconnect infotainment system and was therefore accessible remotely over the Sprint mobile network that Uconnect uses for telematics. By connecting to the Sprint network using a femtocell or simply a regular mobile phone, the researchers were able to send remote commands to the vehicle.
From that entry point, Miller and Valasek attacked a chip in the vehicle’s infotainment system by rewriting its firmware so that it could send arbitrary commands over the vehicle’s internal CAN communication network, effectively giving them the ability to take over the vehicle completely.
What do these examples have in common, and how do they relate to cryptography? In a nutshell, these examples illustrate what happens in the absence of appropriate cryptography. In all three cases discussed, there was no mechanism in place to verify that the systems were talking to legitimate users and that the messages received had not been manipulated in transit.
In the Mirai example, anyone with knowledge of the IoT devices’ IP addresses would have been able to access their login page. This information can be easily collected by scanning the public internet with tools such as nmap. So the designers’ assumption that users would change the default device password to a strong, individual one was the only line of defense. What the security engineers should have done instead is add a cryptographic mechanism that grants access to the login procedure only to legitimate users, for example, users in possession of a digital certificate or a private key.
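As a minimal sketch of what such a mechanism could look like, the following example implements a challenge-response login using a pre-shared symmetric key and Python’s standard hmac and secrets modules. It deliberately uses a shared key rather than a certificate to keep the illustration short, and all names in it are hypothetical.

```python
# Minimal challenge-response sketch: the device proves knowledge of a
# provisioned secret key without ever sending the key itself.
import hmac, hashlib, secrets

device_key = secrets.token_bytes(32)   # provisioned into the device at manufacture;
                                       # the verifying server keeps a copy

def server_challenge():
    """Fresh random nonce for every login attempt (prevents replay)."""
    return secrets.token_bytes(16)

def device_response(key, challenge):
    """HMAC of the challenge under the shared key."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

challenge = server_challenge()
response = device_response(device_key, challenge)

# The server recomputes the expected response with its copy of the key and
# compares the two values in constant time.
assert hmac.compare_digest(response, device_response(device_key, challenge))
```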
