Are you a software developer, systems designer, or computer architecture student looking for a methodical introduction to digital device architectures, but overwhelmed by the complexity of modern systems? This step-by-step guide will teach you how modern computer systems work, with the help of practical examples and exercises. You’ll gain insights into the internal behavior of processors down to the circuit level and will understand how the hardware executes code developed in high-level languages.
This book will teach you the fundamentals of computer systems, including transistors, logic gates, sequential logic, and instruction pipelines. You will learn the details of modern processor architectures and instruction sets, including x86, x64, ARM, and RISC-V. You will see how to implement a RISC-V processor on a low-cost FPGA board, and how to write a quantum computing program and run it on an actual quantum computer.
This edition has been updated to cover the architecture and design principles underlying the important domains of cybersecurity, blockchain and bitcoin mining, and self-driving vehicles.
By the end of this book, you will have a thorough understanding of modern processors and computer architecture and the future directions these technologies are likely to take.
The e-book can be read in Legimi apps or any app that supports the following format:
Page count: 991
Year of publication: 2022
Modern Computer Architecture and Organization
Second Edition
Learn x86, ARM, and RISC-V architectures and the design of smartphones, PCs, and cloud servers
Jim Ledin
BIRMINGHAM—MUMBAI
Modern Computer Architecture and Organization
Second Edition
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Senior Publishing Product Manager: Denim Pinto
Acquisition Editor – Peer Reviews: Saby Dsilva
Project Editor: Namrata Katare
Content Development Editor: Edward Doxey
Copy Editor: Safis Editing
Technical Editor: Tejas Mhasvekar
Proofreader: Safis Editing
Indexer: Subalakshmi Govindhan
Presentation Designer: Ganesh Bhadwalkar
First published: April 2020
Second edition: May 2022
Production reference: 2260422
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80323-451-9
www.packt.com
I am a software developer, not a hardware engineer. I have spent my career building software of many kinds to solve many different kinds of problems. However, through some quirk of accident or fate, I have spent a fair amount of my software development career closer to the hardware than many, perhaps most, software developers do these days.
In the early years of my fascination with computers, I quickly discovered that the devices I had access to, incredibly crude by today’s standards, couldn’t really do anything very interesting unless I learned how to program them in assembler. So I learned Z80 assembler, and later 6502 and 80x86 assembler.
Programming in assembler differs in many ways from programming in higher-level languages. It immediately puts you next to the hardware. You can’t ignore how memory is laid out; you must adjust your code to suit it. You can’t ignore the registers at your disposal; they are your variables, and you must marshal them carefully. You also learn how to communicate with other devices through I/O ports, which is, ultimately, the only way that digital devices communicate with each other. Once, when working on a particularly tricky problem, I woke up in the middle of the night and realised that I had been dreaming in 80x86 assembly language.
My career, and more importantly the hardware I had access to, developed. I got what was, at the time, my dream job, working in the R&D division of a computer manufacturer. I worked on enhancing operating systems to work with our hardware and built device drivers to take advantage of some of the unique features of our PCs. Again, it was essential in this kind of work to have a good working knowledge of how the hardware worked.
Software development evolved. The languages that we used became more abstract, and the operating systems, virtual machines, containers, and public cloud infrastructure increasingly hid the details of the underlying hardware from us software developers. I recently spoke to a LISP programmer on social media who didn’t realise that, ultimately, his lovely functional, declarative structures got translated into opcodes and values in the registers of a CPU. He seemingly had no working model for how the computers that he relied upon worked. He didn’t have to, but I think he would be a better programmer if he did.
In the latter part of my career, I worked on some world-class high-performance systems. A team I led was tasked with building one of the world’s highest-performance financial exchanges.
To do so, we once again needed to dig in and really understand how the underlying hardware of our system worked. This allowed us to take full advantage of the staggering levels of performance that modern hardware is capable of. During this time, we stole a term from motor racing to describe our approach. In the 1970s, the best Formula 1 racing driver was Jackie Stewart. When an interviewer asked him, “Do you need to be an engineer to be a great racing driver?”, Jackie responded, “No, but you must have Mechanical Sympathy for the car.” In essence, you need to understand the capabilities of the underlying hardware to take advantage of them.
We adopted this idea of Mechanical Sympathy and applied it to our work. For example, the biggest cost in our trading system was a cache miss. If the data we wanted to process wasn’t in the appropriate cache when it was needed, we’d see orders of magnitude wiped off the performance of our system. So we needed to design our code, even though it was written in a high-level language running in a virtual machine, to maximise the chances that the data was in the cache. We needed to understand and manage the concurrency in our multi-core processors, and to recognise and take advantage of things like processor cache lines and the essentially block-oriented nature of memory and other storage devices. The result was levels of performance that some people didn’t think possible. Modern hardware is very impressive when you take advantage of it.
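A minimal sketch of this cache-friendliness idea, not taken from the book: the same sum computed over a two-dimensional array in two traversal orders. In a language with contiguous arrays, such as C, the row-major walk uses every element of each cache line it fetches, while the column-major walk strides across rows and wastes most of each line; Python’s object indirection blunts the effect, but the access patterns it illustrates are the same.

```python
# Two traversals of the same data. In CPython, each row of a list-of-lists
# is a contiguous array of references, so walking a row left to right
# touches consecutive addresses (cache-line friendly), while walking down
# a column hops to a distant row on every access.

def sum_row_major(a):
    total = 0.0
    for row in a:              # consecutive addresses within each row
        for x in row:
            total += x
    return total

def sum_column_major(a):
    total = 0.0
    rows, cols = len(a), len(a[0])
    for j in range(cols):      # one element per row, then jump to the next row
        for i in range(rows):
            total += a[i][j]
    return total

N = 500
a = [[1.0] * N for _ in range(N)]
# Both orders compute the same answer; only the memory access pattern differs.
assert sum_row_major(a) == sum_column_major(a) == float(N * N)
```

Timing the two functions with `timeit` on an array large enough to spill out of the last-level cache shows the gap directly, and the difference is starker still in lower-level languages where the array really is one contiguous block of memory.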
This interest in the hardware isn’t just for high-performance computing. Estimates vary, but all agree that a significant fraction of the carbon that we emit as a species comes from powering the data centres where our code lives. I can’t think of any field of human endeavor as inefficient as software: for most systems, a speed increase of up to 100 times is achievable with just a bit more work to manage the flow of information through the hardware, and many systems could attain a 1,000-fold increase with more focused effort. Even if we gained only a 10x improvement by better understanding how our code works and how it uses the hardware that it operates on, we could reduce the carbon footprint of computing by a factor of 10 too. That is an idea that is much more important than performance for performance’s sake.
Ultimately, there is a degree to which you must understand how your computer works, and there are risks to losing touch with how the hardware we all depend upon functions. I confess that I am a nerd. I love to understand how things work. Not everyone needs to push hardware to its limits, but it is a bad idea to treat it like magic, because it’s not magic. It is engineering, and engineering is always about trade-offs. You will be surprised how often the fundamentals of how your hardware works leak out and affect the behaviour of your software, however far up the stack it sits, even if we are writing cloud-based systems in LISP.
For someone like me, Jim Ledin’s Modern Computer Architecture and Organization, Second Edition, is a delight.
I am not a hardware engineer, and I don’t want to be. For me, though, a vital part of my skills as a software developer is having a good working model for how the hardware that I rely upon actually works. I want to maintain, and build, mechanical sympathy.
This book takes us from the basic concepts of computation, looking at the first computers and the first CPUs, to the potential of quantum computing and other near-future directions that our hardware will probably exploit. You might want to understand how modern processors work, and to get to grips with their staggering efficiency and their ability to keep themselves fed with data from stores that are hundreds of times slower than they are. You may also be interested in complex ideas that extend beyond the confines of the hardware alone, such as how cryptocurrency mining works, or what the architecture of a modern self-driving car looks like. This book can answer those questions and many, many more.
I think that it is not just computer scientists and engineers, but every software developer, who will be better at their job when they have some understanding of how the devices that they use in their everyday work actually function. When trying to understand something big and complicated in software, I still frequently think, “Well, it’s all just bits, bytes, and opcodes really, so what is going on here?” This is the equivalent of a chemist understanding molecules and compounds and being able to go back to first principles to solve something tricky. These are the real building blocks, and it can help us all to understand them better.
I know that I will be dipping into this book on a regular basis for years to come, and I hope that you enjoy doing the same.
Dave Farley
Independent Software Engineering Consultant and Founder of Continuous Delivery Ltd
Jim Ledin is the CEO of Ledin Engineering, Inc. Jim is an expert in embedded software and hardware design and testing. He is also an expert in system cybersecurity assessment and penetration testing. He has a B.S. degree in aerospace engineering from Iowa State University and an M.S. degree in electrical and computer engineering from the Georgia Institute of Technology. Jim is a registered professional electrical engineer in California, a Certified Information Systems Security Professional (CISSP), a Certified Ethical Hacker (CEH), and a Certified Penetration Tester (CPT).
I would like to thank my wife Lynda and daughter Emily for their patience and support as I focused on this project. I love you, my sweeties!
I would also like to thank Dr. Sarah M. Neuwirth and Iztok Jeras for their diligent work reviewing each of these chapters. Your input has helped create a much better book!
Thank you as well to Dave Farley for providing such an eloquent foreword.
Dr. Sarah M. Neuwirth is a postdoctoral research associate in the Modular Supercomputing and Quantum Computing Group at Goethe-University Frankfurt, Germany. She also holds a visiting researcher position at the Jülich Supercomputing Centre, Germany. Sarah has more than 9 years of experience in the academic field. Her research interests include high-performance storage systems, parallel I/O and file systems, modular supercomputing (i.e., resource disaggregation and virtualization), standardized cluster benchmarking, high-performance computing and networking, distributed systems, and communication protocols.
In addition, Sarah has designed the curriculum for, and taught, courses in parallel computer architecture, high-performance interconnection networks, and distributed systems for the past 9 years. In 2018, Sarah was awarded her Ph.D. in computer science from Heidelberg University, Germany. She defended her dissertation with highest honors (summa cum laude) and was the recipient of the ZONTA Science Award 2019 for her outstanding dissertation. Sarah also holds an M.Sc. degree (2012) and a B.Sc. degree (2010) in computer science and mathematics from the University of Mannheim, Germany. She has served as a technical reviewer for several prestigious HPC-related conferences and journals, including the IEEE/ACM SC Conference Series, ACM ICPP, IEEE IPDPS, IEEE HPCC, the PDSW workshop at IEEE/ACM SC, the PERMAVOST workshop at ACM HPDC, ACM TOCS, IEEE Access, and Elsevier’s FGCS.
Iztok Jeras obtained a bachelor’s degree in electrical engineering and a master’s in computer science at the University of Ljubljana. He has worked at several Slovenian companies on microcontroller, FPGA, and ASIC designs, with some embedded software and Linux IT work mixed in. In his spare time, he researches cellular automata and contributes to digital design-related open source projects. Recently, he has been focusing on the RISC-V ISA.
Join the book’s Discord workspace for a monthly Ask me Anything session with the author: https://discord.gg/7h8aNRhRuY
Preface
Who this book is for
What this book covers
To get the most out of this book
Get in touch
Introducing Computer Architecture
Technical requirements
The evolution of automated computing devices
Charles Babbage’s Analytical Engine
ENIAC
IBM PC
The Intel 8088 microprocessor
The Intel 80286 and 80386 microprocessors
The iPhone
Moore’s law
Computer architecture
Representing numbers with voltage levels
Binary and hexadecimal numbers
The 6502 microprocessor
The 6502 instruction set
Summary
Exercises
Digital Logic
Technical requirements
Electrical circuits
The transistor
Logic gates
Latches
Flip-flops
Registers
Adders
Propagation delay
Clocking
Sequential logic
Hardware description languages
VHDL
Summary
Exercises
Processor Elements
Technical requirements
A simple processor
Control unit
Executing an instruction – a simple example
Arithmetic logic unit
Registers
The instruction set
Addressing modes
Immediate addressing mode
Absolute addressing mode
Absolute indexed addressing mode
Indirect indexed addressing mode
Instruction categories
Memory load and store instructions
Register-to-register data transfer instructions
Stack instructions
Arithmetic instructions
Logical instructions
Branching instructions
Subroutine call and return instructions
Processor flag instructions
Interrupt-related instructions
No operation instruction
Interrupt processing
IRQ processing
NMI processing
BRK instruction processing
Input/output operations
Programmed I/O
Interrupt-driven I/O
Direct memory access
Summary
Exercises
Computer System Components
Technical requirements
Memory subsystem
Introducing the MOSFET
Constructing DRAM circuits with MOSFETs
The capacitor
The DRAM bit cell
DDR5 SDRAM
Graphics DDR
Prefetching
I/O subsystem
Parallel and serial data buses
PCI Express
SATA
M.2
USB
Thunderbolt
Graphics displays
VGA
DVI
HDMI
DisplayPort
Network interface
Ethernet
Wi-Fi
Keyboard and mouse
Keyboard
Mouse
Modern computer system specifications
Summary
Exercises
Hardware-Software Interface
Technical requirements
Device drivers
The parallel port
PCIe device drivers
Device driver structure
BIOS
UEFI
The boot process
BIOS boot
UEFI boot
Trusted boot
Embedded devices
Operating systems
Processes and threads
Scheduling algorithms and process priority
Multiprocessing
Summary
Exercises
Specialized Computing Domains
Technical requirements
Real-time computing
Real-time operating systems
Digital signal processing
ADCs and DACs
DSP hardware features
Signal processing algorithms
Convolution
Digital filtering
Fast Fourier transform (FFT)
GPU processing
GPUs as data processors
Big data
Deep learning
Examples of specialized architectures
Summary
Exercises
Processor and Memory Architectures
Technical requirements
The von Neumann, Harvard, and modified Harvard architectures
The von Neumann architecture
The Harvard architecture
The modified Harvard architecture
Physical and virtual memory
Paged virtual memory
Page status bits
Memory pools
Memory management unit
Summary
Exercises
Performance-Enhancing Techniques
Technical requirements
Cache memory
Multilevel processor caches
Static RAM
Level 1 cache
Direct-mapped cache
Set associative cache
Processor cache write policies
Level 2 and level 3 processor caches
Instruction pipelining
Superpipelining
Pipeline hazards
Micro-operations and register renaming
Conditional branches
Simultaneous multithreading
SIMD processing
Summary
Exercises
Specialized Processor Extensions
Technical requirements
Privileged processor modes
Handling interrupts and exceptions
Protection rings
Supervisor mode and user mode
System calls
Floating-point arithmetic
The 8087 floating-point coprocessor
The IEEE 754 floating-point standard
Power management
Dynamic voltage frequency scaling
System security management
Trusted Platform Module
Thwarting cyberattackers
Summary
Exercises
Modern Processor Architectures and Instruction Sets
Technical requirements
x86 architecture and instruction set
The x86 register set
x86 addressing modes
Implied addressing
Register addressing
Immediate addressing
Direct memory addressing
Register indirect addressing
Indexed addressing
Based indexed addressing
Based indexed addressing with scaling
x86 instruction categories
Data movement
Stack manipulation
Arithmetic and logic
Conversions
Control flow
String manipulation
Flag manipulation
Input/output
Protected mode
Miscellaneous instructions
Other instruction categories
Common instruction patterns
x86 instruction formats
x86 assembly language
x64 architecture and instruction set
The x64 register set
x64 instruction categories and formats
x64 assembly language
32-bit ARM architecture and instruction set
The ARM register set
ARM addressing modes
Immediate
Register direct
Register indirect
Register indirect with offset
Register indirect with offset, pre-incremented
Register indirect with offset, post-incremented
Double register indirect
Double register indirect with scaling
ARM instruction categories
Load/store
Stack manipulation
Register movement
Arithmetic and logic
Comparisons
Control flow
Supervisor mode
Breakpoint
Conditional execution
Other instruction categories
32-bit ARM assembly language
64-bit ARM architecture and instruction set
64-bit ARM assembly language
Summary
Exercises
The RISC-V Architecture and Instruction Set
Technical requirements
The RISC-V architecture and applications
The RISC-V base instruction set
Computational instructions
Control flow instructions
Memory access instructions
System instructions
Pseudo-instructions
Privilege levels
RISC-V extensions
The M extension
The A extension
The C extension
The F and D extensions
Other extensions
RISC-V variants
64-bit RISC-V
Standard RISC-V configurations
RISC-V assembly language
Implementing RISC-V in an FPGA
Summary
Exercises
Processor Virtualization
Technical requirements
Introducing virtualization
Types of virtualization
Operating system virtualization
Application virtualization
Network virtualization
Storage virtualization
Categories of processor virtualization
Trap-and-emulate virtualization
Paravirtualization
Binary translation
Hardware emulation
Virtualization challenges
Unsafe instructions
Shadow page tables
Security
Virtualizing modern processors
x86 processor virtualization
x86 hardware virtualization
ARM processor virtualization
RISC-V processor virtualization
Virtualization tools
VirtualBox
VMware Workstation
VMware ESXi
KVM
Xen
QEMU
Virtualization and cloud computing
Electrical power consumption
Summary
Exercises
Domain-Specific Computer Architectures
Technical requirements
Architecting computer systems to meet unique requirements
Smartphone architecture
iPhone 13 Pro Max
Personal computer architecture
Alienware Aurora Ryzen Edition R10 gaming desktop
Ryzen 9 5950X branch prediction
Nvidia GeForce RTX 3090 GPU
Aurora subsystems
Warehouse-scale computing architecture
WSC hardware
Rack-based servers
Hardware fault management
Electrical power consumption
The WSC as a multilevel information cache
Deploying a cloud application
Neural networks and machine learning architectures
Intel Nervana neural network processor
Summary
Exercises
Cybersecurity and Confidential Computing Architectures
Technical requirements
Cybersecurity threats
Cybersecurity threat categories
Cyberattack techniques
Types of malware
Post-exploitation actions
Features of secure hardware
Identify what needs to be protected
Anticipate all types of attacks
Features of secure system design
Secure key storage
Encryption of data at rest
Encryption of data in transit
Cryptographically secure key generation
Secure boot procedure
Tamper-resistant hardware design
Confidential computing
Designing for security at the architectural level
Avoid security through obscurity
Comprehensive secure design
The principle of least privilege
Zero trust architecture
Ensuring security in system and application software
Common software weaknesses
Buffer overflow
Cross-site scripting
SQL injection
Path traversal
Source code security scans
Summary
Exercises
Blockchain and Bitcoin Mining Architectures
Technical requirements
Introduction to blockchain and bitcoin
The SHA-256 hash algorithm
Computing SHA-256
Bitcoin core software
The bitcoin mining process
Bitcoin mining pools
Mining with a CPU
Mining with a GPU
Bitcoin mining computer architectures
Mining with FPGAs
Mining with ASICs
Bitcoin mining economics
Alternative types of cryptocurrency
Summary
Exercises
Self-Driving Vehicle Architectures
Technical requirements
Overview of self-driving vehicles
Driving autonomy levels
Safety concerns of self-driving vehicles
Hardware and software requirements for self-driving vehicles
Sensing vehicle state and the surroundings
GPS, speedometer, and inertial sensors
Video cameras
Radar
Lidar
Sonar
Perceiving the environment
Convolutional neural networks
Example CNN implementation
CNNs in autonomous driving applications
Lidar localization
Object tracking
Decision processing
Lane keeping
Complying with the rules of the road
Avoiding objects
Planning the vehicle path
Autonomous vehicle computing architecture
Tesla HW3 Autopilot
Summary
Exercises
Quantum Computing and Other Future Directions in Computer Architectures
Technical requirements
The ongoing evolution of computer architectures
Extrapolating from current trends
Moore’s law revisited
The third dimension
Increased device specialization
Potentially disruptive technologies
Quantum physics
Spintronics
Quantum computing
Quantum code-breaking
Adiabatic quantum computation
The future of quantum computing
Carbon nanotubes
Building a future-tolerant skill set
Continuous learning
College education
Conferences and literature
Summary
Exercises
Appendix
Answers to Exercises
Chapter 1: Introducing Computer Architecture
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 2: Digital Logic
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 3: Processor Elements
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 4: Computer System Components
Exercise 1
Answer
Exercise 2
Answer
Chapter 5: Hardware-Software Interface
Exercise 1
Answer
Exercise 2
Answer
Chapter 6: Specialized Computing Domains
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 7: Processor and Memory Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 8: Performance-Enhancing Techniques
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 9: Specialized Processor Extensions
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Exercise 7
Answer
Exercise 8
Answer
Chapter 10: Modern Processor Architectures and Instruction Sets
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Exercise 7
Answer
Exercise 8
Answer
Chapter 11: The RISC-V Architecture and Instruction Set
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 12: Processor Virtualization
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 13: Domain-Specific Computer Architectures
Exercise 1
Answer
Exercise 2
Answer
Chapter 14: Cybersecurity and Confidential Computing Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 15: Blockchain and Bitcoin Mining Architectures
Exercise 1
Answer
Exercise 2
Answer
Chapter 16: Self-Driving Vehicle Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Chapter 17: Future Directions in Computer Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Other Books You May Enjoy
Index
Welcome to the second edition of Modern Computer Architecture and Organization. It has been my pleasure to receive a great deal of feedback and comments from readers of the first edition. Of course, I appreciate all input from readers, especially those who bring any errors and omissions to my attention.
This book presents the key technologies and components employed in modern processor and computer architectures and discusses how various architectural decisions result in computer configurations optimized for specific needs.
To understate the situation quite drastically, modern computers are complicated devices. Yet, when viewed in a hierarchical manner, the functions of each level of complexity become clear. We will cover a great many topics in these chapters and will only have space to explore each of them to a limited degree. My goal is to provide a coherent introduction to each important technology and subsystem you might find in a modern computing device and explain its relationship to other system components.
This edition includes updates on technologies that have advanced since the publication of the first edition and adds significant new content in several important areas related to computer architecture. New chapters cover the topics of cybersecurity, blockchain and bitcoin mining, and self-driving vehicle computing architectures.
While the security of computing systems has always been important, recent exploitations of major vulnerabilities in widely used operating systems and applications have resulted in substantial negative impacts felt in countries around the world. These cyberattacks have accentuated the need for computer system designers to incorporate cybersecurity as a foundational element of system architecture.
I will not be providing a lengthy list of references for further reading. The internet is your friend in this regard.
If you can manage to bypass the clamor of political and social media argumentation on the internet, you will find yourself in an enormous, cool, quiet library containing a vast quantity of accumulated human knowledge. Learn to use the advanced features of your favorite search engine. Also, learn to differentiate high-quality information from uninformed opinion. Check multiple sources if you have any doubts about the information you’re finding. Consider the source: if you are looking for information about an Intel processor, search for documentation published by Intel.
By the end of this book, you will have gained a strong grasp of the computer architectures currently used in a wide variety of digital systems. You will also have developed an understanding of the relevant trends in architectural technology currently underway, as well as some possible disruptive advances in the coming years that may drastically influence the architectural development of computing systems.
This book is intended for software developers, computer engineering students, system designers, computer science professionals, reverse engineers, and anyone else seeking to understand the architecture and design principles underlying all types of modern computer systems, from tiny embedded devices to smartphones to warehouse-sized cloud server farms. Readers will also explore the directions these technologies are likely to take in the coming years. A general understanding of computer processors is helpful but is not required.
Chapter 1, Introducing Computer Architecture, begins with a brief history of automated computing devices and describes the significant technological advances that drove leaps in capability. This is followed by a discussion of Moore’s law, with an assessment of its applicability over previous decades and the implications for the future. The basic concepts of computer architecture are introduced in the context of the 6502 microprocessor.
Chapter 2, Digital Logic, introduces transistors as switching elements and explains their use in constructing logic gates. We will then see how flip-flops and registers are developed by combining simple gates. The concept of sequential logic, meaning logic that contains state information, is introduced, and the chapter ends with a discussion of clocked digital circuits.
Chapter 3, Processor Elements, begins with a conceptual description of a generic processor. We will examine the concepts of the instruction set, register set, and instruction loading, decoding, execution, and sequencing.
Memory load and store operations are also discussed. The chapter includes a description of branching instructions and their use in looping and conditional processing. Some practical considerations are introduced that lead to the necessity for interrupt processing and I/O operations.
Chapter 4, Computer System Components, discusses computer memory and its interface to the processor, including multilevel caching. I/O requirements, including interrupt handling, buffering, and dedicated I/O processors, are described. We will discuss some specific requirements for I/O devices, including the keyboard and mouse, the video display, and the network interface. The chapter ends with descriptive examples of these components in modern computer applications, including smart mobile devices, personal computers, gaming systems, cloud servers, and dedicated machine learning systems.
Chapter 5, Hardware-Software Interface, discusses the implementation of the high-level services a computer operating system must provide, including disk I/O, network communications, and interactions with users. This chapter describes the software layers that implement these features, starting at the level of the processor instruction set and registers. Operating system functions, including booting, multiprocessing, and multithreading, are also described.
Chapter 6, Specialized Computing Domains, explores domains of computing that tend to be less directly visible to most users, including real-time systems, digital signal processing, and GPU processing. We will discuss the unique requirements associated with each of these domains and look at examples of modern devices implementing these features.
Chapter 7, Processor and Memory Architectures, takes an in-depth look at modern processor architectures, including the von Neumann, Harvard, and modified Harvard variants. The chapter discusses the implementation of paged virtual memory. The practical implementation of memory management functionality within the computer architecture is introduced and the functions of the memory management unit are described.
Chapter 8, Performance-Enhancing Techniques, discusses a number of performance-enhancing techniques used routinely to reach peak execution speed in real-world computer systems. The most important techniques for improving system performance, including the use of cache memory, instruction pipelining, instruction parallelism, and SIMD processing, are the subjects of this chapter.
Chapter 9, Specialized Processor Extensions, focuses on extensions commonly implemented at the processor instruction set level to provide additional system capabilities beyond generic data processing requirements. The extensions presented include privileged processor modes, floating-point mathematics, power management, and system security management.
Chapter 10, Modern Processor Architectures and Instruction Sets, examines the architectures and instruction set features of modern processor designs, including the x86, x64, and ARM processors. One challenge that arises when producing a family of processors over several decades is the need to maintain backward compatibility with code written for earlier-generation processors. The need for legacy support tends to increase the complexity of the later-generation processors. This chapter will examine some of the attributes of these processor architectures that result from supporting legacy requirements.
Chapter 11, The RISC-V Architecture and Instruction Set, introduces the exciting new RISC-V (pronounced risk five) processor architecture and its instruction set. RISC-V is a completely open source, free-to-use specification for a reduced instruction set computer architecture. A complete instruction set specification has been released and a number of hardware implementations of this architecture are currently available. Work is ongoing to develop specifications for a number of instruction set extensions. This chapter covers the features and variants available in the RISC-V architecture and introduces the RISC-V instruction set. We will also discuss the applications of the RISC-V architecture in mobile devices, personal computers, and servers.
Chapter 12, Processor Virtualization, introduces the concepts involved in processor virtualization and explains the many benefits resulting from the use of virtualization. The chapter includes examples of virtualization based on open source tools and operating systems. These tools enable the execution of instruction set-accurate representations of various computer architectures and operating systems on a general-purpose computer. We will also discuss the benefits of virtualization in the development and deployment of real-world software applications.
Chapter 13, Domain-Specific Computer Architectures, brings together the topics discussed in previous chapters to develop an approach for architecting a computer system design to meet unique user requirements. We will discuss some specific application categories, including mobile devices, personal computers, gaming systems, internet search engines, and neural networks.
Chapter 14, Cybersecurity and Confidential Computing Architectures, focuses on the security needs of critical application areas like national security systems and financial transaction processing. These systems must be resilient against a broad range of cybersecurity threats, including malicious code, covert channel attacks, and attacks enabled by physical access to computing hardware. Topics addressed in this chapter include cybersecurity threats, encryption, digital signatures, and secure hardware and software design.
The explosion of interest in cryptocurrencies and their growing acceptance by mainstream financial institutions and retailers demonstrate that this area of computing is on a continued growth path. This edition adds a chapter on blockchain and the computational demands of bitcoin mining.
Chapter 15, Blockchain and Bitcoin Mining Architectures, introduces the concepts associated with blockchain, a public, cryptographically secured ledger recording a sequence of transactions. We continue with an overview of the process of bitcoin mining, which appends transactions to the bitcoin blockchain and rewards those who complete this task with payment in the form of bitcoin. Bitcoin processing requires high-performance computing hardware, which is illustrated in terms of a current-generation bitcoin mining computer architecture.
The continuing growth in the number of automobiles with partial or full self-driving capabilities demands robust, highly capable computing systems that meet the requirements for safe autonomous vehicle operation on public roadways.
Chapter 16, Self-Driving Vehicle Architectures, describes the capabilities required in self-navigating vehicle processing architectures. It begins with a discussion of the requirements for ensuring the safety of the autonomous vehicle and its occupants, as well as for other vehicles, pedestrians, and stationary objects. We continue with a discussion of the types of sensors and data a self-driving vehicle receives as input while driving and a description of the types of processing required for effective vehicle control. The chapter concludes with an overview of an example self-driving computer architecture.
Chapter 17, Quantum Computing and Other Future Directions in Computer Architectures, looks at the road ahead for computer architectures. This chapter reviews the significant advances and ongoing trends that have resulted in the current state of computer architectures and extrapolates these trends in possible future directions. Potentially disruptive technologies are discussed that could alter the path of future computer architectures. In closing, I will propose some approaches for professional development for the computer architect that should result in a future-tolerant skill set.
As in the other chapters, each of the three new chapters contains end-of-chapter exercises designed to broaden your understanding of the chapter topic and cement the information from the chapter within your knowledge base.
I hope you enjoy this updated edition as much as I have enjoyed developing it. Happy reading!
Each chapter in this book includes a set of exercises at the end. To get the most from the book, and to cement some of the more challenging concepts in your mind, I recommend you try to work through each exercise. Complete solutions to all exercises are provided in the book and are available online at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition.
In case there is a need to update the code examples and answers to the exercises, updates will appear at this GitHub repository.
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781803234519_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Subtraction using the SBC instruction tends to be a bit more confusing to novice 6502 assembly language programmers.”
A block of code is set as follows:
; Add four bytes together using immediate addressing mode
LDA #$04
CLC
ADC #$03
ADC #$02
ADC #$01

Any command-line input or output is written as follows:
C:\>bcdedit

Windows Boot Manager
--------------------
identifier    {bootmgr}

Bold: Indicates a new term, an important word, or words that you see on the screen, for example, in menus or dialog boxes. For example: “Because there are now four sets, the Set field in the physical address reduces to two bits and the Tag field increases to 24 bits.”
Warnings or important notes appear like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome.
General feedback: Email [email protected], and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit http://www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.
Once you’ve read Modern Computer Architecture and Organization, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
The architectures of automated computing systems have evolved from the first mechanical calculators constructed nearly two centuries ago to the broad array of modern electronic computer technologies we use directly and indirectly every day. Along the way, there have been stretches of incremental technological improvement interspersed with disruptive advances that drastically altered the trajectory of the industry. We can expect these trends to continue in the coming years.
In the 1980s, during the early days of personal computing, students and technical professionals eager to learn about computer technology had a limited range of subject matter available for this purpose. If they had a computer of their own, it was probably an IBM PC or an Apple II. If they worked for an organization with a computing facility, they might have used an IBM mainframe or a Digital Equipment Corporation VAX minicomputer. These examples, and a limited number of similar systems, encompassed most people’s exposure to the computer systems of the time.
Today, numerous specialized computing architectures exist to address widely varying user needs. We carry miniature computers in our pockets and purses that can place phone calls, record video, and function as full participants on the internet. Personal computers remain popular in a format outwardly similar to the PCs of past decades. Today’s PCs, however, are orders of magnitude more capable than the early generations in terms of computing power, memory size, disk space, graphics performance, and communication ability. These capabilities enable modern PCs to easily perform tasks that would have been inconceivable on early PCs, such as the real-time generation of high-resolution 3D images.
Companies offering web services to hundreds of millions of users construct vast warehouses filled with thousands of tightly coordinated computer systems capable of responding to a constant stream of user requests with extraordinary speed and precision. Machine learning systems are trained through the analysis of enormous quantities of data to perform complex activities such as driving automobiles.
This chapter begins with a presentation of some key historical computing devices and the leaps in technology associated with them. We will then examine some significant modern-day trends related to technological advances and introduce the basic concepts of computer architecture, including a close look at the 6502 microprocessor and its instruction set. The following topics will be covered in this chapter:
The evolution of automated computing devices
Moore’s law
Computer architecture

Files for this chapter, including answers to the exercises, are available at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition.
This section reviews some classic machines from the history of automated computing devices and focuses on the major advances each embodied. Babbage’s Analytical Engine is included here because of the many leaps of genius represented in its design. The other systems are discussed because they embodied significant technological advances and performed substantial real-world work over their lifetimes.
Although a working model of the Analytical Engine was never constructed, the detailed notes Charles Babbage developed from 1834 until his death in 1871 described a computing architecture that appeared to be both workable and complete. The Analytical Engine was intended to serve as a general-purpose programmable computing device. The design was entirely mechanical and was to be constructed largely of brass. The Analytical Engine was designed to be driven by a shaft powered by a steam engine.
Borrowing from the punched cards of the Jacquard loom, the rotating studded barrels used in music boxes, and the technology of his earlier Difference Engine (also never completed in his lifetime, and more of a specialized calculating device than a computer), the Analytical Engine’s design was, otherwise, Babbage’s original creation.
Unlike most modern computers, the Analytical Engine represented numbers in signed decimal form. The decision to use base-10 numbers rather than the base-2 logic of most modern computers was the result of a fundamental difference between mechanical technology and digital electronics. It is straightforward to construct mechanical wheels with 10 positions, so Babbage chose the human-compatible base-10 format because it was not significantly more technically challenging than using some other number base. Simple digital circuits, on the other hand, are not capable of maintaining 10 different states with the ease of a mechanical wheel.
All numbers in the Analytical Engine consisted of 40 decimal digits. The large number of digits was likely chosen to reduce problems with numerical overflow. The Analytical Engine did not support floating-point mathematics.
Each number was stored on a vertical axis containing 40 wheels, with each wheel capable of resting in 10 positions corresponding to the digits 0-9. A 41st number wheel contained the sign: any even number on this wheel represented a positive sign, and any odd number represented a negative sign. The Analytical Engine axis was somewhat analogous to the register used in modern processors, except the readout of an axis was destructive—reading an axis would set it to 0. If it was necessary to retain an axis’s value after it had been read, another axis had to store a copy of the value during the readout. Numbers were transferred from one axis to another, or used in computations, by engaging a gear with each digit wheel and rotating the wheel to extract the numerical value. The set of axes serving as system memory was referred to as the store.
The addition of two numbers used a process somewhat similar to the method of addition taught to schoolchildren. Assume a number stored on one axis, let’s call it the addend, was to be added to a number on another axis that we will call the accumulator. The machine would connect each addend digit wheel to the corresponding accumulator digit wheel through a train of gears. It would then simultaneously rotate each addend digit downward to 0 while driving the accumulator digit an equivalent rotation in the increasing direction. If an accumulator digit wrapped around from 9 to 0, the next most significant accumulator digit would increment by 1. This carry operation would propagate across as many digits as needed (think of adding 1 to 999,999). By the end of the process, the addend axis would hold the value 0 and the accumulator axis would hold the sum of the two numbers. The propagation of carries from one digit to the next was the most mechanically complex part of the addition process.
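As a rough illustration (not a model of Babbage's actual mechanism), the digit-by-digit addition process described above can be sketched in Python: each axis is a list of decimal digit wheels, readout of the addend is destructive, and carries propagate from one wheel to the next. The function and variable names here are purely illustrative.

```python
DIGITS = 40  # each Analytical Engine number held 40 decimal digits

def to_axis(n):
    """Store a non-negative integer on an axis (list of digit wheels,
    least significant digit first)."""
    return [(n // 10**i) % 10 for i in range(DIGITS)]

def from_axis(axis):
    """Read the integer value currently represented by an axis."""
    return sum(d * 10**i for i, d in enumerate(axis))

def add_axes(addend, accumulator):
    """Add the addend axis into the accumulator axis, digit by digit.

    The addend reads out destructively (all of its wheels return to 0)
    and the accumulator ends up holding the sum, with carries propagated
    across as many digits as needed.
    """
    carry = 0
    for i in range(DIGITS):
        total = accumulator[i] + addend[i] + carry
        accumulator[i] = total % 10   # a wheel passing 9 wraps back to 0
        carry = total // 10           # the wrap increments the next wheel
        addend[i] = 0                 # destructive readout: wheel driven to 0
    return addend, accumulator
```

Adding 1 to 999,999, for example, propagates a carry across seven digit positions, which is exactly the mechanically complex case noted above.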
Operations in the Analytical Engine were sequenced by music box-like rotating barrels in a construct called the mill, which is analogous to the control unit of a modern CPU.
Each Analytical Engine instruction was encoded in a vertical row of locations on the barrel, where the presence or absence of a stud at a particular location either engaged a section of the Engine’s machinery or left the state of that section unchanged. Based on Babbage’s estimate of the Engine’s execution speed, the addition of two 40-digit numbers, including the propagation of carries, would take about 3 seconds.
Babbage conceived several important concepts for the Engine that remain relevant to modern computer systems. His design supported a degree of parallel processing consisting of simultaneous multiplication and addition operations that accelerated the computation of series of values intended to be output as numerical tables. Mathematical operations such as addition supported a form of pipelining, in which sequential operations on different data values overlapped in time.
Babbage was well aware of the difficulties associated with complex mechanical devices, such as friction, gear backlash, and wear over time. To prevent errors caused by these effects, the Engine incorporated mechanisms called lockings that were applied during data transfers across axes. The lockings forced the number wheels into valid positions and prevented the accumulation of small errors from allowing a wheel to drift to an incorrect value. The use of lockings is analogous to the amplification of potentially weak input signals to produce stronger outputs by the digital logic gates in modern processors.
The Analytical Engine was to be programmed using punched cards and supported branching operations and nested loops. The most complex program intended for execution on the Analytical Engine was developed by Ada Lovelace to compute the Bernoulli numbers, an important sequence in number theory. The Analytical Engine code to perform this computation is recognized as the first published computer program of substantial complexity.
Babbage constructed a trial model of a portion of the Analytical Engine mill, which is currently on display at the Science Museum in London.
ENIAC, the Electronic Numerical Integrator and Computer, was completed in 1945 and was the first programmable general-purpose electronic computer. The system consumed 150 kilowatts of electricity, occupied 1,800 square feet of floor space, and weighed 27 tons.
The design was based on vacuum tubes, diodes, and relays. ENIAC contained over 17,000 vacuum tubes that functioned as switching elements.
Similar to the Analytical Engine, it used base-10 representation of 10-digit decimal numbers implemented using 10-position ring counters (the ring counter will be discussed in Chapter 2, Digital Logic).
Input data was received from an IBM punch-card reader and the output of computations was delivered by a card punch machine.
The ENIAC architecture was capable of complex sequences of processing steps including loops, branches, and subroutines. The system had 20 10-digit accumulators that functioned like registers in modern computers. It did not initially have any memory beyond the accumulators. If intermediate values were required for use in later computations, the data had to be written to punch cards and read back in when needed. ENIAC could perform about 385 multiplications per second.
ENIAC programs consisted of plugboard wiring and switch-based function tables. Programming the system was an arduous process that often took the team of talented female programmers weeks to complete. Reliability was a problem, as vacuum tubes failed regularly, requiring troubleshooting on a day-to-day basis to isolate and replace failed tubes.
In 1948, ENIAC was improved by adding the ability to program the system via punch cards rather than plugboards. This greatly enhanced the speed with which programs could be developed. As a consultant for this upgrade, John von Neumann proposed a processing architecture based on a single memory region holding program instructions and data, a processing component with an arithmetic logic unit and registers, and a control unit that contained an instruction register and a program counter. Many modern processors continue to implement this general structure, now known as the von Neumann architecture. We will discuss this architecture in detail in Chapter 3, Processor Elements.
Early applications of ENIAC included analyses related to the development of the hydrogen bomb and the computation of firing tables for long-range artillery.
In the years following the construction of ENIAC, several technological breakthroughs resulted in remarkable advances in computer architectures:
The invention of the transistor in 1947 by John Bardeen, Walter Brattain, and William Shockley delivered a vast improvement over the vacuum tube technology prevalent at the time. Transistors were faster, smaller, consumed less power, and, once production processes had been sufficiently optimized, were much more reliable than the failure-prone tubes.
The commercialization of integrated circuits in 1958, led by Jack Kilby of Texas Instruments, began the process of combining large numbers of formerly discrete components onto a single chip of silicon.
In 1971, Intel began production of the first commercially available microprocessor, the Intel 4004. The 4004 was intended for use in electronic calculators and was specialized to operate on 4-bit binary-coded decimal digits.

From the humble beginnings of the Intel 4004, microprocessor technology advanced rapidly over the ensuing decade by packing increasing numbers of circuit elements onto each chip and expanding the capabilities of the microprocessors implemented on those chips.
IBM released the IBM PC in 1981. The original PC contained an Intel 8088 microprocessor running at a clock frequency of 4.77 MHz and featured 16 KB of Random Access Memory (RAM), expandable to 256 KB. It included one or, optionally, two floppy disk drives. A color monitor was also available. Later versions of the PC supported more memory, but because portions of the address space had been reserved for video memory and Read-Only Memory (ROM), the architecture could support a maximum of 640 KB of RAM.
The 8088 contained 14 16-bit registers. Four were general-purpose registers (AX, BX, CX, and DX). Four were memory segment registers (CS, DS, SS, and ES) that extended the address space to 20 bits. Segment addressing functioned by adding a 16-bit segment register value, shifted left by 4 bit positions, to a 16-bit offset contained in an instruction to produce a physical memory address within a 1 MB range.
The remaining 8088 registers were the Stack Pointer (SP), the Base Pointer (BP), the Source Index (SI), the Destination Index (DI), the Instruction Pointer (IP), and the Status Flags (FLAGS). Modern x86 processors employ an architecture remarkably similar to this register set (Chapter 10, Modern Processor Architectures and Instruction Sets, will cover the details of the x86 architecture). The most obvious differences between the 8088 and x86 are the extension of the register widths to 32 bits in x86 and the addition of a pair of segment registers (FS and GS) that are used today primarily as data pointers in multithreaded operating systems.
The 8088 had an external data bus width of 8 bits, which meant it took two bus cycles to read or write a 16-bit value. This was a performance downgrade compared to the earlier 8086 processor, which employed a 16-bit external bus. However, the use of the 8-bit bus made the PC more economical to produce and provided compatibility with lower-cost 8-bit peripheral devices. This cost-sensitive design approach helped reduce the purchase price of the PC to a level accessible to more potential customers.
Program memory and data memory shared the same address space and the 8088 accessed memory over a single bus. In other words, the 8088 implemented the von Neumann architecture. The 8088 instruction set included instructions for data movement, arithmetic, logical operations, string manipulation, control transfer (conditional and unconditional jumps, and subroutine call and return), input/output, and additional miscellaneous functions. The processor required about 15 clock cycles per instruction on average, which at the PC’s 4.77 MHz clock corresponds to an execution speed of roughly 0.3 million instructions per second (MIPS).
The 8088 supported nine distinct modes for addressing memory. This variety of modes was needed to efficiently access a single item at a time as well as for iterating over sequences of data.
The segment registers in the 8088 architecture provided a seemingly clever way to expand the range of addressable memory without increasing the length of most instructions referencing memory locations. Each segment register allowed access to a 64-kilobyte block of memory beginning at a physical memory address defined at a multiple of 16 bytes. In other words, the 16-bit segment register represented a 20-bit base address with the lower four bits set to zero. Instructions could then reference any location within the 64-kilobyte segment using a 16-bit offset from the address defined by the segment register.
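The segment-to-physical address translation described here is simple enough to express directly. The following Python sketch (illustrative only; the function name is not from any Intel documentation) computes a 20-bit physical address from a 16-bit segment register value and a 16-bit offset:

```python
def physical_address(segment, offset):
    """Compute an 8088 physical address from a 16-bit segment and offset.

    The segment value is shifted left by 4 bits (multiplied by 16) and
    added to the offset, producing a 20-bit address within a 1 MB range.
    """
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF  # wrap to 20 bits

# Example: segment 0x1234 with offset 0x0010 addresses 0x12350
```

Note that because the segment base is only a multiple of 16 bytes, many different segment:offset pairs alias the same physical address (for instance, 0x1234:0x0005 and 0x1230:0x0045 both resolve to 0x12345).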
The CS register selected the code segment location in memory and was used in fetching instructions and performing jumps and subroutine calls and returns. The DS register defined the data segment location for use by instructions involving the transfer of data to and from memory. The SS register set the stack segment location, which was used for local memory allocation within subroutines and for storing subroutine return addresses.
Programs that required less than 64 kilobytes in each of the code, data, and stack segments could ignore the segment registers entirely because those registers could be set once at program startup (programming language compilers would do this automatically) and remain unchanged through execution. Easy!
Things got quite a bit more complicated when a program’s data size increased beyond 64 kilobytes. Though the use of segment registers resulted in a clean hardware design, those registers caused many headaches for software developers. Compilers for the 8088 architecture distinguished between near and far references to memory. A near pointer represented a 16-bit offset from the current segment register base address. A far pointer contained 32 bits of addressing information: a 16-bit segment register value and a 16-bit offset. Far pointers consumed an additional 16 bits of data memory and also required additional processing time.
Making single memory access using a far pointer involved the following steps:
Save the current segment register contents to a temporary location
Load the new segment value into the register
Access the data (reading or writing as needed) using an offset from the segment base
Restore the original segment register value

When using far pointers, it was possible to declare data objects (for example, an array of characters representing a document in a text editor) up to 64 KB in size. If you needed a larger structure, you had to work out how to break it into chunks no larger than 64 KB and manage them yourself. As a result of such segment register manipulations, programs that required extensive access to data items larger than 64 KB became quite complex and were susceptible to code size bloat and slower execution.
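The four steps above can be sketched as follows. This is a hypothetical Python model, not 8088 code: the `Machine` class, its `ds` attribute, and the `far_read` method are illustrative names standing in for the DS register and a 1 MB physical address space.

```python
class Machine:
    """A toy model of 8088-style segmented data access."""

    def __init__(self):
        self.memory = bytearray(1 << 20)  # 1 MB physical address space
        self.ds = 0x0000                  # data segment register

    def read(self, offset):
        """Read a byte at DS:offset, as a near-pointer access would."""
        return self.memory[((self.ds << 4) + offset) & 0xFFFFF]

    def far_read(self, segment, offset):
        """Read a byte through a far pointer (segment:offset)."""
        saved = self.ds            # 1. save the current segment register
        self.ds = segment          # 2. load the new segment value
        value = self.read(offset)  # 3. access the data using the offset
        self.ds = saved            # 4. restore the original segment register
        return value
```

The extra save/load/restore traffic around every far access is the additional processing time referred to above; a near read needs only the offset.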
The IBM PC motherboard contained a socket for an optional Intel 8087 floating-point coprocessor. The designers of the 8087 invented data formats and processing rules for 32-bit and 64-bit floating-point numbers that became enshrined in 1985 as the IEEE 754 floating-point standard, which remains in near-universal use today. The 8087 could perform about 50,000 floating-point operations per second. We will look at floating-point processing in detail in Chapter 9, Specialized Processor Extensions.
The second generation of the IBM PC, the PC AT, was released in 1984. AT stood for Advanced Technology, which referred to several significant enhancements over the original PC that mostly resulted from the use of the Intel 80286 processor.
Like the 8088, the 80286 was a 16-bit processor, and it maintained backward compatibility with the 8088: 8088 code could run unmodified on the 80286. The 80286 had a 16-bit external data bus, improving data access performance over the 8-bit bus of the 8088, and 24 address lines supporting a 16-megabyte address space. The instruction execution rate (instructions per clock cycle) was about double that of the 8088 in many applications, meaning that at the same clock speed, the 80286 would be twice as fast as the 8088. The original PC AT clocked the processor at 6 MHz and a later version operated at 8 MHz. The 6 MHz variant of the 80286 achieved an instruction execution rate of about 0.9 MIPS, roughly three times that of the 8088.
The 80286 implemented a protected virtual address mode intended to support multiuser operating systems and multitasking.
In protected mode, the processor enforced memory protection to ensure one user’s programs could not interfere with the operating system or with other users’ programs. This groundbreaking technological advance in personal computing remained little used for many years, mainly because of the prohibitive cost of adding sufficient memory to a computer system to make it useful in a multiuser, multitasking context.
Following the 80286, the next generation of the x86 processor line was the 80386, introduced in 1985. The 80386 was a 32-bit processor with support for a flat 32-bit memory model in protected mode. The flat memory model allowed programmers to address up to 4 GB directly, without the need to manipulate segment registers. Compaq introduced an IBM PC-compatible personal computer based on the 80386 called the DeskPro in 1986. The DeskPro shipped with a version of Microsoft Windows targeted to the 80386 architecture.
The 80386 maintained substantial backward compatibility with the 80286 and 8088 processors. The processor architecture implemented in the 80386 remains the current standard x86 architecture. We will examine this architecture in detail in Chapter 10, Modern Processor Architectures and Instruction Sets.
The initial version of the 80386 was clocked at 33 MHz and achieved about 11.4 MIPS. Modern implementations of the x86 architecture run several hundred times faster than the original as the result of higher clock speeds, performance enhancements, including the extensive use of multilevel cache memory, and more efficient instruction execution at the hardware level. We will examine the benefits of cache memory in Chapter 8, Performance-Enhancing Techniques.
In 2007, Steve Jobs introduced the iPhone to a world that had no idea it had any use for such a device. The iPhone built upon previous revolutionary advances from Apple Computer, including the Macintosh computer, released in 1984, and the iPod music player of 2001. The iPhone combined the functions of the iPod, a mobile telephone, and an internet-connected computer.
The iPhone did away with the hardware keyboard that was common on smartphones of the time and replaced it with a touchscreen capable of displaying an onscreen keyboard, or any other type of user interface. In addition to touches for selecting keyboard characters and pressing buttons, the screen supported multi-finger gestures for actions such as zooming a photo.
The iPhone ran the OS X operating system, the same OS used on the flagship Macintosh computers of the time.
This decision immediately enabled the iPhone to support a vast range of applications already developed for Macs and empowered software developers to rapidly introduce new applications tailored to the iPhone, after Apple began allowing third-party application development.
The iPhone 1 had a 3.5” screen with a resolution of 320x480 pixels. It was 0.46 inches thick (thinner than other smartphones), had a built-in 2-megapixel camera, and weighed 4.8 oz. A proximity sensor detected when the phone was held to the user’s ear and turned off screen illumination and touchscreen sensing during calls. It had an ambient light sensor to automatically set the screen brightness and an accelerometer that detected whether the screen was being held in portrait or landscape orientation.
The iPhone 1 included 128 MB of RAM and 4 GB, 8 GB, or 16 GB of flash memory, and supported Global System for Mobile communications (GSM) cellular communication, Wi-Fi (802.11b/g), and Bluetooth.
