Are you a software developer, systems designer, or computer architecture student looking for a methodical introduction to digital device architectures, but overwhelmed by the complexity of modern systems? This step-by-step guide will teach you how modern computer systems work, with the help of practical examples and exercises. You’ll gain insights into the internal behavior of processors down to the circuit level and will understand how the hardware executes code developed in high-level languages.
This book will teach you the fundamentals of computer systems, including transistors, logic gates, sequential logic, and instruction pipelines. You will learn the details of modern processor architectures and instruction sets, including x86, x64, ARM, and RISC-V. You will see how to implement a RISC-V processor on a low-cost FPGA board, and how to write a quantum computing program and run it on an actual quantum computer.
This edition has been updated to cover the architecture and design principles underlying the important domains of cybersecurity, blockchain and bitcoin mining, and self-driving vehicles.
By the end of this book, you will have a thorough understanding of modern processors and computer architecture and the future directions these technologies are likely to take.
The e-book can be read in Legimi apps or any app that supports the following format:
Page count: 991
Year of publication: 2022
Modern Computer Architecture and Organization
Second Edition
Learn x86, ARM, and RISC-V architectures and the design of smartphones, PCs, and cloud servers
Jim Ledin
BIRMINGHAM—MUMBAI
Modern Computer Architecture and Organization
Second Edition
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Senior Publishing Product Manager: Denim Pinto
Acquisition Editor – Peer Reviews: Saby Dsilva
Project Editor: Namrata Katare
Content Development Editor: Edward Doxey
Copy Editor: Safis Editing
Technical Editor: Tejas Mhasvekar
Proofreader: Safis Editing
Indexer: Subalakshmi Govindhan
Presentation Designer: Ganesh Bhadwalkar
First published: April 2020
Second edition: May 2022
Production reference: 2260422
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80323-451-9
www.packt.com
I am a software developer, not a hardware engineer. I have spent my career building software of many kinds to solve many different kinds of problems. However, through some quirk of accident or fate, I have spent a fair amount of my software development career closer to the hardware than many, perhaps most, software developers do these days.
In the early years of my fascination with computers, I quickly discovered that the devices I had access to, incredibly crude by today’s standards, couldn’t really do anything very interesting unless I learned how to program them in assembler. So I learned Z80 assembler, and later 6502 and 80x86 assembler.
Programming in assembler differs in many ways from programming in higher-level languages. It immediately puts you next to the hardware. You can’t ignore how memory is laid out; you must adjust your code to suit it. You can’t ignore the registers at your disposal; they are your variables, and you must marshal them carefully. You also learn how to communicate with other devices through I/O ports, which is, ultimately, the only way that digital devices communicate with each other. Once, when working on a particularly tricky problem, I woke up in the middle of the night and realised that I had been dreaming in 80x86 assembly language.
My career, and more importantly the hardware I had access to, developed. I got what was, at the time, my dream job, working in the R&D division of a computer manufacturer. I worked on enhancing operating systems to work with our hardware and built device drivers to take advantage of some of the unique features of our PCs. Again, it was essential in this kind of work to have a good working knowledge of how the hardware worked.
Software development evolved. The languages that we used became more abstract, and the operating systems, virtual machines, containers, and public cloud infrastructure increasingly hid the details of the underlying hardware from us software developers. I recently spoke to a LISP programmer on social media who didn’t realise that, ultimately, his lovely functional, declarative structures got translated into opcodes and values in the registers of a CPU. He seemingly had no working model for how the computers that he relied upon worked. He didn’t have to, but I think he would be a better programmer if he did.
In the latter part of my career, I worked on some world-class high-performance systems. A team I led was tasked with building one of the world’s highest-performance financial exchanges.
To do so, we once again needed to dig in and really understand how the underlying hardware of our system worked. This allowed us to take full advantage of the staggering levels of performance that modern hardware is capable of. During this time, we stole a term from motor racing to describe our approach. In the 1970s, the best Formula 1 racing driver was Jackie Stewart. When an interviewer asked him, “Do you need to be an engineer to be a great racing driver?”, Jackie responded, “No, but you must have Mechanical Sympathy for the car.” In essence, you need to understand the capabilities of the underlying hardware to take advantage of them.
We adopted this idea of Mechanical Sympathy and applied it to our work. For example, the biggest cost in our trading system was a cache miss. If the data we wanted to process wasn’t in the appropriate cache when it was needed, we’d see orders of magnitude wiped off the performance of our system. So we needed to design our code, even though it was written in a high-level language running in a virtual machine, to maximise the chances that the data was in the cache. We needed to understand and manage the concurrency in our multi-core processors, and to recognise and take advantage of things like processor cache lines and the essentially block-oriented nature of memory and other storage devices. The result was levels of performance that some people didn’t think possible. Modern hardware is very impressive when you take advantage of it.
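A minimal sketch of this cache-friendliness idea, not taken from the book: the same sum computed over a two-dimensional array in two traversal orders. In a language with contiguous arrays, such as C, the row-major walk uses every element of each cache line it fetches, while the column-major walk strides across rows and wastes most of each line; Python’s object indirection blunts the effect, but the access patterns it illustrates are the same.

```python
# Two traversals of the same data. In CPython, each row of a list-of-lists
# is a contiguous array of references, so walking a row left to right
# touches consecutive addresses (cache-line friendly), while walking down
# a column hops to a distant row on every access.

def sum_row_major(a):
    total = 0.0
    for row in a:              # consecutive addresses within each row
        for x in row:
            total += x
    return total

def sum_column_major(a):
    total = 0.0
    rows, cols = len(a), len(a[0])
    for j in range(cols):      # one element per row, then jump to the next row
        for i in range(rows):
            total += a[i][j]
    return total

N = 500
a = [[1.0] * N for _ in range(N)]
# Both orders compute the same answer; only the memory access pattern differs.
assert sum_row_major(a) == sum_column_major(a) == float(N * N)
```

Timing the two functions with `timeit` on an array large enough to spill out of the last-level cache shows the gap directly, and the difference is starker still in lower-level languages where the array really is one contiguous block of memory.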
This interest in the hardware isn’t just for high-performance computing. Estimates vary, but all agree that a significant fraction of the carbon that we emit as a species comes from powering the data centres where our code lives. I can’t think of any field of human endeavor as inefficient as software: for most systems, a speed increase of up to 100 times is achievable with just a bit more work to manage the flow of information through the hardware, and many systems could attain a 1,000-fold increase with more focused effort. Even if we gained only a 10x improvement by better understanding how our code works and how it uses the hardware that it operates on, we could reduce the carbon footprint of computing by a factor of 10 too. That is an idea that is much more important than performance for performance’s sake.
Ultimately, there is a degree to which you must understand how your computer works, and there are risks to losing touch with how the hardware we all depend upon functions. I confess that I am a nerd. I love to understand how things work. Not everyone needs to push hardware to its limits, but it is a bad idea to treat it like magic, because it’s not magic. It is engineering, and engineering is always about trade-offs. You will be surprised how often the fundamentals of how your hardware works leak out and affect the behaviour of your software, however far up the stack it sits, even if we are writing cloud-based systems in LISP.
For someone like me, Jim Ledin’s Modern Computer Architecture and Organization, Second Edition, is a delight.
I am not a hardware engineer, and I don’t want to be. For me, though, a vital part of my skills as a software developer is having a good working model for how the hardware that I rely upon actually works. I want to maintain, and build, mechanical sympathy.
This book takes us from the basic concepts of computation, looking at the first computers and the first CPUs, to the potential of quantum computing and other near-future directions that our hardware will probably exploit. You might want to understand how modern processors work, and to get to grips with their staggering efficiency and their ability to keep themselves fed with data from stores that are hundreds of times slower than they are. You may also be interested in complex ideas that extend beyond the confines of the hardware alone, such as how cryptocurrency mining works, or what the architecture of a modern self-driving car looks like. This book can answer those questions and many, many more.
I think that it is not just computer scientists and engineers, but every software developer, who will be better at their job when they have some understanding of how the devices that they use in their everyday work actually function. When trying to understand something big and complicated in software, I still frequently think, “Well, it’s all just bits, bytes, and opcodes really, so what is going on here?” This is the equivalent of a chemist understanding molecules and compounds and being able to go back to first principles to solve something tricky. These are the real building blocks, and it can help us all to understand them better.
I know that I will be dipping into this book on a regular basis for years to come, and I hope that you enjoy doing the same.
Dave Farley
Independent Software Engineering Consultant and Founder of Continuous Delivery Ltd
Jim Ledin is the CEO of Ledin Engineering, Inc. Jim is an expert in embedded software and hardware design and testing. He is also an expert in system cybersecurity assessment and penetration testing. He has a B.S. degree in aerospace engineering from Iowa State University and an M.S. degree in electrical and computer engineering from the Georgia Institute of Technology. Jim is a registered professional electrical engineer in California, a Certified Information Systems Security Professional (CISSP), a Certified Ethical Hacker (CEH), and a Certified Penetration Tester (CPT).
I would like to thank my wife Lynda and daughter Emily for their patience and support as I focused on this project. I love you, my sweeties!
I would also like to thank Dr. Sarah M. Neuwirth and Iztok Jeras for their diligent work reviewing each of these chapters. Your input has helped create a much better book!
Thank you as well to Dave Farley for providing such an eloquent foreword.
Dr. Sarah M. Neuwirth is a postdoctoral research associate in the Modular Supercomputing and Quantum Computing Group at Goethe-University Frankfurt, Germany. She also holds a visiting researcher position at the Jülich Supercomputing Centre, Germany. Sarah has more than 9 years of experience in the academic field. Her research interests include high-performance storage systems, parallel I/O and file systems, modular supercomputing (i.e., resource disaggregation and virtualization), standardized cluster benchmarking, high-performance computing and networking, distributed systems, and communication protocols.
In addition, Sarah has designed the curriculum for, and taught, courses in parallel computer architecture, high-performance interconnection networks, and distributed systems for the past 9 years. In 2018, Sarah was awarded her Ph.D. in computer science from Heidelberg University, Germany. She defended her dissertation with highest honors (summa cum laude) and was the recipient of the ZONTA Science Award 2019 for her outstanding dissertation. Sarah also holds an M.Sc. degree (2012) and a B.Sc. degree (2010) in computer science and mathematics from the University of Mannheim, Germany. She has served as a technical reviewer for several prestigious HPC-related conferences and journals, including the IEEE/ACM SC Conference Series, ACM ICPP, IEEE IPDPS, IEEE HPCC, the PDSW workshop at IEEE/ACM SC, the PERMAVOST workshop at ACM HPDC, ACM TOCS, IEEE Access, and Elsevier’s FGCS.
Iztok Jeras obtained a bachelor’s degree in electrical engineering and a master’s in computer science at the University of Ljubljana. He has worked at several Slovenian companies on microcontroller, FPGA, and ASIC designs, with some embedded software and Linux IT work mixed in. In his spare time, he researches cellular automata and contributes to digital design-related open source projects. Recently, he has been focusing on the RISC-V ISA.
Join the book’s Discord workspace for a monthly Ask me Anything session with the author: https://discord.gg/7h8aNRhRuY
Preface
Who this book is for
What this book covers
To get the most out of this book
Get in touch
Introducing Computer Architecture
Technical requirements
The evolution of automated computing devices
Charles Babbage’s Analytical Engine
ENIAC
IBM PC
The Intel 8088 microprocessor
The Intel 80286 and 80386 microprocessors
The iPhone
Moore’s law
Computer architecture
Representing numbers with voltage levels
Binary and hexadecimal numbers
The 6502 microprocessor
The 6502 instruction set
Summary
Exercises
Digital Logic
Technical requirements
Electrical circuits
The transistor
Logic gates
Latches
Flip-flops
Registers
Adders
Propagation delay
Clocking
Sequential logic
Hardware description languages
VHDL
Summary
Exercises
Processor Elements
Technical requirements
A simple processor
Control unit
Executing an instruction – a simple example
Arithmetic logic unit
Registers
The instruction set
Addressing modes
Immediate addressing mode
Absolute addressing mode
Absolute indexed addressing mode
Indirect indexed addressing mode
Instruction categories
Memory load and store instructions
Register-to-register data transfer instructions
Stack instructions
Arithmetic instructions
Logical instructions
Branching instructions
Subroutine call and return instructions
Processor flag instructions
Interrupt-related instructions
No operation instruction
Interrupt processing
IRQ processing
NMI processing
BRK instruction processing
Input/output operations
Programmed I/O
Interrupt-driven I/O
Direct memory access
Summary
Exercises
Computer System Components
Technical requirements
Memory subsystem
Introducing the MOSFET
Constructing DRAM circuits with MOSFETs
The capacitor
The DRAM bit cell
DDR5 SDRAM
Graphics DDR
Prefetching
I/O subsystem
Parallel and serial data buses
PCI Express
SATA
M.2
USB
Thunderbolt
Graphics displays
VGA
DVI
HDMI
DisplayPort
Network interface
Ethernet
Wi-Fi
Keyboard and mouse
Keyboard
Mouse
Modern computer system specifications
Summary
Exercises
Hardware-Software Interface
Technical requirements
Device drivers
The parallel port
PCIe device drivers
Device driver structure
BIOS
UEFI
The boot process
BIOS boot
UEFI boot
Trusted boot
Embedded devices
Operating systems
Processes and threads
Scheduling algorithms and process priority
Multiprocessing
Summary
Exercises
Specialized Computing Domains
Technical requirements
Real-time computing
Real-time operating systems
Digital signal processing
ADCs and DACs
DSP hardware features
Signal processing algorithms
Convolution
Digital filtering
Fast Fourier transform (FFT)
GPU processing
GPUs as data processors
Big data
Deep learning
Examples of specialized architectures
Summary
Exercises
Processor and Memory Architectures
Technical requirements
The von Neumann, Harvard, and modified Harvard architectures
The von Neumann architecture
The Harvard architecture
The modified Harvard architecture
Physical and virtual memory
Paged virtual memory
Page status bits
Memory pools
Memory management unit
Summary
Exercises
Performance-Enhancing Techniques
Technical requirements
Cache memory
Multilevel processor caches
Static RAM
Level 1 cache
Direct-mapped cache
Set associative cache
Processor cache write policies
Level 2 and level 3 processor caches
Instruction pipelining
Superpipelining
Pipeline hazards
Micro-operations and register renaming
Conditional branches
Simultaneous multithreading
SIMD processing
Summary
Exercises
Specialized Processor Extensions
Technical requirements
Privileged processor modes
Handling interrupts and exceptions
Protection rings
Supervisor mode and user mode
System calls
Floating-point arithmetic
The 8087 floating-point coprocessor
The IEEE 754 floating-point standard
Power management
Dynamic voltage frequency scaling
System security management
Trusted Platform Module
Thwarting cyberattackers
Summary
Exercises
Modern Processor Architectures and Instruction Sets
Technical requirements
x86 architecture and instruction set
The x86 register set
x86 addressing modes
Implied addressing
Register addressing
Immediate addressing
Direct memory addressing
Register indirect addressing
Indexed addressing
Based indexed addressing
Based indexed addressing with scaling
x86 instruction categories
Data movement
Stack manipulation
Arithmetic and logic
Conversions
Control flow
String manipulation
Flag manipulation
Input/output
Protected mode
Miscellaneous instructions
Other instruction categories
Common instruction patterns
x86 instruction formats
x86 assembly language
x64 architecture and instruction set
The x64 register set
x64 instruction categories and formats
x64 assembly language
32-bit ARM architecture and instruction set
The ARM register set
ARM addressing modes
Immediate
Register direct
Register indirect
Register indirect with offset
Register indirect with offset, pre-incremented
Register indirect with offset, post-incremented
Double register indirect
Double register indirect with scaling
ARM instruction categories
Load/store
Stack manipulation
Register movement
Arithmetic and logic
Comparisons
Control flow
Supervisor mode
Breakpoint
Conditional execution
Other instruction categories
32-bit ARM assembly language
64-bit ARM architecture and instruction set
64-bit ARM assembly language
Summary
Exercises
The RISC-V Architecture and Instruction Set
Technical requirements
The RISC-V architecture and applications
The RISC-V base instruction set
Computational instructions
Control flow instructions
Memory access instructions
System instructions
Pseudo-instructions
Privilege levels
RISC-V extensions
The M extension
The A extension
The C extension
The F and D extensions
Other extensions
RISC-V variants
64-bit RISC-V
Standard RISC-V configurations
RISC-V assembly language
Implementing RISC-V in an FPGA
Summary
Exercises
Processor Virtualization
Technical requirements
Introducing virtualization
Types of virtualization
Operating system virtualization
Application virtualization
Network virtualization
Storage virtualization
Categories of processor virtualization
Trap-and-emulate virtualization
Paravirtualization
Binary translation
Hardware emulation
Virtualization challenges
Unsafe instructions
Shadow page tables
Security
Virtualizing modern processors
x86 processor virtualization
x86 hardware virtualization
ARM processor virtualization
RISC-V processor virtualization
Virtualization tools
VirtualBox
VMware Workstation
VMware ESXi
KVM
Xen
QEMU
Virtualization and cloud computing
Electrical power consumption
Summary
Exercises
Domain-Specific Computer Architectures
Technical requirements
Architecting computer systems to meet unique requirements
Smartphone architecture
iPhone 13 Pro Max
Personal computer architecture
Alienware Aurora Ryzen Edition R10 gaming desktop
Ryzen 9 5950X branch prediction
Nvidia GeForce RTX 3090 GPU
Aurora subsystems
Warehouse-scale computing architecture
WSC hardware
Rack-based servers
Hardware fault management
Electrical power consumption
The WSC as a multilevel information cache
Deploying a cloud application
Neural networks and machine learning architectures
Intel Nervana neural network processor
Summary
Exercises
Cybersecurity and Confidential Computing Architectures
Technical requirements
Cybersecurity threats
Cybersecurity threat categories
Cyberattack techniques
Types of malware
Post-exploitation actions
Features of secure hardware
Identify what needs to be protected
Anticipate all types of attacks
Features of secure system design
Secure key storage
Encryption of data at rest
Encryption of data in transit
Cryptographically secure key generation
Secure boot procedure
Tamper-resistant hardware design
Confidential computing
Designing for security at the architectural level
Avoid security through obscurity
Comprehensive secure design
The principle of least privilege
Zero trust architecture
Ensuring security in system and application software
Common software weaknesses
Buffer overflow
Cross-site scripting
SQL injection
Path traversal
Source code security scans
Summary
Exercises
Blockchain and Bitcoin Mining Architectures
Technical requirements
Introduction to blockchain and bitcoin
The SHA-256 hash algorithm
Computing SHA-256
Bitcoin core software
The bitcoin mining process
Bitcoin mining pools
Mining with a CPU
Mining with a GPU
Bitcoin mining computer architectures
Mining with FPGAs
Mining with ASICs
Bitcoin mining economics
Alternative types of cryptocurrency
Summary
Exercises
Self-Driving Vehicle Architectures
Technical requirements
Overview of self-driving vehicles
Driving autonomy levels
Safety concerns of self-driving vehicles
Hardware and software requirements for self-driving vehicles
Sensing vehicle state and the surroundings
GPS, speedometer, and inertial sensors
Video cameras
Radar
Lidar
Sonar
Perceiving the environment
Convolutional neural networks
Example CNN implementation
CNNs in autonomous driving applications
Lidar localization
Object tracking
Decision processing
Lane keeping
Complying with the rules of the road
Avoiding objects
Planning the vehicle path
Autonomous vehicle computing architecture
Tesla HW3 Autopilot
Summary
Exercises
Quantum Computing and Other Future Directions in Computer Architectures
Technical requirements
The ongoing evolution of computer architectures
Extrapolating from current trends
Moore’s law revisited
The third dimension
Increased device specialization
Potentially disruptive technologies
Quantum physics
Spintronics
Quantum computing
Quantum code-breaking
Adiabatic quantum computation
The future of quantum computing
Carbon nanotubes
Building a future-tolerant skill set
Continuous learning
College education
Conferences and literature
Summary
Exercises
Appendix
Answers to Exercises
Chapter 1: Introducing Computer Architecture
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 2: Digital Logic
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 3: Processor Elements
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Chapter 4: Computer System Components
Exercise 1
Answer
Exercise 2
Answer
Chapter 5: Hardware-Software Interface
Exercise 1
Answer
Exercise 2
Answer
Chapter 6: Specialized Computing Domains
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 7: Processor and Memory Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 8: Performance-Enhancing Techniques
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 9: Specialized Processor Extensions
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Exercise 7
Answer
Exercise 8
Answer
Chapter 10: Modern Processor Architectures and Instruction Sets
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Exercise 5
Answer
Exercise 6
Answer
Exercise 7
Answer
Exercise 8
Answer
Chapter 11: The RISC-V Architecture and Instruction Set
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 12: Processor Virtualization
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 13: Domain-Specific Computer Architectures
Exercise 1
Answer
Exercise 2
Answer
Chapter 14: Cybersecurity and Confidential Computing Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Chapter 15: Blockchain and Bitcoin Mining Architectures
Exercise 1
Answer
Exercise 2
Answer
Chapter 16: Self-Driving Vehicle Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Chapter 17: Future Directions in Computer Architectures
Exercise 1
Answer
Exercise 2
Answer
Exercise 3
Answer
Exercise 4
Answer
Other Books You May Enjoy
Index
Welcome to the second edition of Modern Computer Architecture and Organization. It has been my pleasure to receive a great deal of feedback and comments from readers of the first edition. Of course, I appreciate all input from readers, especially those who bring any errors and omissions to my attention.
This book presents the key technologies and components employed in modern processor and computer architectures and discusses how various architectural decisions result in computer configurations optimized for specific needs.
To understate the situation quite drastically, modern computers are complicated devices. Yet, when viewed in a hierarchical manner, the functions of each level of complexity become clear. We will cover a great many topics in these chapters and will only have space to explore each of them to a limited degree. My goal is to provide a coherent introduction to each important technology and subsystem you might find in a modern computing device and explain its relationship to other system components.
This edition includes updates on technologies that have advanced since the publication of the first edition and adds significant new content in several important areas related to computer architecture. New chapters cover the topics of cybersecurity, blockchain and bitcoin mining, and self-driving vehicle computing architectures.
While the security of computing systems has always been important, recent exploitations of major vulnerabilities in widely used operating systems and applications have resulted in substantial negative impacts felt in countries around the world. These cyberattacks have accentuated the need for computer system designers to incorporate cybersecurity as a foundational element of system architecture.
I will not be providing a lengthy list of references for further reading. The internet is your friend in this regard.
If you can manage to bypass the clamor of political and social media argumentation on the internet, you will find yourself in an enormous, cool, quiet library containing a vast quantity of accumulated human knowledge. Learn to use the advanced features of your favorite search engine. Also, learn to differentiate high-quality information from uninformed opinion. Check multiple sources if you have any doubts about the information you’re finding. Consider the source: if you are looking for information about an Intel processor, search for documentation published by Intel.
By the end of this book, you will have gained a strong grasp of the computer architectures currently used in a wide variety of digital systems. You will also have developed an understanding of the relevant trends in architectural technology currently underway, as well as some possible disruptive advances in the coming years that may drastically influence the architectural development of computing systems.
This book is intended for software developers, computer engineering students, system designers, computer science professionals, reverse engineers, and anyone else seeking to understand the architecture and design principles underlying all types of modern computer systems, from tiny embedded devices to smartphones to warehouse-sized cloud server farms. Readers will also explore the directions these technologies are likely to take in the coming years. A general understanding of computer processors is helpful but is not required.
Chapter 1, Introducing Computer Architecture, begins with a brief history of automated computing devices and describes the significant technological advances that drove leaps in capability. This is followed by a discussion of Moore’s law, with an assessment of its applicability over previous decades and the implications for the future. The basic concepts of computer architecture are introduced in the context of the 6502 microprocessor.
Chapter 2, Digital Logic, introduces transistors as switching elements and explains their use in constructing logic gates. We will then see how flip-flops and registers are developed by combining simple gates. The concept of sequential logic, meaning logic that contains state information, is introduced, and the chapter ends with a discussion of clocked digital circuits.
Chapter 3, Processor Elements, begins with a conceptual description of a generic processor. We will examine the concepts of the instruction set, register set, and instruction loading, decoding, execution, and sequencing.
Memory load and store operations are also discussed. The chapter includes a description of branching instructions and their use in looping and conditional processing. Some practical considerations are introduced that lead to the necessity for interrupt processing and I/O operations.
Chapter 4, Computer System Components, discusses computer memory and its interface to the processor, including multilevel caching. I/O requirements, including interrupt handling, buffering, and dedicated I/O processors, are described. We will discuss some specific requirements for I/O devices, including the keyboard and mouse, the video display, and the network interface. The chapter ends with descriptive examples of these components in modern computer applications, including smart mobile devices, personal computers, gaming systems, cloud servers, and dedicated machine learning systems.
Chapter 5, Hardware-Software Interface, discusses the implementation of the high-level services a computer operating system must provide, including disk I/O, network communications, and interactions with users. This chapter describes the software layers that implement these features, starting at the level of the processor instruction set and registers. Operating system functions, including booting, multiprocessing, and multithreading, are also described.
Chapter 6, Specialized Computing Domains, explores domains of computing that tend to be less directly visible to most users, including real-time systems, digital signal processing, and GPU processing. We will discuss the unique requirements associated with each of these domains and look at examples of modern devices implementing these features.
Chapter 7, Processor and Memory Architectures, takes an in-depth look at modern processor architectures, including the von Neumann, Harvard, and modified Harvard variants. The chapter discusses the implementation of paged virtual memory. The practical implementation of memory management functionality within the computer architecture is introduced and the functions of the memory management unit are described.
Chapter 8, Performance-Enhancing Techniques, discusses a number of performance-enhancing techniques used routinely to reach peak execution speed in real-world computer systems. The most important techniques for improving system performance, including the use of cache memory, instruction pipelining, instruction parallelism, and SIMD processing, are the subjects of this chapter.
Chapter 9, Specialized Processor Extensions, focuses on extensions commonly implemented at the processor instruction set level to provide additional system capabilities beyond generic data processing requirements. The extensions presented include privileged processor modes, floating-point mathematics, power management, and system security management.
Chapter 10, Modern Processor Architectures and Instruction Sets, examines the architectures and instruction set features of modern processor designs, including the x86, x64, and ARM processors. One challenge that arises when producing a family of processors over several decades is the need to maintain backward compatibility with code written for earlier-generation processors. The need for legacy support tends to increase the complexity of the later-generation processors. This chapter will examine some of the attributes of these processor architectures that result from supporting legacy requirements.
Chapter 11, The RISC-V Architecture and Instruction Set, introduces the exciting new RISC-V (pronounced risk five) processor architecture and its instruction set. RISC-V is a completely open source, free-to-use specification for a reduced instruction set computer architecture. A complete instruction set specification has been released and a number of hardware implementations of this architecture are currently available. Work is ongoing to develop specifications for a number of instruction set extensions. This chapter covers the features and variants available in the RISC-V architecture and introduces the RISC-V instruction set. We will also discuss the applications of the RISC-V architecture in mobile devices, personal computers, and servers.
Chapter 12, Processor Virtualization, introduces the concepts involved in processor virtualization and explains the many benefits resulting from the use of virtualization. The chapter includes examples of virtualization based on open source tools and operating systems. These tools enable the execution of instruction set-accurate representations of various computer architectures and operating systems on a general-purpose computer. We will also discuss the benefits of virtualization in the development and deployment of real-world software applications.
Chapter 13, Domain-Specific Computer Architectures, brings together the topics discussed in previous chapters to develop an approach for architecting a computer system design to meet unique user requirements. We will discuss some specific application categories, including mobile devices, personal computers, gaming systems, internet search engines, and neural networks.
Chapter 14, Cybersecurity and Confidential Computing Architectures, focuses on the security needs of critical application areas like national security systems and financial transaction processing. These systems must be resilient against a broad range of cybersecurity threats, including malicious code, covert channel attacks, and attacks enabled by physical access to computing hardware. Topics addressed in this chapter include cybersecurity threats, encryption, digital signatures, and secure hardware and software design.
The explosion of interest in cryptocurrencies and their growing acceptance by mainstream financial institutions and retailers demonstrate that this area of computing is on a continued growth path. This edition adds a chapter on blockchain and the computational demands of bitcoin mining.
Chapter 15, Blockchain and Bitcoin Mining Architectures, introduces the concepts associated with blockchain, a public, cryptographically secured ledger recording a sequence of transactions. We continue with an overview of the process of bitcoin mining, which appends transactions to the bitcoin blockchain and rewards those who complete this task with payment in the form of bitcoin. Bitcoin processing requires high-performance computing hardware, which is illustrated in terms of a current-generation bitcoin mining computer architecture.
The continuing growth in the number of automobiles with partial or full self-driving capabilities demands robust, highly capable computing systems that meet the requirements for safe autonomous vehicle operation on public roadways.
Chapter 16, Self-Driving Vehicle Architectures, describes the capabilities required in self-navigating vehicle processing architectures. It begins with a discussion of the requirements for ensuring the safety of the autonomous vehicle and its occupants, as well as for other vehicles, pedestrians, and stationary objects. We continue with a discussion of the types of sensors and data a self-driving vehicle receives as input while driving and a description of the types of processing required for effective vehicle control. The chapter concludes with an overview of an example self-driving computer architecture.
Chapter 17, Quantum Computing and Other Future Directions in Computer Architectures, looks at the road ahead for computer architectures. This chapter reviews the significant advances and ongoing trends that have resulted in the current state of computer architectures and extrapolates these trends in possible future directions. Potentially disruptive technologies are discussed that could alter the path of future computer architectures. In closing, I will propose some approaches for professional development for the computer architect that should result in a future-tolerant skill set.
As in the other chapters, each of the three new chapters contains end-of-chapter exercises designed to broaden your understanding of the chapter topic and cement the information from the chapter within your knowledge base.
I hope you enjoy this updated edition as much as I have enjoyed developing it. Happy reading!
Each chapter in this book includes a set of exercises at the end. To get the most from the book, and to cement some of the more challenging concepts in your mind, I recommend you try to work through each exercise. Complete solutions to all exercises are provided in the book and are available online at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition.
In case there is a need to update the code examples and answers to the exercises, updates will appear at this GitHub repository.
The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781803234519_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “Subtraction using the SBC instruction tends to be a bit more confusing to novice 6502 assembly language programmers.”
A block of code is set as follows:
; Add four bytes together using immediate addressing mode
LDA #$04
CLC
ADC #$03
ADC #$02
ADC #$01

Any command-line input or output is written as follows:
C:\>bcdedit

Windows Boot Manager
--------------------
identifier    {bootmgr}

Bold: Indicates a new term, an important word, or words that you see on the screen, for example, in menus or dialog boxes. For example: “Because there are now four sets, the Set field in the physical address reduces to two bits and the Tag field increases to 24 bits.”
Warnings or important notes appear like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome.
General feedback: Email [email protected], and mention the book’s title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit http://www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit http://authors.packtpub.com.
Once you’ve read Modern Computer Architecture and Organization, Second Edition, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
The architectures of automated computing systems have evolved from the first mechanical calculators constructed nearly two centuries ago to the broad array of modern electronic computer technologies we use directly and indirectly every day. Along the way, there have been stretches of incremental technological improvement interspersed with disruptive advances that drastically altered the trajectory of the industry. We can expect these trends to continue in the coming years.
In the 1980s, during the early days of personal computing, students and technical professionals eager to learn about computer technology had a limited range of subject matter available for this purpose. If they had a computer of their own, it was probably an IBM PC or an Apple II. If they worked for an organization with a computing facility, they might have used an IBM mainframe or a Digital Equipment Corporation VAX minicomputer. These examples, and a limited number of similar systems, encompassed most people’s exposure to the computer systems of the time.
Today, numerous specialized computing architectures exist to address widely varying user needs. We carry miniature computers in our pockets and purses that can place phone calls, record video, and function as full participants on the internet. Personal computers remain popular in a format outwardly similar to the PCs of past decades. Today’s PCs, however, are orders of magnitude more capable than the early generations in terms of computing power, memory size, disk space, graphics performance, and communication ability. These capabilities enable modern PCs to easily perform tasks that would have been inconceivable on early PCs, such as the real-time generation of high-resolution 3D images.
Companies offering web services to hundreds of millions of users construct vast warehouses filled with thousands of tightly coordinated computer systems capable of responding to a constant stream of user requests with extraordinary speed and precision. Machine learning systems are trained through the analysis of enormous quantities of data to perform complex activities such as driving automobiles.
This chapter begins with a presentation of some key historical computing devices and the leaps in technology associated with them. We will then examine some significant modern-day trends related to technological advances and introduce the basic concepts of computer architecture, including a close look at the 6502 microprocessor and its instruction set. The following topics will be covered in this chapter:
The evolution of automated computing devices
Moore’s law
Computer architecture

Files for this chapter, including answers to the exercises, are available at https://github.com/PacktPublishing/Modern-Computer-Architecture-and-Organization-Second-Edition.
This section reviews some classic machines from the history of automated computing devices and focuses on the major advances each embodied. Babbage’s Analytical Engine is included here because of the many leaps of genius represented in its design. The other systems are discussed because they embodied significant technological advances and performed substantial real-world work over their lifetimes.
Although a working model of the Analytical Engine was never constructed, the detailed notes Charles Babbage developed from 1834 until his death in 1871 described a computing architecture that appeared to be both workable and complete. The Analytical Engine was intended to serve as a general-purpose programmable computing device. The design was entirely mechanical and was to be constructed largely of brass. The Analytical Engine was designed to be driven by a shaft powered by a steam engine.
Borrowing from the punched cards of the Jacquard loom, the rotating studded barrels used in music boxes, and the technology of his earlier Difference Engine (also never completed in his lifetime, and more of a specialized calculating device than a computer), the Analytical Engine’s design was, otherwise, Babbage’s original creation.
Unlike most modern computers, the Analytical Engine represented numbers in signed decimal form. The decision to use base-10 numbers rather than the base-2 logic of most modern computers was the result of a fundamental difference between mechanical technology and digital electronics. It is straightforward to construct mechanical wheels with 10 positions, so Babbage chose the human-compatible base-10 format because it was not significantly more technically challenging than using some other number base. Simple digital circuits, on the other hand, are not capable of maintaining 10 different states with the ease of a mechanical wheel.
All numbers in the Analytical Engine consisted of 40 decimal digits. The large number of digits was likely chosen to reduce problems with numerical overflow. The Analytical Engine did not support floating-point mathematics.
Each number was stored on a vertical axis containing 40 wheels, with each wheel capable of resting in 10 positions corresponding to the digits 0-9. A 41st number wheel contained the sign: any even number on this wheel represented a positive sign, and any odd number represented a negative sign. The Analytical Engine axis was somewhat analogous to the register used in modern processors, except the readout of an axis was destructive—reading an axis would set it to 0. If it was necessary to retain an axis’s value after it had been read, another axis had to store a copy of the value during the readout. Numbers were transferred from one axis to another, or used in computations, by engaging a gear with each digit wheel and rotating the wheel to extract the numerical value. The set of axes serving as system memory was referred to as the store.
The addition of two numbers used a process somewhat similar to the method of addition taught to schoolchildren. Assume a number stored on one axis, let’s call it the addend, was to be added to a number on another axis that we will call the accumulator. The machine would connect each addend digit wheel to the corresponding accumulator digit wheel through a train of gears. It would then simultaneously rotate each addend digit downward to 0 while driving the accumulator digit an equivalent rotation in the increasing direction. If an accumulator digit wrapped around from 9 to 0, the next most significant accumulator digit would increment by 1. This carry operation would propagate across as many digits as needed (think of adding 1 to 999,999). By the end of the process, the addend axis would hold the value 0 and the accumulator axis would hold the sum of the two numbers. The propagation of carries from one digit to the next was the most mechanically complex part of the addition process.
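As a rough illustration (not a model of Babbage's actual mechanism), the digit-by-digit addition process described above can be sketched in Python: each axis is a list of decimal digit wheels, readout of the addend is destructive, and carries propagate from one wheel to the next. The function and variable names here are purely illustrative.

```python
DIGITS = 40  # each Analytical Engine number held 40 decimal digits

def to_axis(n):
    """Store a non-negative integer on an axis (list of digit wheels,
    least significant digit first)."""
    return [(n // 10**i) % 10 for i in range(DIGITS)]

def from_axis(axis):
    """Read the integer value currently represented by an axis."""
    return sum(d * 10**i for i, d in enumerate(axis))

def add_axes(addend, accumulator):
    """Add the addend axis into the accumulator axis, digit by digit.

    The addend reads out destructively (all of its wheels return to 0)
    and the accumulator ends up holding the sum, with carries propagated
    across as many digits as needed.
    """
    carry = 0
    for i in range(DIGITS):
        total = accumulator[i] + addend[i] + carry
        accumulator[i] = total % 10   # a wheel passing 9 wraps back to 0
        carry = total // 10           # the wrap increments the next wheel
        addend[i] = 0                 # destructive readout: wheel driven to 0
    return addend, accumulator
```

Adding 1 to 999,999, for example, propagates a carry across seven digit positions, which is exactly the mechanically complex case noted above.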
Operations in the Analytical Engine were sequenced by music box-like rotating barrels in a construct called the mill, which is analogous to the control unit of a modern CPU.
Each Analytical Engine instruction was encoded in a vertical row of locations on the barrel, where the presence or absence of a stud at a particular location either engaged a section of the Engine’s machinery or left the state of that section unchanged. Based on Babbage’s estimate of the Engine’s execution speed, the addition of two 40-digit numbers, including the propagation of carries, would take about 3 seconds.
Babbage conceived several important concepts for the Engine that remain relevant to modern computer systems. His design supported a degree of parallel processing consisting of simultaneous multiplication and addition operations that accelerated the computation of series of values intended to be output as numerical tables. Mathematical operations such as addition supported a form of pipelining, in which sequential operations on different data values overlapped in time.
Babbage was well aware of the difficulties associated with complex mechanical devices, such as friction, gear backlash, and wear over time. To prevent errors caused by these effects, the Engine incorporated mechanisms called lockings that were applied during data transfers across axes. The lockings forced the number wheels into valid positions and prevented the accumulation of small errors from allowing a wheel to drift to an incorrect value. The use of lockings is analogous to the amplification of potentially weak input signals to produce stronger outputs by the digital logic gates in modern processors.
The Analytical Engine was to be programmed using punched cards and supported branching operations and nested loops. The most complex program intended for execution on the Analytical Engine was developed by Ada Lovelace to compute the Bernoulli numbers, an important sequence in number theory. The Analytical Engine code to perform this computation is recognized as the first published computer program of substantial complexity.
Babbage constructed a trial model of a portion of the Analytical Engine mill, which is currently on display at the Science Museum in London.
ENIAC, the Electronic Numerical Integrator and Computer, was completed in 1945 and was the first programmable general-purpose electronic computer. The system consumed 150 kilowatts of electricity, occupied 1,800 square feet of floor space, and weighed 27 tons.
The design was based on vacuum tubes, diodes, and relays. ENIAC contained over 17,000 vacuum tubes that functioned as switching elements.
Similar to the Analytical Engine, it used base-10 representation of 10-digit decimal numbers implemented using 10-position ring counters (the ring counter will be discussed in Chapter 2, Digital Logic).
Input data was received from an IBM punch-card reader and the output of computations was delivered by a card punch machine.
The ENIAC architecture was capable of complex sequences of processing steps including loops, branches, and subroutines. The system had 20 10-digit accumulators that functioned like registers in modern computers. It did not initially have any memory beyond the accumulators. If intermediate values were required for use in later computations, the data had to be written to punch cards and read back in when needed. ENIAC could perform about 385 multiplications per second.
ENIAC programs consisted of plugboard wiring and switch-based function tables. Programming the system was an arduous process that often took the team of talented female programmers weeks to complete. Reliability was a problem, as vacuum tubes failed regularly, requiring troubleshooting on a day-to-day basis to isolate and replace failed tubes.
In 1948, ENIAC was improved by adding the ability to program the system via punch cards rather than plugboards. This greatly enhanced the speed with which programs could be developed. As a consultant for this upgrade, John von Neumann proposed a processing architecture based on a single memory region holding program instructions and data, a processing component with an arithmetic logic unit and registers, and a control unit that contained an instruction register and a program counter. Many modern processors continue to implement this general structure, now known as the von Neumann architecture. We will discuss this architecture in detail in Chapter 3, Processor Elements.
Early applications of ENIAC included analyses related to the development of the hydrogen bomb and the computation of firing tables for long-range artillery.
In the years following the construction of ENIAC, several technological breakthroughs resulted in remarkable advances in computer architectures:
The invention of the transistor in 1947 by John Bardeen, Walter Brattain, and William Shockley delivered a vast improvement over the vacuum tube technology prevalent at the time. Transistors were faster, smaller, consumed less power, and, once production processes had been sufficiently optimized, were much more reliable than the failure-prone tubes.
The commercialization of integrated circuits in 1958, led by Jack Kilby of Texas Instruments, began the process of combining large numbers of formerly discrete components onto a single chip of silicon.
In 1971, Intel began production of the first commercially available microprocessor, the Intel 4004. The 4004 was intended for use in electronic calculators and was specialized to operate on 4-bit binary-coded decimal digits.

From the humble beginnings of the Intel 4004, microprocessor technology advanced rapidly over the ensuing decade by packing increasing numbers of circuit elements onto each chip and expanding the capabilities of the microprocessors implemented on those chips.
IBM released the IBM PC in 1981. The original PC contained an Intel 8088 microprocessor running at a clock frequency of 4.77 MHz and featured 16 KB of Random Access Memory (RAM), expandable to 256 KB. It included one or, optionally, two floppy disk drives. A color monitor was also available. Later versions of the PC supported more memory, but because portions of the address space had been reserved for video memory and Read-Only Memory (ROM), the architecture could support a maximum of 640 KB of RAM.
The 8088 contained 14 16-bit registers. Four were general-purpose registers (AX, BX, CX, and DX). Four were memory segment registers (CS, DS, SS, and ES) that extended the address space to 20 bits. Segment addressing functioned by adding a 16-bit segment register value, shifted left by 4 bit positions, to a 16-bit offset contained in an instruction to produce a physical memory address within a 1 MB range.
The remaining 8088 registers were the Stack Pointer (SP), the Base Pointer (BP), the Source Index (SI), the Destination Index (DI), the Instruction Pointer (IP), and the Status Flags (FLAGS). Modern x86 processors employ an architecture remarkably similar to this register set (Chapter 10, Modern Processor Architectures and Instruction Sets, will cover the details of the x86 architecture). The most obvious differences between the 8088 and x86 are the extension of the register widths to 32 bits in x86 and the addition of a pair of segment registers (FS and GS) that are used today primarily as data pointers in multithreaded operating systems.
The 8088 had an external data bus width of 8 bits, which meant it took two bus cycles to read or write a 16-bit value. This was a performance downgrade compared to the earlier 8086 processor, which employed a 16-bit external bus. However, the use of the 8-bit bus made the PC more economical to produce and provided compatibility with lower-cost 8-bit peripheral devices. This cost-sensitive design approach helped reduce the purchase price of the PC to a level accessible to more potential customers.
Program memory and data memory shared the same address space and the 8088 accessed memory over a single bus. In other words, the 8088 implemented the von Neumann architecture. The 8088 instruction set included instructions for data movement, arithmetic, logical operations, string manipulation, control transfer (conditional and unconditional jumps, and subroutine call and return), input/output, and additional miscellaneous functions. The processor required about 15 clock cycles per instruction on average, which at the PC’s 4.77 MHz clock corresponds to an execution speed of roughly 0.3 million instructions per second (MIPS).
The 8088 supported nine distinct modes for addressing memory. This variety of modes was needed to efficiently access a single item at a time as well as for iterating over sequences of data.
The segment registers in the 8088 architecture provided a seemingly clever way to expand the range of addressable memory without increasing the length of most instructions referencing memory locations. Each segment register allowed access to a 64-kilobyte block of memory beginning at a physical memory address defined at a multiple of 16 bytes. In other words, the 16-bit segment register represented a 20-bit base address with the lower four bits set to zero. Instructions could then reference any location within the 64-kilobyte segment using a 16-bit offset from the address defined by the segment register.
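The segment-to-physical address translation described here is simple enough to express directly. The following Python sketch (illustrative only; the function name is not from any Intel documentation) computes a 20-bit physical address from a 16-bit segment register value and a 16-bit offset:

```python
def physical_address(segment, offset):
    """Compute an 8088 physical address from a 16-bit segment and offset.

    The segment value is shifted left by 4 bits (multiplied by 16) and
    added to the offset, producing a 20-bit address within a 1 MB range.
    """
    assert 0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF
    return ((segment << 4) + offset) & 0xFFFFF  # wrap to 20 bits

# Example: segment 0x1234 with offset 0x0010 addresses 0x12350
```

Note that because the segment base is only a multiple of 16 bytes, many different segment:offset pairs alias the same physical address (for instance, 0x1234:0x0005 and 0x1230:0x0045 both resolve to 0x12345).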
The CS register selected the code segment location in memory and was used in fetching instructions and performing jumps and subroutine calls and returns. The DS register defined the data segment location for use by instructions involving the transfer of data to and from memory. The SS register set the stack segment location, which was used for local memory allocation within subroutines and for storing subroutine return addresses.
Programs that required less than 64 kilobytes in each of the code, data, and stack segments could ignore the segment registers entirely because those registers could be set once at program startup (programming language compilers would do this automatically) and remain unchanged through execution. Easy!
Things got quite a bit more complicated when a program’s data size increased beyond 64 kilobytes. Though the use of segment registers resulted in a clean hardware design, those registers caused many headaches for software developers. Compilers for the 8088 architecture distinguished between near and far references to memory. A near pointer represented a 16-bit offset from the current segment register base address. A far pointer contained 32 bits of addressing information: a 16-bit segment register value and a 16-bit offset. Far pointers consumed an additional 16 bits of data memory and also required additional processing time.
Making single memory access using a far pointer involved the following steps:
Save the current segment register contents to a temporary location
Load the new segment value into the register
Access the data (reading or writing as needed) using an offset from the segment base
Restore the original segment register value

When using far pointers, it was possible to declare data objects (for example, an array of characters representing a document in a text editor) up to 64 KB in size. If you needed a larger structure, you had to work out how to break it into chunks no larger than 64 KB and manage them yourself. As a result of such segment register manipulations, programs that required extensive access to data items larger than 64 KB became quite complex and were susceptible to code size bloat and slower execution.
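The four steps above can be sketched as follows. This is a hypothetical Python model, not 8088 code: the `Machine` class, its `ds` attribute, and the `far_read` method are illustrative names standing in for the DS register and a 1 MB physical address space.

```python
class Machine:
    """A toy model of 8088-style segmented data access."""

    def __init__(self):
        self.memory = bytearray(1 << 20)  # 1 MB physical address space
        self.ds = 0x0000                  # data segment register

    def read(self, offset):
        """Read a byte at DS:offset, as a near-pointer access would."""
        return self.memory[((self.ds << 4) + offset) & 0xFFFFF]

    def far_read(self, segment, offset):
        """Read a byte through a far pointer (segment:offset)."""
        saved = self.ds            # 1. save the current segment register
        self.ds = segment          # 2. load the new segment value
        value = self.read(offset)  # 3. access the data using the offset
        self.ds = saved            # 4. restore the original segment register
        return value
```

The extra save/load/restore traffic around every far access is the additional processing time referred to above; a near read needs only the offset.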
The IBM PC motherboard contained a socket for an optional Intel 8087 floating-point coprocessor. The designers of the 8087 invented data formats and processing rules for 32-bit and 64-bit floating-point numbers that became enshrined in 1985 as the IEEE 754 floating-point standard, which remains in near-universal use today. The 8087 could perform about 50,000 floating-point operations per second. We will look at floating-point processing in detail in Chapter 9, Specialized Processor Extensions.
The second generation of the IBM PC, the PC AT, was released in 1984. AT stood for Advanced Technology, which referred to several significant enhancements over the original PC that mostly resulted from the use of the Intel 80286 processor.
Like the 8088, the 80286 was a 16-bit processor, and it maintained backward compatibility with the 8088: 8088 code could run unmodified on the 80286. The 80286 had a 16-bit external data bus, improving data access performance over the 8-bit bus of the 8088, and 24 address lines supporting a 16-megabyte address space. The instruction execution rate (instructions per clock cycle) was about double that of the 8088 in many applications, meaning that at the same clock speed, the 80286 would be twice as fast as the 8088. The original PC AT clocked the processor at 6 MHz and a later version operated at 8 MHz. The 6 MHz variant of the 80286 achieved an instruction execution rate of about 0.9 MIPS, roughly three times that of the 8088.
The 80286 implemented a protected virtual address mode intended to support multiuser operating systems and multitasking.
In protected mode, the processor enforced memory protection to ensure one user’s programs could not interfere with the operating system or with other users’ programs. This groundbreaking technological advance in personal computing remained little used for many years, mainly because of the prohibitive cost of adding sufficient memory to a computer system to make it useful in a multiuser, multitasking context.
Following the 80286, the next generation of the x86 processor line was the 80386, introduced in 1985. The 80386 was a 32-bit processor with support for a flat 32-bit memory model in protected mode. The flat memory model allowed programmers to address up to 4 GB directly, without the need to manipulate segment registers. Compaq introduced an IBM PC-compatible personal computer based on the 80386 called the DeskPro in 1986. The DeskPro shipped with a version of Microsoft Windows targeted to the 80386 architecture.
The 80386 maintained substantial backward compatibility with the 80286 and 8088 processors. The processor architecture implemented in the 80386 remains the current standard x86 architecture. We will examine this architecture in detail in Chapter 10, Modern Processor Architectures and Instruction Sets.
The initial version of the 80386 was clocked at 33 MHz and achieved about 11.4 MIPS. Modern implementations of the x86 architecture run several hundred times faster than the original as the result of higher clock speeds, performance enhancements, including the extensive use of multilevel cache memory, and more efficient instruction execution at the hardware level. We will examine the benefits of cache memory in Chapter 8, Performance-Enhancing Techniques.
In 2007, Steve Jobs introduced the iPhone to a world that had no idea it had any use for such a device. The iPhone built upon previous revolutionary advances from Apple Computer, including the Macintosh computer, released in 1984, and the iPod music player of 2001. The iPhone combined the functions of the iPod, a mobile telephone, and an internet-connected computer.
The iPhone did away with the hardware keyboard that was common on smartphones of the time and replaced it with a touchscreen capable of displaying an onscreen keyboard, or any other type of user interface. In addition to touches for selecting keyboard characters and pressing buttons, the screen supported multi-finger gestures for actions such as zooming a photo.
The iPhone ran the OS X operating system, the same OS used on the flagship Macintosh computers of the time.
This decision immediately enabled the iPhone to support a vast range of applications already developed for Macs and empowered software developers to rapidly introduce new applications tailored to the iPhone, after Apple began allowing third-party application development.
The iPhone 1 had a 3.5” screen with a resolution of 320x480 pixels. It was 0.46 inches thick (thinner than other smartphones), had a built-in 2-megapixel camera, and weighed 4.8 oz. A proximity sensor detected when the phone was held to the user’s ear and turned off screen illumination and touchscreen sensing during calls. It had an ambient light sensor to automatically set the screen brightness and an accelerometer that detected whether the screen was being held in portrait or landscape orientation.
The iPhone 1 included 128 MB of RAM and 4 GB, 8 GB, or 16 GB of flash memory, and supported Global System for Mobile communications (GSM) cellular communication, Wi-Fi (802.11b/g), and Bluetooth.
