UNDERSTANDING INFRASTRUCTURE EDGE COMPUTING

A comprehensive review of the key emerging technologies that will directly impact areas of computer technology over the next five years

Infrastructure edge computing is the model of data center and network infrastructure deployment which distributes a large number of physically small data centers around an area to deliver better performance and to enable new economical applications. It is vital for those operating at business or technical levels to be positioned to capitalize on the changes that will occur as a result of infrastructure edge computing.

This book provides a thorough understanding of the growth of internet infrastructure from its inception to the emergence of infrastructure edge computing. Author Alex Marcham, an acknowledged leader in the field who coined the term 'infrastructure edge computing,' presents an accessible, accurate, and expansive view of the next generation of internet infrastructure. The book features illustrative examples of 5G mobile cellular networks, city-scale AI systems, self-driving cars, drones, industrial robots, and more – technologies that increase efficiency, save time and money, and improve safety.

Covering state-of-the-art topics, this timely and authoritative book:

* Presents a clear and accurate survey of the key emerging technologies that will impact data centers, 5G networks, artificial intelligence and cyber-physical systems, and other areas of computer technology
* Explores how and why internet infrastructure has evolved to where it stands today and where it needs to be in the near future
* Covers a wide range of topics including distributed application workload operation, infrastructure and application security, and related technologies such as multi-access edge computing (MEC) and fog computing
* Provides numerous use cases and examples of real-world applications which depend upon underlying edge infrastructure

Written for information technology practitioners, computer technology practitioners, and students, Understanding Infrastructure Edge Computing is essential reading for those looking to benefit from the coming changes in computer technology.
Page count: 679
Publication year: 2021
Cover
Title Page
Copyright Page
Dedication Page
Preface
How to Use This Book
About This Book
Audience
About the Author
Acknowledgements
1 Introduction
2 What Is Edge Computing?
2.1 Overview
2.2 Defining the Terminology
2.3 Where Is the Edge?
2.4 A Brief History
2.5 Why Edge Computing?
2.6 Basic Edge Computing Operation
2.7 Summary
References
3 Introduction to Network Technology
3.1 Overview
3.2 Structure of the Internet
3.3 The OSI Model
3.4 Ethernet
3.5 IPv4 and IPv6
3.6 Routing and Switching
3.7 LAN, MAN, and WAN
3.8 Interconnection and Exchange
3.9 Fronthaul, Backhaul, and Midhaul
3.10 Last Mile or Access Networks
3.11 Network Transport and Transit
3.12 Serve Transit Fail (STF) Metric
3.13 Summary
References
4 Introduction to Data Centre Technology
4.1 Overview
4.2 Physical Size and Design
4.3 Cooling and Power Efficiency
4.4 Airflow Design
4.5 Power Distribution
4.6 Redundancy and Resiliency
4.7 Environmental Control
4.8 Data Centre Network Design
4.9 Information Technology (IT) Equipment Capacity
4.10 Data Centre Operation
4.11 Data Centre Deployment
4.12 Summary
References
5 Infrastructure Edge Computing Networks
5.1 Overview
5.2 Network Connectivity and Coverage Area
5.3 Network Topology
5.4 Transmission Medium
5.5 Scaling and Tiered Network Architecture
5.6 Other Considerations
5.7 Summary
6 Infrastructure Edge Data Centres
6.1 Overview
6.2 Physical Size and Design
6.3 Heating and Cooling
6.4 Airflow Design
6.5 Power Distribution
6.6 Redundancy and Resiliency
6.7 Environmental Control
6.8 Data Centre Network Design
6.9 Information Technology (IT) Equipment Capacity
6.10 Data Centre Operation
6.11 Brownfield and Greenfield Sites
6.12 Summary
7 Interconnection and Edge Exchange
7.1 Overview
7.2 Access or Last Mile Network Interconnection
7.3 Backhaul and Midhaul Network Interconnection
7.4 Internet Exchange
7.5 Edge Exchange
7.6 Interconnection Network Technology
7.7 Peering
7.8 Cloud On‐ramps
7.9 Beneficial Impact
7.10 Alternatives to Interconnection
7.11 Business Arrangements
7.12 Summary
8 Infrastructure Edge Computing Deployment
8.1 Overview
8.2 Physical Facilities
8.3 Site Locations
8.4 Coverage Areas
8.5 Points of Interest
8.6 Codes and Regulations
8.7 Summary
9 Computing Systems at the Infrastructure Edge
9.1 Overview
9.2 What Is Suitable?
9.3 Equipment Hardening
9.4 Rack Densification
9.5 Parallel Accelerators
9.6 Ideal Infrastructure
9.7 Adapting Legacy Infrastructure
9.8 Summary
References
10 Multi‐tier Device, Data Centre, and Network Resources
10.1 Overview
10.2 Multi‐tier Resources
10.3 Multi‐tier Applications
10.4 Core to Edge Applications
10.5 Edge to Core Applications
10.6 Infrastructure Edge and Device Edge Interoperation
10.7 Summary
11 Distributed Application Workload Operation
11.1 Overview
11.2 Microservices
11.3 Redundancy and Resiliency
11.4 Multi‐site Operation
11.5 Workload Orchestration
11.6 Infrastructure Visibility
11.7 Summary
12 Infrastructure and Application Security
12.1 Overview
12.2 Threat Modelling
12.3 Physical Security
12.4 Logical Security
12.5 Common Security Issues
12.6 Application Security
12.7 Security Policy
12.8 Summary
13 Related Technologies
13.1 Overview
13.2 Multi‐access Edge Computing (MEC)
13.3 Internet of Things (IoT) and Industrial Internet of Things (IIoT)
13.4 Fog and Mist Computing
13.5 Summary
Reference
14 Use Case Example
14.1 Overview
14.2 What Is 5G?
14.3 5G at the Infrastructure Edge
14.4 Summary
15 Use Case Example
15.1 Overview
15.2 What Is AI?
15.3 AI at the Infrastructure Edge
15.4 Summary
16 Use Case Example
16.1 Overview
16.2 What Are Cyber‐physical Systems?
16.3 Cyber‐physical Systems at the Infrastructure Edge
16.4 Summary
Reference
17 Use Case Example
17.1 Overview
17.2 What Is Cloud Computing?
17.3 Cloud Computing at the Infrastructure Edge
17.4 Summary
18 Other Infrastructure Edge Computing Use Cases
18.1 Overview
18.2 Near Premises Services
18.3 Video Surveillance
18.4 SD‐WAN
18.5 Security Services
18.6 Video Conferencing
18.7 Content Delivery
18.8 Other Use Cases
18.9 Summary
19 End to End
19.1 Overview
19.2 Defining Requirements
19.3 Success Criteria
19.4 Comparing Costs
19.5 Alternative Options
19.6 Initial Deployment
19.7 Ongoing Operation
19.8 Project Conclusion
19.9 Summary
20 The Future of Infrastructure Edge Computing
20.1 Overview
20.2 Today and Tomorrow
20.3 The Next Five Years
20.4 The Next 10 Years
20.5 Summary
21 Conclusion
Appendix A: Acronyms and Abbreviations
Index
End User License Agreement
Chapter 3
Table 3.1 OSI model layer numbers, names, and examples.
Table 3.2 Example minimum acceptable and desired average STF metrics.
Chapter 4
Table 4.1 Uptime Institute tiers (numbers, names, and brief characteristics).
Chapter 6
Table 6.1 Infrastructure edge data centre facility size categories and exampl...
Table 6.2 Typical EXP and network capabilities of infrastructure edge data ce...
Table 6.3 Example infrastructure edge data centre facility environmental cont...
Table 6.4 Example average estimates for network usage per data centre facilit...
Chapter 7
Table 7.1 Suitability of IEDC facilities for use as an EXP.
Chapter 9
Table 9.1 Equipment suitability for IEDC facilities.
Chapter 16
Table 16.1 Autonomy levels and the value of infrastructure edge computing.
Chapter 2
Figure 2.1 Infrastructure edge computing in context.
Figure 2.2 Device edge computing in context.
Figure 2.3 Self‐contained application operating on device.
Figure 2.4 Application with access to remote data centre resources.
Figure 2.5 Application with access to infrastructure edge computing resource...
Chapter 3
Figure 3.1 Routing process example.
Figure 3.2 Routing and switching at a network boundary.
Figure 3.3 LAN, MAN, and WAN networks.
Figure 3.4 Fronthaul, backhaul, and midhaul networks.
Figure 3.5 Last mile or access network interconnection failure.
Figure 3.6 Infrastructure edge computing network providing transit services....
Chapter 4
Figure 4.1 Hot and cold air containment cooling system example.
Figure 4.2 Traditional access, aggregation, and core layer network topology....
Figure 4.3 Leaf and spine, or Clos, network topology.
Chapter 5
Figure 5.1 Full mesh, partial mesh, hub and spoke, ring, and tree network to...
Figure 5.2 Partial mesh with distributed trees network topology.
Chapter 6
Figure 6.1 Example deployment of size category 1 infrastructure edge data ce...
Figure 6.2 Tiered infrastructure edge computing network with example STF met...
Figure 6.3 Example scale comparison of infrastructure edge data centre size ...
Figure 6.4 Topological hierarchy between size categories of infrastructure e...
Figure 6.5 Physical hierarchy between size categories of infrastructure edge...
Figure 6.6 Infrastructure edge data centre as single point of failure for ne...
Figure 6.7 System resiliency example: phase one.
Figure 6.8 System resiliency example: phase two.
Figure 6.9 System resiliency example: phase three.
Chapter 7
Figure 7.1 Tromboning network traffic path.
Figure 7.2 Direct network traffic path.
Figure 7.3 Backhaul and midhaul network interconnection.
Figure 7.4 Network interconnection at two IXPs.
Figure 7.5 A distributed IX utilising several physical IXPs.
Figure 7.6 EX and IX comparison example.
Chapter 9
Figure 9.1 Densified and non‐densified rack comparison example.
Chapter 10
Figure 10.1 Resource gradient between user and RNDC.
Figure 10.2 Resource gradient within infrastructure edge computing network....
Chapter 14
Figure 14.1 Example architecture for 5G RAN deployment using infrastructure ...
Chapter 15
Figure 15.1 Example architecture for distributed AI deployment using infrast...
Chapter 16
Figure 16.1 Example architecture for cyber‐physical systems using infrastruc...
Chapter 17
Figure 17.1 Example architecture for cloud computing using infrastructure ed...
Chapter 19
Figure 19.1 Example infrastructure locations in Phoenix.
Figure 19.2 Example infrastructure locations in Brownsville.
Alex Marcham
This edition first published 2021
© 2021 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Alex Marcham to be identified as the author of this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data
Names: Marcham, Alex, author.
Title: Understanding infrastructure edge computing : concepts, technologies and considerations / Alex Marcham.
Description: Hoboken, NJ, USA : Wiley, 2021. | Includes bibliographical references and index.
Identifiers: LCCN 2020050691 (print) | LCCN 2020050692 (ebook) | ISBN 9781119763239 (hardback) | ISBN 9781119763246 (adobe pdf) | ISBN 9781119763253 (epub)
Subjects: LCSH: Edge computing.
Classification: LCC QA76.583 .M37 2021 (print) | LCC QA76.583 (ebook) | DDC 005.75/8–dc23
LC record available at https://lccn.loc.gov/2020050691
LC ebook record available at https://lccn.loc.gov/2020050692
Cover Design: Wiley
Cover Image: © Metamorworks/Shutterstock
To the Fun Police. Careful!
This book is intended to be read from start to finish in order for the reader to get the most benefit from all of the subject areas which it covers. However, for information on a specific topic, each of the chapters in this book can be read in a relatively stand‐alone manner. There is crossover between chapters in many cases, for example, between a section on the physical redundancy of an edge data centre facility in one chapter and a section describing infrastructure edge computing network level resiliency in another; if the reader has not read the earlier section, some context may be lost.
However you choose to read it, I hope you enjoy reading this book as much as I enjoyed writing it.
As with any emerging area of technology, the information presented within this book represents a moment in time and the best practices available at that moment. The information here is presented to the best of the author’s knowledge and does not favour one vendor over another.
This book was written for an audience of technologists, decision makers, and engineers in the fields of telecommunications, networking, data centres, and application development and operation who are interested in new emerging areas of technology, such as edge computing, fifth generation (5G), and distributed artificial intelligence (AI).
Alex Marcham has been in the networking industry for over a decade working on wireless networks, enterprise networks, telecommunications, and edge computing. He created the terms infrastructure edge and device edge and was the primary author of the Open Glossary of Edge Computing, which is now a Linux Foundation project. When not at work, he can often be seen hiking somewhere remote.
This book would not have come to fruition were it not for the help of a few special people.
First, I would like to thank the friends whom I share each day with as we all do our best to keep each other moderately sane from one week to the next. I’ll always do my best to listen and help you as you each do for me, and I wish you all the greatest happiness and success in life. That is, unless one of you says that my hair is rubbish again, in which case we will be forced to engage in a cage fight.
Second, thank you to my family. Although we may spend a lot of time apart, physical distance is no match for our combined love of badgers, elephants, and hummingbirds. That said, it is a lot easier to maintain a set of hummingbird feeders than it would be to provide for a load of badgers or a passing herd of elephants, but this is matched by the difficulty of photographing any hummingbird properly.
Third, thanks to the team at Wiley for their insight and support for this project from start to finish. The telepathic portion of this book will be available at a later date, so this will have to do for now.
Finally, thanks to everyone I have spoken to and learned from on the topics of engineering, writing, and life in the past three decades across the world. We are the sum of our choices and experiences.
Few could have guessed the impact the internet would have on us all at its inception. Today, the internet and the services it provides are essential for billions of people across the world. It is a primary source of communication with friends, family, and our communities; it is the primary way in which we access many essential services, as well as the way that increasing numbers of us go to work, pursue our educational goals, and access sources of entertainment, all on demand.
We did not get to this point by accident. Although the current state of the internet could not have been fully foreseen decades ago, it is due to the continuous efforts of skilled and driven people from across many different disciplines that the modern internet is able to support us as it does today. The story of the internet is not one of a single grand original design; it is one of consistent iteration and ingenuity to adapt to new technical and business challenges which have emerged over the decades.
As they have in the past, new and emerging use cases are driving the evolution of internet and data centre technology. This is resulting in new generations of infrastructure which are reimagining how the internet that we all use on a daily basis should be designed, deployed, and operated as a whole.
Distributed artificial intelligence (AI) and machine learning (ML) are set to permanently reshape how many industries, from healthcare and retail to manufacturing and construction, operate due to their ability to enhance the decision‐making process and automate difficult tasks with extraordinary speed and precision. City‐scale internet of things (IoT) and cyber‐physical systems provide machines the means to interact physically with our world in ways that have been impossible or impractical to achieve before, supported by fifth generation (5G) cellular network connectivity and new versions of cloud computing, which are able to support high‐bandwidth, low‐latency, and real‐time use cases.
The key element underpinning all of these areas of advancement in both technology and business is infrastructure edge computing. It is one thing to demonstrate a use case in a laboratory environment where everything is a known variable; it is quite another to then operate a commercial service in the real world with all of the messy constraints that introduces, from cost to performance to timescales.
Edge computing is one of the most frequently mentioned emerging technologies, which many believe will make a significant impact on the landscapes of both technology and business during the decade of the 2020s. The concept seems simple: By moving compute resources as close as possible to their end users, theoretically the latency between a user and their application can be reduced, the cost of data transport can be minimised, and these two factors combined will make new use cases practical.
But what really is edge computing, beyond the hype, marketing material, and hyperbole that always accompany any major technological shift? With so many competing definitions of even the most basic elements of the technology, can we succinctly define concepts and terminology which allow us to have a consistent understanding of the challenges we are trying to solve together as an industry?
What are the key factors driving edge computing, and what must a solution provide in order to solve key technical and business challenges? How does edge computing really replace, compete with, or augment cloud computing? What is infrastructure edge computing, and does it stand alongside the traditional regional, national, and on‐premises data centre, or does it seek to replace them entirely?
This book aims to answer all of these questions and provide the reader with a solid foundation of knowledge with which to understand how we got to this inflection point and how infrastructure edge computing is a vital component of the next‐generation internet – an internet which enables suites of new key use cases that unlock untapped value globally across many different industries.
Before delving into the details and technical underpinnings of infrastructure edge computing, it is necessary to understand some of the history, terminology, and key drivers behind its development, adoption, and usage. This chapter aims to detail some of these factors and provide the reader with a shared base of knowledge to build upon throughout the rest of this book, starting with terminology.
One of the most challenging aspects of edge computing has been agreeing upon a set of terminology and using it consistently across the many industries to which edge computing is of interest. This is by no means a unique challenge when it comes to emerging technologies, but in the case of edge computing it has contributed significantly to confusion between the many groups and companies who have struggled to reconcile their individual definitions of edge computing into a shared view of what problem is to be solved, where it is, and how to solve it.
Part of the challenge in defining edge computing is that by its very nature, the concept of an edge is contextual: An edge is at the boundary of something and often delineates the specific place where two things meet. These two things may be physical, as pieces of hardware; they may be logical, as pieces of software; or they may be more abstract, such as ownership, intent, or a business model.
Another part of the challenge has been attempting to compress the many dimensions across which a group or company may be concerned with edge computing into a small number of terms which are general enough and yet able to convey a specific meaning. Although it is appealing to create terms which describe a complex and specific set of dimensions as they relate to edge computing, this makes it difficult to produce terminology which is general enough to be used outside of that same group, because the more dimensions a term or phrase aims to address, the less approachable it becomes.
The key to any set of terminology is consistency, and the way to achieve that even in highly technical discussions is to limit the scope of the concepts which the terminology aims to define. Once the key parameters of the definition are established, a neutral set of terminology can be created which then serves as the basis for additional layers of complexity to be added, promoting adoption and usage.
The Open Glossary of Edge Computing [1], a project arising out of the initial State of the Edge report [2] and co‐authored by the author of this book, established a neutral, limited‐dimension set of terminology for edge computing which has seen adoption across the industry. It aims to simplify discussions around edge computing by using the physical location of infrastructure and devices to delineate which type of edge computing each is able to perform, with the last mile network serving as the line between them to create a clear point of separation. Additional dimensions such as ownership, a specific business model, or any other concern can then be layered on top of this physical definition.
Along with the State of the Edge itself, the Open Glossary of Edge Computing has been adopted by the Linux Foundation’s LF Edge [3] group as an official project and continues to contribute to a shared set of terminology for edge computing to help facilitate clear discussion and shared understanding.
As previously described, an edge is itself a contextual entity. By itself, an edge cannot exist; it is the creation of two things at the point at which they interact. This somewhat floaty definition is one part of what has made establishing a concise and clear definition of edge computing difficult, especially when combined with the many different factors and dimensions that edge computing will influence.
This book will focus on the accepted definition from the Open Glossary of Edge Computing which uses the physical and role‐based separation provided by using the last mile network as a line of demarcation between the infrastructure edge and device edge to provide separation and clarity.
Although there are many potential edges, for the purposes of this book and to the most general definition of edge computing, the edge that is of the greatest importance is the last mile network.
The last mile network is the clearest point of physical separation between end user devices and the data centre infrastructure which supports them. In this context, the last mile network refers to the transmission medium and communications equipment which connects a user device to the network of a network operator who is providing wide area network (WAN) or metropolitan area network (MAN) service to one or more user devices, whether large or small, fixed position or mobile.
Examples of last mile networks include cellular networks, where the transmission medium is radio spectrum and the communications equipment used includes radio transceiver equipment, towers, and antennas. Wired networks such as those using cable, fibre, or digital subscriber line (DSL) are also examples of last mile networks which use a copper or fibre‐based transmission medium. The specific type of last mile network used is irrelevant here for the terminology of edge computing.
This definition cannot capture all of the potential nuance which may exist; for example, in the case of an on‐premises data centre which is physically located on the device side of the last mile network, the owner of that data centre may regard it as infrastructure rather than as a device itself. However, a different definition and accompanying set of terminology offering equal clarity without introducing unnecessary dimensions into the equation has not been established within the industry, and so this book will continue to use the infrastructure edge and device edge, separated by a last mile network.
Fundamentally, if everything can be recast as an example of edge computing, then nothing is truly an example of edge computing. It is similar to referring to a horse and cart as a car because both of them consist of a place to sit, four wheels, and an entity that pulls the cart forward. This is important to note with both the infrastructure edge and the device edge. In the case of the former, an existing data centre which is located a significant distance away from its end users should not be referred to as an example of edge computing. If, however, that same data centre is located within an acceptable distance from its end users and it satisfies their needs, an argument can be made for it to be so.
Similarly, if a device edge entity such as a smartphone, which already had significant local compute capabilities, is now referred to as an edge computing device yet does not participate in any device‐to‐device ad hoc resource allocation and utilisation, this is a somewhat disingenuous application of the term edge computing. However, where there was once a dumb device, or no device at all, which is now being augmented or replaced with some local compute, storage, and network resources, this can reasonably be argued to be an example of device edge computing, even if limited in capability.
Although “edge washing” of this type is not unique to edge computing as similar processes occur for most technological changes for a period of time, due to the difficulties previously mentioned in the industry arriving at a single set of terminology around edge computing, this can be challenging to identify. This identification challenge can be addressed by using the framework described in the next section.
The infrastructure edge refers to the collection of edge data centre infrastructure which is located on the infrastructure side of the last mile network. These facilities typically take the form of micro‐modular data centres (MMDCs) which are deployed as close as possible to the last mile network and, therefore, as close as possible to the users of that network who are located on the device edge. Throughout this book, these MMDCs will typically be referred to as infrastructure edge data centres (IEDCs), whereas their larger cousins will be referred to as regional or national data centres (RNDCs).
The primary aim of edge computing is to extend compute resources to locations where they are as close as possible to their end users in order to provide enhanced performance and improvements in economics related to large‐scale data transport. The success of cloud computing in reshaping how compute resources are organised, allocated, and consumed over the past decade has driven the use of infrastructure edge computing as the primary method to achieve this goal; the infrastructure edge is where data centre facilities are located which support this usage model, unlike at the device edge.
Although it is typically deployed in a small number of large data centres today, the cloud itself is not a physical place. It is a logical entity which is able to utilise compute, storage, and network resources that are distributed across a variety of locations, as long as those locations are capable of supporting the same type of elastic resource allocation as their hyperscale data centre counterparts. The limited scale of an MMDC compared to a traditional hyperscale facility, where the MMDC represents only a small fraction of the total capacity of that larger facility, can be offset by the deployment of several MMDC facilities across an area, with the allocation of only a physically local subset of users to each facility (see Figure 2.1).
The device edge refers to the collection of devices which are located on the device side of the last mile network. Common examples of these entities include smartphones, tablets, home computers, and game consoles; it also includes autonomous vehicles, industrial robotics systems, and devices that function as smart locks, water sensors, or connected thermostats or that provide many other internet of things (IoT) functionalities. Whether or not a device is part of the device edge is determined not by the size, cost, or computational capabilities of that device but by which side of the last mile network it operates on. This functional division clarifies the basic architecture of an edge computing system and allows several more dimensions such as ownership, device capability, or other factors to be built on top.
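To make this demarcation concrete, here is a minimal illustrative sketch in Python (not from this book; the entity names and the single boolean attribute are hypothetical simplifications) showing that classification depends only on which side of the last mile network an entity sits:

from dataclasses import dataclass

# Hypothetical model of the last mile demarcation described above.
@dataclass
class Entity:
    name: str
    on_infrastructure_side: bool  # True if behind the last mile network

def edge_category(entity: Entity) -> str:
    # Size, cost, and compute capability are deliberately ignored;
    # only position relative to the last mile network matters.
    return "infrastructure edge" if entity.on_infrastructure_side else "device edge"

for e in (Entity("micro-modular data centre (IEDC)", True),
          Entity("smartphone", False),
          Entity("IoT gateway", False)):
    print(f"{e.name}: {edge_category(e)}")

Note that the IoT gateway, despite serving other devices, still falls on the device edge because it sits on the device side of the last mile network.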
Figure 2.1 Infrastructure edge computing in context.
These devices may communicate directly with the infrastructure edge using the last mile network or may use an intermediary device on the device edge, such as a gateway, to do so. Examples of each type are a smartphone with an integrated Long‐Term Evolution (LTE) modem, which is able to communicate directly with the LTE last mile network itself, and a device which has only local range Wi‐Fi connectivity and uses it to connect to a gateway which itself has last mile network access.
In comparison to infrastructure edge computing, many devices on the device edge are powered by batteries and subject to other power constraints due to their limited size or mobile nature. It would be possible to design cooperative processing scenarios using only device edge resources, in which a device can utilise compute, storage, or network resources from neighbouring devices in an ad hoc fashion. However, for the vast majority of use cases and users, these approaches have proven unpopular at best, with users unwilling to sacrifice their own limited battery power and processing resources to participate in such a scheme at any large scale; outliers such as Folding@home are distributed computing projects focused on using a network of mains‐powered computers, not mobile devices. Bearing this in mind, the need for access to dense compute resources in locations as close as possible to their users is met for users at the device edge by the infrastructure edge (see Figure 2.2).
Although this book is primarily focused on infrastructure edge computing, topics related to device edge computing will be discussed as appropriate, especially as they relate to the interaction that exists between these two key halves of the edge computing ecosystem and their interoperation.
Figure 2.2 Device edge computing in context.
As with many technologies, upon close inspection infrastructure edge computing represents an evolution more than the radical revolution it may initially appear to be. This does not make it any less significant or impactful; it merely allows us to contextualise infrastructure edge computing within the broader trends which have driven much of the development of internet and data centre infrastructure since their inception. This perspective lets us understand infrastructure edge computing not as the wild anomaly it has sometimes been portrayed as but as the clear continuation of an ongoing theme in network design which has been present for decades, driven by the need to solve key technical and business challenges using simple and proven principles.
One framework for understanding the technological progression which has brought us to the point of infrastructure edge computing is the three acts of the internet. This structure distils the evolution of the internet since its inception into three distinct phases, which culminate in the third act of the internet, a state which is driven by new use cases and enabled by infrastructure edge computing.
During the 1970s and 1980s, as the internet began to be available for academic and public use, the types of services it was able to support were basic compared to those which would emerge in the 1990s. Text‐based applications such as bulletin board systems (BBS) and early examples of email represented some of the most complex use cases of the system. With no real‐time element and a simple range of content, the level of centralisation was sufficient to support the small userbase.
It may seem obvious to us in hindsight that the internet would achieve the explosive growth that it has over its lifetime in terms of every possible characteristic from number of users to the volume of data that each individual user would transmit on a daily basis. However, it is a testament to the first principles of the design of the internet that its foundational protocols and technologies have, with the addition of more modern solutions where needed, been able to scale up over time as required.
During the 1990s and 2000s, internet usage amongst consumers became mainstream as the types of applications and content which the internet supported grew exponentially. The combination of a rapidly growing userbase, as millions of people began to connect to the internet for the first time using dial‐up modem connectivity and other technologies such as cable or DSL, and the addition of more types of content, as well as far more content being available online in general, began to strain the infrastructure of the internet. This led to the development and deployment of the first physical infrastructure solutions designed specifically to address these newly emerging issues.
The widespread advent of cloud computing during the 2010s further exacerbated this trend as new generations of data centre facilities were required globally. As more applications and data began to move from local on‐premises facilities to remote data centres, the locations of these data centres became more important. Cloud providers began to separate their infrastructure on a per‐country basis and, in the case of the United States or other large countries, then began to subdivide their presence within that country into smaller regions, as Amazon Web Services (AWS) has done with their US East and US West regions to optimise performance and the cost of data transportation.
With the internet now firmly established as a constant in the lives of billions of people across the world who rely on it every day for essential services; connectivity to work, family, and friends; and their primary source of entertainment, the same pressures which drove the evolution from the first to the second act of the internet are mounting once more. More users – now including both humans and machines, each of which will be essential users of the internet – and a range of new use cases that demand real‐time decision making are pushing the current generation of internet infrastructure beyond its original design intentions and capabilities from both a technical and business standpoint.
For these reasons, the 2020s are the first decade of the third act of the internet: a transformation of the network and data centre infrastructure which supports the internet on a global scale towards a new methodology of design, deployment, and operation. This methodology relies heavily on infrastructure edge computing to achieve its aims of improving performance, lowering operational costs, and enabling a new class of use cases which are impossible or impractical to support without a continued push towards new levels of network regionalisation and less reliance upon centralised infrastructure.
Now that the three acts of the internet have been established, it is worth considering additional detail in regard to network regionalisation and some early examples of this methodology being applied to the infrastructure of the internet in response to the emergence of the second act itself.
The key trend which the three acts of the internet highlight is the growth of network regionalisation over the preceding decades in response to the need to support new use cases, reduce the opportunities for network congestion across the internet, and provide a measurable increase in performance to end users. From a network perspective, which is especially crucial when discussing the internet as it is itself a global network of networks, the shortest path between the source and destination of data in transit is generally preferable for reasons of both optimal performance and lowest cost, all other characteristics being equal across the network.
This regionalisation of internet infrastructure where key pieces of the network and the data centre move outwards from centralised locations to be deployed on a distributed and regional level is not an accident. As the number of users and their individual usage of the network increased, it became urgent to minimise the length of the network path between the source and destination of traffic.
The Advanced Research Projects Agency Network (ARPANET), first established in 1969 [4], was the precursor to the modern internet. Although other projects existed across the world to develop standards around such transformative technologies as decentralised networks, packet switching, and resilient routing of data in transit, providing a network with the ability to withstand an attack on its infrastructure, the ARPANET was by far the most influential example.
Although considered a leading example of a decentralised network at its inception and during the 1970s and 1980s, by the 1990s the remaining centralisation in the architecture of the ARPANET was being strained by the emergence of a large number of new internet users and applications. More regionalisation of internet infrastructure was required to address these challenges, and perhaps the most influential method of achieving this was positioning static content in caches placed strategically throughout the network, creating a shorter path between traffic source and destination.
One of the best examples of network regionalisation used to solve a specific use case as well as address the needs of network operators is the content delivery network (CDN) work done by Akamai Technologies in the late 1990s [5]. Although, compared to today, the internet and the world wide web it supports were still in their infancy, with both having gained mainstream acceptance only a few years previously, the need for the regionalisation of key infrastructure was already beginning to show as the internet became known for distributing new multimedia content, such as images and early examples of hosted video, which began to strain its underlying networks. If left unaddressed, this strain would have limited the uptake of online services by both businesses and home users and ultimately prevented the adoption of the internet as the go‐to location for businesses, essential services, shopping, and entertainment.
The importance of CDNs, and of the practical proof point of the benefits of network regionalisation which they represent, cannot be overstated. By deploying a large number of distributed content caching nodes throughout the internet, CDNs have drastically reduced the level of centralised load placed on internet infrastructure on a regional, national, and global scale. Today, they are a fact of life for network operators; these static caches are widely deployed in many thousands of instances from a variety of providers such as CacheFly, Cloudflare, and Akamai, who reach agreements with network operators for their deployment and operation within both wired and wireless networks which provide last mile network connectivity. This regionalisation of static content, by moving the CDN nodes to locations closer to their end users, improves the user experience and saves network operators significant sums in the backhaul network capacity which would otherwise be needed to serve the demand for the content were it located farther away in an RNDC.
Where infrastructure edge computing diverges from the historical CDN deployment model is in its ability to support a range of use cases which rely on dense compute resources to operate, such as clusters of central processing units (CPUs), graphics processing units (GPUs), or other resources which enable infrastructure edge computing to provide services beyond the distribution of static content. Many CDN deployments do not require significant compute density, and many of the existing telecommunications sites where they are deployed (such as shelters at the bases of cellular towers, cable headend locations, or central office locations), having originally been designed to support low‐density network switching equipment, are not capable of supporting the difficult cooling and power delivery requirements which these dense resources impose. Additionally, in many cases infrastructure edge computing deployments bring additional network infrastructure to provide optimal paths for data transit between last mile networks and edge data centre locations and between edge data centres and RNDCs; typical CDN nodes, in contrast, will usually be deployed atop existing network operator infrastructure at aggregation points such as cable network headends.
It is worth mentioning here, however, that infrastructure edge computing and the CDN are not at all mutually exclusive concepts. Just as CDNs can operate from various locations across the network today through the deployment of server infrastructure in locations such as cable network headends, they are also able to operate from an IEDC. One or multiple CDNs are then able to use infrastructure edge computing facilities as deployment locations for CDN nodes to replace or augment their existing deployments which use the current infrastructure of the network operator.
Although CDNs in many ways pioneered the deployment methodology of placing numerous content caches throughout the internet to shorten the path between the source and destination of traffic, it is important to understand the distinction between a deployment methodology and a use case. The CDN is a use case which needed a deployment methodology that achieved network regionalisation in order to function. As infrastructure edge computing is deployed, CDNs can also be operated from these locations as well. This is an important point that will be revisited later on the subject of the cloud.
Now that we have established the terminology and some of the history behind the concept of edge computing, we can delve deeper into the specific factors which make this technology appealing for a wide range of use cases and users. We will return to many of these factors throughout this book, but this section will establish these factors and the basic reasoning behind their importance at the edge.
The time required for a single bit, packet, or frame of data to be successfully transmitted between its source and destination can be measured in extreme detail by a variety of mechanisms. Between the ports on a single Ethernet switch, nanosecond scale latencies can be achieved, though they are more frequently measured in microseconds. Between devices, microsecond or millisecond scale latencies are observed, and across a large‐scale WAN reached via an access or last mile network, hundreds of milliseconds of latency are commonly experienced, especially when the traffic destination is in a remote location relative to the source of the data, as is the case when a user located on the device edge seeks to use an application hosted in a remote centralised data centre facility.
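As a rough illustration of why this distance matters, the following back‐of‐envelope sketch (an assumption‐laden model, not a measurement from this book) estimates round‐trip propagation delay over optical fibre at roughly two‐thirds the speed of light in a vacuum, ignoring queuing, serialisation, and processing delays:

# Propagation-only round-trip time estimate; real latencies are higher.
PROPAGATION_KM_PER_S = 200_000  # assumed ~2/3 of the speed of light, in fibre

def rtt_ms(distance_km: float) -> float:
    return 2 * distance_km / PROPAGATION_KM_PER_S * 1000

for km in (10, 100, 1000, 4000):  # e.g. from a nearby IEDC to a distant RNDC
    print(f"{km:>5} km one way: ~{rtt_ms(km):.2f} ms round trip")
# 10 km gives ~0.10 ms; 4000 km gives ~40 ms before any other sources of delay.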
Latency is typically considered to be the primary performance benefit which edge computing and particularly infrastructure edge computing can provide to its end users, although other performance advantages exist such as the ability to avoid current hotspots of network congestion by reducing the length of the network path between a user and the data centre running their application of choice.
Beyond a certain point of acceptability, where the network already provides the data rate required for the application to function as intended, increasing the bandwidth, and therefore the maximum data rate provided to a user or application for a real‐time use case, does not measurably increase their quality of experience (QoE). The primary drivers of increased user QoE are then latency, measured at its maximum, minimum, and average over a period of time, and the ability of the system to provide performance which is as close to deterministic as possible by avoiding congestion.
The physical distance between a user and the data centre providing their application or service is not the only factor which influences latency from the network perspective. The network topology that exists between the end user and the data centre is also of significant concern; to achieve the lowest latency, as direct a connection as possible is preferable rather than relying on many circuitous routes which introduce additional delay in data transport. In extreme cases, data may be sent away from its intended destination before taking a hairpin turn back on a return path to get there. This is referred to as a traffic trombone, with the path which the data takes resembling the shape of the instrument.
Data gravity refers to the challenge of moving large amounts of data. To move data from where it was collected or generated to a location where it can be processed or stored requires energy which can be expressed both in terms of network and processing resources as well as financial cost, which can be prohibitive when dealing with a large amount of data that has real‐time processing needs.
Additionally, many individual pieces of data that are collected or generated can, once processed, be considered noise as they do not significantly contribute to the insight which can be generated by the analysis of the data. Before processing occurs, however, it is difficult to know which pieces of data can be discarded as insignificant, and an individual device may not have all of the contextual information or the analytic processing power available to accurately make this judgement. This makes the use of infrastructure edge computing key as this processing can occur comparatively close to the source of the data before the resulting insight is sent back to a regional data centre for long‐term data storage.
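As a minimal sketch of this pattern (the threshold, record shape, and function name are hypothetical illustrations, not a prescribed implementation from this book), an infrastructure edge site might reduce raw readings to a compact insight and forward only that insight to the RNDC:

# Illustrative edge pre-processing: discard noise locally, forward insight.
def process_at_edge(readings: list[float], threshold: float = 0.8) -> dict:
    significant = [r for r in readings if r >= threshold]  # drop likely noise
    return {
        "total_readings": len(readings),
        "significant_readings": len(significant),
        "max_value": max(readings, default=0.0),
    }

raw = [0.2, 0.95, 0.1, 0.85, 0.3]  # collected close to the device edge
insight = process_at_edge(raw)     # processed at the IEDC
print(insight)  # only this small result crosses the backhaul to the RNDC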
Many pieces of data have a window of time in which they are most useful. If within that time period they cannot be processed and used to extract an actionable insight, the value of that data decreases exponentially. Examples of this type of data include many real‐time applications; for example, in the scenario of an industrial robotics control system, instructing the system to perform an action such as orienting a robotic arm in a certain position to catch a piece of falling material is of limited use if the command reaches the arm too late to perform that action in a safe manner before the material falls.
Data velocity is the name given to this concept. If data for real‐time applications can be processed and used to extract insight within the shortest possible span of time since its creation or collection, that data and the resulting insight are able to provide their highest possible value to their end user. This processing must occur at a point of aggregation in terms of both network topology and compute resources, such that the resulting data analysis has the full context of relevant events and the power to perform the analysis at an acceptable rate for the application and its users to prevent any issues.
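One simple way to picture data velocity (an illustrative model only; the exponential form and time constant below are assumptions, not figures from this book) is as a value that decays with the age of the data, with the time constant set by the use case:

import math

# Hypothetical value-decay model: tau would be milliseconds for a robotics
# control loop, but could be much longer for less time-sensitive use cases.
def insight_value(age_s: float, initial_value: float, tau_s: float) -> float:
    return initial_value * math.exp(-age_s / tau_s)

for age_ms in (10, 50, 200):  # an insight worth 1.0 at creation, tau = 50 ms
    print(f"{age_ms:>3} ms old: value ~{insight_value(age_ms / 1000, 1.0, 0.05):.2f}")
# 10 ms old: ~0.82; 200 ms old: ~0.02, i.e. too stale to act upon.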
Particularly with emerging use cases such as distributed artificial intelligence (AI), the cost of transporting data from the device edge locations where it is generated to a data centre location where it can be processed in real time will present a growing challenge. This is not only a technical consideration where network operators must appropriately provision upstream bandwidth in the access and midhaul layers of the network, but there is also a significant operational expenditure (OPEX) and capital expenditure (CAPEX) burden on the network operator associated with overprovisioning long‐haul network connectivity.
Infrastructure edge computing aims to address this challenge by moving the locations at which large amounts of data can undergo complex processing, for example, by distributed AI inferencing, to a set of locations which are positioned closer to the sources of this data than with today’s centralised data centres. The shorter the distance over which the bulk of data must be transmitted, the lower the data transport cost can be for the network operator which allows any use case reliant on moving such large volumes of data to be more economical and thus more practical to deploy and operate.
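To make the economics tangible, consider a hypothetical back‐of‐envelope comparison (every figure below is invented for illustration): cameras streaming video for distributed AI inferencing, where inferencing at an IEDC means only compact metadata crosses the long‐haul network:

# All figures are hypothetical, chosen only to illustrate the effect.
CAMERAS = 1000
MBPS_PER_CAMERA = 5          # raw upstream video per camera
METADATA_FRACTION = 0.01     # size of inference results relative to raw video
COST_PER_GB_USD = 0.02       # assumed long-haul transport cost

def monthly_gb(mbps: float) -> float:
    seconds_per_month = 30 * 24 * 3600
    return mbps * seconds_per_month / 8 / 1000  # megabits to gigabytes

raw_gb = monthly_gb(CAMERAS * MBPS_PER_CAMERA)
edge_gb = raw_gb * METADATA_FRACTION  # only the insight leaves the IEDC
print(f"centralised processing: ${raw_gb * COST_PER_GB_USD:,.0f} per month")
print(f"edge-first processing:  ${edge_gb * COST_PER_GB_USD:,.0f} per month")

Under these invented numbers, the long‐haul transport bill shrinks by two orders of magnitude; the point is the shape of the saving, not the specific figures.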
The locality of a system describes both the physical and logical distances between key components of the system. In the context of infrastructure edge computing, the system we are most concerned with spans from a user located on the device edge to an application operating from an edge data centre at the infrastructure edge, a facility which itself is then connected to a regional data centre.
Locality is an important concept in system design. In many ways it is the summation of all of the previously described issues in this section; by addressing all of them, locality allows infrastructure edge computing to enable a new class of use case which generates large amounts of data and needs that data to be processed in a complex fashion in real time. This is the true driving factor of why the infrastructure edge computing model is needed; new use cases in addition to useful augmentations of existing use cases require the capabilities which it offers, and these use cases are valuable enough to make the design, deployment, and operation of infrastructure edge computing itself worthwhile.
With an understanding of the basic terminology and history behind infrastructure edge computing, as well as the primary factors, beyond specific use cases, which are driving its design, deployment, and adoption, we can explore an example of how edge computing operates in practice. This example will describe how each of the primary factors are addressed by infrastructure edge computing, as well as how interoperation can occur between the device edge, infrastructure edge, and RNDCs to make a useful gradient of compute, storage, and network resources from end to end.
To begin, let’s explore the operation of an application which needs only device edge computing to function. In this scenario, all of the compute and storage resources required are provided by a local device, in this example, a smartphone. Any data that is required is being generated locally and is not obtained from a remote location as the application operates, unlike if the application were reliant on the cloud. The application is entirely self‐contained at the user’s device, and so operates as follows in Figure 2.3:
In this case, the application is limited by the capabilities of the device itself. All of the resources that the application requires, such as to process data, display a complex 3D rendering to the user, or store data which results from the user’s actions, must be present on the local device and available to the application. If they are not, the application will either fail or its operation will be degraded, leaving the user with a suboptimal experience. The use of only device resources requires devices to be powerful enough to provide everything that is required by any application which the user may wish to use, which is especially detrimental to mobile devices, which must be battery powered and so are not capable of supporting dense amounts of compute and storage resources as may be needed.
The extent to which this is a drawback varies depending on the type of application and on the type of device in question. A lightweight application may operate exactly as intended on a device alone, whereas an application which introduces more of a mismatch between the capabilities of the device and the requirements of the application, such as performing high‐resolution real‐time computer vision for facial recognition on a battery‐powered mobile device, may either not operate at all or compromise the user experience, for example, by providing greatly reduced performance or poor battery life, to the extent that the application is unable to fulfil the needs of the user and so fails.
Figure 2.3 Self‐contained application operating on device.
Next, we will add an RNDC to the same application. This addition opens up significant new functionality and opportunities for the application but also comes with its own set of drawbacks. The user’s device is connected to the remote data centre using internet connectivity. The device connects to a last mile network, in this example a fourth generation (4G) LTE cellular network, and uses this connection to send and receive data to an application instance which is operating in the remote data centre. This application instance is now using a combination of device resources and data centre resources, most likely by utilising a public or private cloud service. Note, however, that the cloud is not a physical place in and of itself; it is a logical service which uses physical data centre locations and the resources present inside them to provide those services to its users. This distinction will become increasingly important throughout this book as the infrastructure used by the cloud includes not only RNDCs but also IEDCs (see Figure 2.4).
In this case, the application is able to call on not just the local resources which are available at the device but also remote resources located within the remote data centre in order to perform its functions. These resources are primarily processing power and data storage, both of which are capable of adding additional capabilities and levels of performance to the application which the device alone is unable to support, and access to them often greatly enriches the user experience.
One difficulty with this case is that the RNDC is typically located a large distance away from the end user and their device. This imposes two challenges on the application. First, when the transmission of large amounts of data is required, that data is sent using long‐distance network connectivity which, all other characteristics of the network being equal, is costlier and more prone to network congestion than the connectivity which would be required for a shorter distance between a device and its serving data centre. The other challenge is latency: should a real‐time element of the application be required which is not possible or practical to support using the local resources of the device, then the data centre must be physically located close enough to the device for the network connectivity between them to provide acceptable latency, so that the user experience will not be degraded and the application will be able to function as intended. This is often challenging as a user may be many hundreds or thousands of miles away from the data centre which is supporting their application, exacerbating these issues.
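To illustrate how distance alone can disqualify the RNDC for a real‐time feature, here is a small sketch (hypothetical distances and latency budgets, using the same propagation‐only model as the earlier example) that selects the closest site able to meet the application’s latency budget:

# Hypothetical sites and budgets; propagation-only latency model as before.
PROPAGATION_KM_PER_S = 200_000

def rtt_ms(distance_km: float) -> float:
    return 2 * distance_km / PROPAGATION_KM_PER_S * 1000

SITES_KM = {"IEDC": 15, "RNDC": 2500}  # distance from the user, illustrative

def pick_site(budget_ms: float) -> str | None:
    # Prefer the closest site whose round trip fits within the budget.
    for name, km in sorted(SITES_KM.items(), key=lambda item: item[1]):
        if rtt_ms(km) <= budget_ms:
            return name
    return None  # no site can satisfy this budget

print(pick_site(5.0))    # real-time feature: only the IEDC qualifies (~0.15 ms)
print(pick_site(100.0))  # tolerant workload: IEDC still preferred; RNDC ~25 ms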
Figure 2.4 Application with access to remote data centre resources.
Finally, let’s examine what this same use case looks like with the introduction of infrastructure edge computing. A single IEDC has been added to our previous topology, with its location in between the user’s device and the RNDC. In addition, the IEDC is interconnected with the last mile network which the device is connected to, and is connected back to the RNDC. These two elements are crucial to ensure optimal network connectivity, and they will be explored further in the next chapter.
In this case, the application has access to three sets of resources in increasing degrees of the total potential resources available: the device itself, the IEDC, and the RNDC. As can be seen in Figure 2.5
