Deep Learning Approaches for Security Threats in IoT Environments

An expert discussion of the application of deep learning methods to IoT security.

In Deep Learning Approaches for Security Threats in IoT Environments, a team of distinguished cybersecurity educators delivers an insightful and robust exploration of how to approach and measure the security of Internet of Things (IoT) systems and networks. In this book, readers will examine critical concepts in artificial intelligence (AI) and IoT, and apply effective strategies to help secure and protect IoT networks. The authors discuss supervised, semi-supervised, and unsupervised deep learning techniques, as well as reinforcement and federated learning methods for privacy preservation. The book applies deep learning approaches to IoT networks to solve the security problems that professionals frequently encounter when working in the field, and shows how smart devices themselves can help address cybersecurity issues. Readers will also get access to a companion website with PowerPoint presentations, links to supporting videos, and additional resources. They'll also find:

* A thorough introduction to artificial intelligence and the Internet of Things, including key concepts like deep learning, security, and privacy
* Comprehensive discussions of the architectures, protocols, and standards that form the foundation of deep learning for securing modern IoT systems and networks
* In-depth examinations of the architectural design of cloud, fog, and edge computing networks
* Detailed presentations of the security requirements, threats, and countermeasures relevant to IoT networks

Perfect for professionals working in the AI, cybersecurity, and IoT industries, Deep Learning Approaches for Security Threats in IoT Environments will also earn a place in the libraries of undergraduate and graduate students studying deep learning, cybersecurity, privacy preservation, and the security of IoT networks.
Page count: 467
Year of publication: 2022
Cover
Series Page
Title Page
Copyright Page
About the Authors
1 Introducing Deep Learning for IoT Security
1.1 Introduction
1.2 Internet of Things (IoT) Architecture
1.3 Internet of Things' Vulnerabilities and Attacks
1.4 Artificial Intelligence
1.5 Deep Learning
1.6 Taxonomy of Deep Learning Models
1.7 Supplementary Materials
References
2 Deep Neural Networks
2.1 Introduction
2.2 From Biological Neurons to Artificial Neurons
2.3 Artificial Neural Network
2.4 Activation Functions
2.5 The Learning Process of ANN
2.6 Loss Functions
2.7 Supplementary Materials
References
3 Training Deep Neural Networks
3.1 Introduction
3.2 Gradient Descent Revisited
3.3 Gradient Vanishing and Explosion
3.4 Gradient Clipping
3.5 Parameter Initialization
3.6 Faster Optimizers
3.7 Model Training Issues
3.8 Supplementary Materials
References
4 Evaluating Deep Neural Networks
4.1 Introduction
4.2 Validation Dataset
4.3 Regularization Methods
4.4 Cross‐Validation
4.5 Performance Metrics
4.6 Supplementary Materials
References
5 Convolutional Neural Networks
5.1 Introduction
5.2 Shift from Fully Connected to Convolutional
5.3 Basic Architecture
5.4 Multiple Channels
5.5 Pooling Layers
5.6 Normalization Layers
5.7 Convolutional Neural Networks (LeNet)
5.8 Case Studies
5.9 Supplementary Materials
References
6 Dive Into Convolutional Neural Networks
6.1 Introduction
6.2 One‐Dimensional Convolutional Network
6.3 Three‐Dimensional Convolutional Network
6.4 Transposed Convolution Layer
6.5 Atrous/Dilated Convolution
6.6 Separable Convolutions
6.7 Grouped Convolution
6.8 Shuffled Grouped Convolution
6.9 Supplementary Materials
References
7 Advanced Convolutional Neural Network
7.1 Introduction
7.2 AlexNet
7.3 Block‐wise Convolutional Network (VGG)
7.4 Network in Network
7.5 Inception Networks
7.6 Residual Convolutional Networks
7.7 Dense Convolutional Networks
7.8 Temporal Convolutional Network
7.9 Supplementary Materials
References
8 Introducing Recurrent Neural Networks
8.1 Introduction
8.2 Recurrent Neural Networks
8.3 Different Categories of RNNs
8.4 Backpropagation Through Time
8.5 Challenges Facing Simple RNNs
8.6 Case Study: Malware Detection
8.7 Supplementary Materials
References
9 Dive Into Recurrent Neural Networks
9.1 Introduction
9.2 Long Short‐Term Memory (LSTM)
9.3 LSTM with Peephole Connections
9.4 Gated Recurrent Units (GRU)
9.5 ConvLSTM
9.6 Unidirectional vs. Bidirectional Recurrent Network
9.7 Deep Recurrent Network
9.8 Insights
9.9 Case Study of Malware Detection
9.10 Supplementary Materials
References
10 Attention Neural Networks
10.1 Introduction
10.2 From Biological to Computerized Attention
10.3 Attention Pooling: Nadaraya–Watson Kernel Regression
10.4 Attention‐Scoring Functions
10.5 Multi‐Head Attention (MHA)
10.6 Self‐Attention Mechanism
10.7 Transformer Network
10.8 Supplementary Materials
References
11 Autoencoder Networks
11.1 Introduction
11.2 Introducing Autoencoders
11.3 Convolutional Autoencoder
11.4 Denoising Autoencoder
11.5 Sparse Autoencoders
11.6 Contractive Autoencoders
11.7 Variational Autoencoders
11.8 Case Study
11.9 Supplementary Materials
References
12 Generative Adversarial Networks (GANs)
12.1 Introduction
12.2 Foundation of Generative Adversarial Network
12.3 Deep Convolutional GAN
12.4 Conditional GAN
12.5 Supplementary Materials
References
13 Dive Into Generative Adversarial Networks
13.1 Introduction
13.2 Wasserstein GAN
13.3 Least‐Squares GAN (LSGAN)
13.4 Auxiliary Classifier GAN (ACGAN)
13.5 Supplementary Materials
References
14 Disentangled Representation GANs
14.1 Introduction
14.2 Disentangled Representations
14.3 InfoGAN
14.4 StackedGAN
14.5 Supplementary Materials
References
15 Introducing Federated Learning for Internet of Things (IoT)
15.1 Introduction
15.2 Federated Learning in the Internet of Things
15.3 Taxonomic View of Federated Learning
15.4 Open‐Source Frameworks
15.5 Supplementary Materials
References
16 Privacy‐Preserved Federated Learning
16.1 Introduction
16.2 Statistical Challenges in Federated Learning
16.3 Security Challenge in Federated Learning
16.4 Privacy Challenges in Federated Learning
16.5 Supplementary Materials
References
Index
End User License Agreement
Chapter 1
Table 1.1 A summary of common passive IoT attacks.
Table 1.2 A summary of common active IoT attacks.
Chapter 2
Table 2.1 Comparison between artificial neural network and biological neura...
Chapter 13
Table 13.1 A comparison between the loss functions of a GAN and a WGAN.
Table 13.2 Comparison between the loss functions of GAN, WGAN, and LSGAN.
Table 13.3 A comparison between the loss functions of CGAN and ACGAN.
Chapter 14
Table 14.1 Comparison between the loss functions of GAN and InfoGAN.
Chapter 1
Figure 1.1 Illustration of the three‐layered architecture of the IoT ecosyst...
Figure 1.2 Illustration of supervised deep learning.
Figure 1.3 Illustration of the workflow of deep unsupervised learning.
Figure 1.4 Illustration of reinforcement learning.
Figure 1.5 Illustration of federated deep learning over the IoT network.
Chapter 2
Figure 2.1 Illustration of the biological neuron (left) and artificial neuro...
Figure 2.2 Illustration of linear separable and nonlinear separable data.
Figure 2.3 Illustration of simple multi‐layer perceptron.
Figure 2.4 Illustration of the gradient descent.
Figure 2.5 Illustration of weight initialization process.
Figure 2.6 Illustration of gradient calculation as the derivative of loss fu...
Figure 2.7 Illustration of the gradient descent under (a) small learning rat...
Chapter 3
Figure 3.1 Illustration of zero‐initialization of two‐layer network.
Figure 3.2 Illustration of random‐initialization of two‐layer network.
Figure 3.3 Illustration of network underfitting (high bias).
Figure 3.4 Illustration of network overfitting (high variance).
Figure 3.5 Illustration of model capacity as a measure of overfitting and un...
Chapter 4
Figure 4.1 Illustration for data split for training, validation, and test su...
Figure 4.2 Early stopping regularization.
Figure 4.3 Illustration of dropout regularization.
Figure 4.4 Illustration of the holdout cross‐validation.
Figure 4.5 Illustration of the k‐fold cross‐validation.
Figure 4.6 Illustration of the stratified k‐fold cross‐validation.
Figure 4.7 Illustration of the repeated k‐fold cross‐validation.
Figure 4.8 Illustration of the stratified leave‐one‐out cross‐validation.
Figure 4.9 Illustration of the rolling cross‐validation.
Figure 4.10 Illustration of the blocking cross‐validation.
Figure 4.11 Illustration of the confusion matrix.
Figure 4.12 Illustration of receiver operating characteristic (ROC) curve....
Chapter 5
Figure 5.1 Illustration of a regular neural network compared to a convolutio...
Figure 5.2 Convolution operation applied on grid input. The colored cell in ...
Figure 5.3 Example of convolution layer with padding.
Figure 5.4 Example of convolution layer with strides of size 3 × 2.
Figure 5.5 Example of two‐dimensional convolution with two input channels.
Figure 5.6 Example of two‐dimensional convolution (1 × 1) with three input c...
Figure 5.7 Maximum pool layer with a pooling window of shape 2 × 2.
Figure 5.8 Average pool layer with a dimension of 2 × 2.
Figure 5.9 Illustration of batch normalization.
Figure 5.10 Illustration of batch normalization.
Figure 5.11 Illustration of instance normalization.
Figure 5.12 Illustration of group normalization.
Figure 5.13 The architecture of LeNet convolutional network.
Chapter 6
Figure 6.1 Illustration of the one‐dimensional convolutional layer with a ke...
Figure 6.2 Illustration of one‐dimensional pooling layers.
Figure 6.3 Illustration of three‐dimensional convolutional operation.
Figure 6.4 Illustration of transposed convolution.
Figure 6.5 Illustration of padded transposed convolution.
Figure 6.6 Illustration of padded transposed convolution.
Figure 6.7 Unrolled transposed convolution to matrix multiplication.
Figure 6.8 An illustration of transposed convolution with even overlap. (a) ...
Figure 6.9 An illustration of the transposed convolution with uneven overlap...
Figure 6.10 Illustration of the dilated (atrous) convolutional kernel.
Figure 6.11 Illustration of dilated (atrous) convolutional kernel with diffe...
Figure 6.12 Output calculation in standard convolution with a total of 3 × 3...
Figure 6.13 Illustration of a simple example of SS convolution.
Figure 6.14 Illustration of number of multiplications in standard convolutio...
Figure 6.15 Illustration of DS convolution: (a) depth‐wise convolution and (...
Figure 6.16 Illustration of standard vs. grouped convolutions: (a) a standar...
Figure 6.17 Convolution layers with three filter groups.
Figure 6.18 Shuffled convolution layers with three filter groups.
Chapter 7
Figure 7.1 Illustration of the architecture of the AlexNet vs. LeNet.
Figure 7.2 Illustration of the architecture of the VGG network.
Figure 7.3 Architecture of the NiN network.
Figure 7.4 Illustration of the architecture of sparsely connected convolutio...
Figure 7.5 Illustration of the architecture of the inception block.
Figure 7.6 Illustration of GoogLeNet model.
Figure 7.7 Illustration of the inception block v2.
Figure 7.8 Illustration of the inception block v3.
Figure 7.9 Illustration of the inception block v3.
Figure 7.10 Illustration of the residual block: (a) regular convolutional bl...
Figure 7.11 Illustration of the dense block.
Figure 7.12 Illustration of one‐dimensional convolution.
Figure 7.13 Illustration of calculation of output of one‐dimensional convolu...
Figure 7.14 Illustration of calculation of output of multichannel convolutio...
Figure 7.15 Illustration of an example of causal convolution.
Figure 7.16 Illustration of stacked causal convolution.
Figure 7.17 Illustration of temporal convolutional network with dilation bas...
Figure 7.18 Illustration of temporal convolutional network with the residual...
Figure 7.19 The final architecture of TCN.
Chapter 8
Figure 8.1 A simple recurrent neuron (left side), unrolled through time (rig...
Figure 8.2 A simple recurrent layer (left side), unrolled through time (righ...
Figure 8.3 A simple recurrent neuron with a hidden state distinct from the o...
Figure 8.4 Illustration of the simple architecture of vanilla RNN cell.
Figure 8.5 Computational graph of one‐to‐one recurrent neural network.
Figure 8.6 Computational graph of the one‐to‐many recurrent network.
Figure 8.7 Computational graph of the one‐to‐many recurrent network.
Figure 8.8 Illustration of the many‐to‐many recurrent network.
Figure 8.9 Computational graph illustrating the dependencies for a recurrent...
Chapter 9
Figure 9.1 An illustration of the generic structure of the LSTM model.
Figure 9.2 Illustration of the gating operations in an LSTM cell.
Figure 9.3 Illustration of calculation of the candidate state in an LSTM cel...
Figure 9.4 Illustration of calculation of the candidate state in an LSTM cel...
Figure 9.5 Illustration of calculation of the hidden state in an LSTM cell....
Figure 9.6 Illustration of the architecture of an LSTM cell with “Peephole C...
Figure 9.7 Illustration of computation of the gating mechanisms in a GRU cel...
Figure 9.8 Illustration of computation of the candidate state in a GRU cell....
Figure 9.9 Illustration of computation of the hidden state in a GRU cell.
Figure 9.10 Illustration of internal computation of different gates in a Con...
Figure 9.11 Illustration of the architecture of the bidirectional recurrent ...
Figure 9.12 Illustration of the architecture of the deep recurrent neural ne...
Chapter 10
Figure 10.1 Utilizing the nonvolitional cue according to saliency, attention...
Figure 10.2 Illustration of the task‐oriented volitional cue, attention is t...
Figure 10.3 Illustration of attention mechanisms based on attention over a s...
Figure 10.4 Illustration of the calculation of the output of attention pooli...
Figure 10.5 Illustration of MHA, where the output of attention heads is line...
Figure 10.6 Illustration of comparison between the self‐attention, recurrent...
Figure 10.7 Illustration of the structural design of transformer network.
Chapter 11
Figure 11.1 Illustration of generic autoencoder for reconstructing input ima...
Figure 11.2 Illustration of the architecture of vanilla autoencoder.
Figure 11.3 Illustration of the architecture of undercomplete autoencoders....
Figure 11.4 Illustration of the architecture of overcomplete autoencoders.
Figure 11.5 Illustration of the architecture of deep autoencoders.
Figure 11.6 Illustration of the generic architecture of convolutional autoen...
Figure 11.7 Illustration of the convolutional autoencoders for the reconstru...
Figure 11.8 Illustration of the architecture of denoising autoencoders.
Figure 11.9 Illustration of the architecture of sparse autoencoders.
Figure 11.10 Illustration of the generic architecture of variational autoenc...
Figure 11.11 Illustration of the architecture of variational autoencoders wi...
Figure 11.12 Illustration of the variational inference in variational autoen...
Figure 11.13 Illustration of reparameterization trick in variational autoenc...
Chapter 12
Figure 12.1 Illustration of the concepts of discriminator and generator with...
Figure 12.2 Illustration of the architecture of the GAN model consisting of ...
Figure 12.3 Illustration of the discriminator training as a binary classifie...
Figure 12.4 Illustration of the generator training as a binary classifier in...
Figure 12.5 Illustration of the structural design of the deep convolutional ...
Figure 12.6 Illustration of discriminator training in the conditional GAN mo...
Figure 12.7 Illustration of generator training in the conditional GAN model....
Chapter 13
Figure 13.1 Illustration of the EMD calculation between distribution x and d...
Figure 13.2 Training the WGAN discriminator requires fake data from the gene...
Figure 13.3 CGAN versus ACGAN generator training. The main difference is the...
Chapter 14
Figure 14.1 Illustration of the GAN with the entangled code and its variatio...
Figure 14.2 Illustration of the structural design and the training of discri...
Figure 14.3 The disentangled representation for both GAN and InfoGAN in the ...
Figure 14.4 Stack of encoders and generators in the StackedGAN.
Figure 14.5 Simple structure of one level of StackedGAN.
Figure 14.6 Generator loss calculation in StackedGAN.
Figure 14.7 Single level of StackedGAN with hidden features.
Figure 14.8 Conditional loss calculation in StackedGAN.
Figure 14.9 Entropy loss calculation in StackedGAN.
Figure 14.10 Independent training of StackedGAN.
Figure 14.11 Joint training of StackedGAN.
Chapter 15
Figure 15.1 Illustration of the main steps of training federated learning in...
Figure 15.2 Illustration of the centralized federated learning.
Figure 15.3 Illustration of the decentralized federated learning.
Figure 15.4 Illustration of categorization of federated learning based on th...
Figure 15.5 Illustration of the horizontal federated learning.
Figure 15.6 Illustration of the vertical federated learning.
Figure 15.7 Illustration of the federated transfer learning.
Chapter 16
Figure 16.1 Illustration of the concept of active learning in which the algo...
Figure 16.2 Illustration of the concept of soft and hard multitask learning:...
Figure 16.3 Illustration of the concept of transfer learning.
Figure 16.4 Illustration of the concept of active learning knowledge distill...
Figure 16.5 Illustration of the concept of blockchain FL vs. standard FL.
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854

IEEE Press Editorial Board
Sarah Spurgeon, Editor in Chief
Jón Atli Benediktsson
Andreas Molisch
Diomidis Spinellis
Anjan Bose
Saeid Nahavandi
Ahmet Murat Tekalp
Adam Drobot
Jeffrey Reed
Peter (Yong) Lian
Thomas Robertazzi
Mohamed Abdel‐Basset
Zagazig University
Egypt
Nour Moustafa
UNSW Canberra at the Australian Defence Force Academy
Australia
Hossam Hawash
Zagazig University
Egypt
Copyright © 2023 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permission.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data
Names: Abdel-Basset, Mohamed, 1985- author. | Moustafa, Nour, author. | Hawash, Hossam, author.
Title: Deep learning approaches for security threats in IoT environments / Mohamed Abdel-Basset, Zagazig University, Egypt, Nour Moustafa, UNSW Canberra at the Australian Defence Force Academy, Australia, Hossam Hawash, Zagazig University, Egypt.
Description: First edition. | Hoboken, New Jersey : John Wiley & Sons, Inc., [2023] | Includes bibliographical references and index.
Identifiers: LCCN 2022035671 (print) | LCCN 2022035672 (ebook) | ISBN 9781119884149 (hardback) | ISBN 9781119884156 (adobe pdf) | ISBN 9781119884163 (epub)
Subjects: LCSH: Internet of things--Security measures--Data processing. | Deep learning (Machine learning)
Classification: LCC TK5105.8857 .A255 2023 (print) | LCC TK5105.8857 (ebook) | DDC 004.67/8--dc23/eng/20220923
LC record available at https://lccn.loc.gov/2022035671
LC ebook record available at https://lccn.loc.gov/2022035672
Cover Design: Wiley
Cover Image: ©metamorworks/Shutterstock
Mohamed Abdel‐Basset
He received the B.Sc. and M.Sc. degrees from the Faculty of Computers and Informatics, Zagazig University, Egypt, and the Ph.D. degree from the Faculty of Computers and Informatics, Menoufia University, Egypt. He is currently an Associate Professor with the Faculty of Computers and Informatics, Zagazig University. His current research interests include data mining, computational intelligence, applied statistics, deep learning, security intelligence, and IoT. He has served as program chair for many conferences in the fields of AI, optimization, and complexity, and is an editor or reviewer for several reputable international journals and conferences.
Nour Moustafa
He is Postgraduate Discipline Coordinator (Cyber) and Senior Lecturer in Cybersecurity at the School of Engineering and Information Technology (SEIT), University of New South Wales (UNSW) Canberra, Australia. His areas of interest include cybersecurity, particularly network security, intrusion detection systems, statistics, and machine learning techniques. He is interested in designing and developing threat detection and forensic mechanisms for identifying malicious activities in cloud/fog computing, IoT, and industrial control systems over cyber‐physical systems. He is an ACM Distinguished Speaker and IEEE Senior Member, and has published more than 75 research outputs in top‐tier computing and security journals and conferences.
Hossam Hawash
Hossam Hawash received the B.Sc. and M.Sc. degrees from the Department of Computer Science, Faculty of Computers and Informatics, Zagazig University, Egypt. He is currently an Assistant Lecturer with the Department of Computer Science, Zagazig University. His research interests include machine learning, deep learning, the Internet of Things (IoT), cyber security, fuzzy learning, and explainable artificial intelligence.
Internet of Things (IoT) applications and relevant technologies are presently proliferating in all sectors of daily life, including intelligent transportation, smart buildings, smart healthcare, smart manufacturing, smart farming and irrigation, etc. As smart devices become more widely adopted, many security concerns surround the massive amounts of data they send and receive. Since many IoT applications require safety and defense, authentication and classification systems, as well as technologies sufficient to ensure integrity and confidentiality, are becoming increasingly important. An additional danger posed by criminal use of IoT devices is the potential impact on internet security and robustness across the board. Mirai, an IoT‐targeted malware, has demonstrated the disruptive power of malevolent operations and the need to implement proper defenses.
The purpose of this part is to emphasize the attributes of typical IoT systems and to present the most significant security risks that such systems may be subjected to in their operation. The most significant breakthrough that the IoT has brought to our world is the conversion of standard physical things into digital things that interconnect with each other through the internet using a variety of networking protocols and communication technologies such as 5G and 6G communications [1].
While it may appear to be a straightforward concept, an IoT system involves many dynamic components that must operate jointly for the system to accomplish its tasks appropriately and efficiently. It is crucial that all these various cogs in the machinery run together when the IoT solution is functioning as required. In this regard, IoT architecture can be defined as a framework that characterizes the physical elements, the operational structure, network configurations, patterns of data, and the operational procedures to be applied. However, since the IoT spans a wide range of technologies, there is no standard reference architecture; there is no unified and easy‐to‐follow template for all viable applications [2].
When it comes to implementing the IoT, a wide range of architectures and protocols can be used to support many network applications. The architecture of an IoT system may vary greatly according to its implementation; it must therefore be flexible or standardized enough to allow the development of a diversity of smart applications.
Although there is no globally approved standard IoT architecture, a three‐layer design is the most common and broadly accepted architecture in both the research and industrial communities, and research efforts continue to improve on this architecture to cope with recent developments. The three‐layer architecture of an IoT system is shown in Figure 1.1. As illustrated, the IoT ecosystem can be decomposed into three primary operational layers, namely the physical layer, the network layer, and the application layer. Each of these layers can be further partitioned into additional fundamental sublayers. Each layer is briefly summarized in the following subsections, with special emphasis placed on the specific sublayers found within it [3].
Figure 1.1 Illustration of the three‐layered architecture of the IoT ecosystem.
Source: Geralt/Pixabay.
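The layered decomposition just described can be sketched as a minimal Python model. The class and responsibility names below are illustrative choices, not terminology from the text:

```python
from dataclasses import dataclass, field

@dataclass
class Layer:
    """One operational layer of the three-layer IoT architecture."""
    name: str
    responsibilities: list = field(default_factory=list)

# The three primary operational layers of the IoT ecosystem.
iot_stack = [
    Layer("physical",    ["sensing", "actuation", "raw data collection"]),
    Layer("network",     ["communication", "middleware"]),
    Layer("application", ["big data analytics", "business intelligence",
                          "user-facing software"]),
]

def layer_for(task: str) -> str:
    """Return which layer owns a given responsibility."""
    for layer in iot_stack:
        if task in layer.responsibilities:
            return layer.name
    raise KeyError(task)
```

Mapping each concern to exactly one layer in this way mirrors the separation the chapter describes: raw perception at the bottom, transport and mediation in the middle, and interpretation at the top.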
The physical layer, sometimes called the perception layer, includes both the perceptive activities and the fundamental networking resources provided by physical IoT devices.
Perceptive functionality covers the primary activities of physical things: detecting, accumulating, and handling the data perceived from the real world as efficiently as possible. Thus, the perception layer incorporates various sensors, such as gas sensors, proximity sensors, infrared sensors, and motion sensors, in addition to actuators, which perform actions on real‐world objects. For setup purposes, a plug‐and‐play method is typically applied at this layer to deal with the variability of sensors and actuators. Because of their limited battery capacity and compute performance, IoT devices are resource‐constrained in many ways. A significant portion of the big data volumes currently overflowing cyber‐physical systems originates at this IoT layer; however, these data arrive in raw format, and interpreting them correctly is a critical stage in developing a secure, efficient, and scalable IoT system. Indeed, efficient knowledge of IoT big data can yield a variety of advantages, but this interpretation is typically the responsibility of the application layer [4, 5].
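The division of labor above, where perception-layer devices emit raw, uninterpreted measurements and defer meaning-making to the application layer, can be illustrated with a small simulation. All names (`SensorReading`, `perceive`, the device id) are hypothetical:

```python
import random
from dataclasses import dataclass

@dataclass
class SensorReading:
    device_id: str
    kind: str      # e.g. "gas", "proximity", "infrared", "motion"
    value: float   # raw, uninterpreted measurement

def perceive(device_id: str, kind: str, samples: int, seed: int = 0) -> list:
    """Simulate a perception-layer device emitting raw readings.
    Interpretation of the values is deferred to upper layers."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [SensorReading(device_id, kind, rng.uniform(0.0, 1.0))
            for _ in range(samples)]

readings = perceive("gas-07", "gas", samples=3)
```

Note that nothing in `perceive` decides whether a value is normal or alarming; that judgment belongs to the analytics described later for the application layer.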
On the other hand, smart devices operate in resource‐ and power‐constrained contexts, and their communication features must cope with lossy and noisy communication environments. Consequently, low‐energy physical‐layer connections are required to communicate the data obtained by the sensors.
Among the most important technologies for IoT communications at this layer are Bluetooth, wireless fidelity (Wi‐Fi), Zigbee, ultra‐wideband (UWB), radio frequency identification (RFID), low‐power wide‐area network (LPWAN), and near‐field communication (NFC), all of which confront the aforementioned issues.
The network layer includes communication facilities as well as middleware functions, both of which are important for the sustainability of IoT systems. For communication facilities, the resource restrictions of IoT devices must be delicately examined before implementation. One of the most difficult tasks at this layer is providing a distinctive internet protocol (IP) address to each of the millions of interconnected IoT devices. By taking advantage of the IPv6 addressing protocol, the severity of this problem can be gradually reduced. Another communication issue in this layer is the volume of the transmitted packets, which is addressed by adopting appropriate protocols, such as IPv6 over low‐power wireless personal area network (6LoWPAN), that are capable of providing timely compression capabilities. A third problem pertains to transmission utilities, as transmission protocols should take into consideration the restricted resources of the physical layer, as well as the mobility and plasticity of internet‐connected things. As a remedy, the routing protocol for low‐power and lossy networks (RPL) has been developed as a distance‐vector routing protocol that works over IEEE 802.15.4 channels and is characterized by its power efficiency. This protocol supports two modes of communication: one‐to‐one communication as well as multi‐hop many‐to‐one communication.
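The scale of the IPv6 remedy is easy to verify with Python's standard-library `ipaddress` module. The subnet prefix below is a documentation-range example, not an address from the text:

```python
import ipaddress

# IPv4 offers 2**32 addresses, IPv6 offers 2**128, which is why the
# chapter notes that IPv6 gradually relieves the IoT addressing problem.
ipv4_space = 2 ** 32
ipv6_space = 2 ** 128

# A hypothetical 6LoWPAN deployment: a single /64 prefix leaves 64 bits
# for device interface identifiers, far more than any deployment needs.
subnet = ipaddress.ip_network("2001:db8:1234:5678::/64")
device_addresses = subnet.num_addresses  # 2 ** 64 addresses in one subnet
```

Even one /64 subnet can thus number every device in any realistic deployment, while the full IPv6 space exceeds IPv4's by a factor of 2**96.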
On the other hand, middleware functions often relate to a software layer that sits between the application and network layers and that is capable of addressing communication and computation concerns in a cooperative manner with both. Middleware can act as an intermediary between smart things, enabling communication among devices that would otherwise be inaccessible to one another. As "software glue," middleware makes it simpler for developers and engineers to establish input/output operations and communications, allowing them to concentrate on the specific aim of their application rather than becoming bogged down in the technical details.
A variety of useful operations become possible in an IoT environment through middleware functions: first, collaboration and interoperability between diverse IoT devices, so that varying smart objects can communicate with one another seamlessly; second, scalability, to control numerous smart things at the same time; third, device and content look-ups; fourth, sustainability and responsiveness of IoT components; and fifth, security and privacy. IoT devices are deeply integrated into our daily lives in the form of smart pillows, smartwatches, smart doors, and smart TVs, which raises users' concerns about the security and privacy of their stored or routed data; middleware can empower the IoT with security mechanisms such as identity management and authentication. Finally, when it comes to IoT and the cloud, a flawless and robust connection between these two realms is essential.
Big data analysis, business intelligence, and software application logic are typically found in the uppermost layer. The physical layer of the IoT ecosystem collects a massive amount of valuable data, which is then analyzed using big data analytics. This data is huge in volume, generated at high velocity, and comes in a wide variety of formats. It is necessary to incorporate big data analytical methodologies into the overall IoT architecture, where ML algorithms can contribute significantly to extracting value from this huge data and converting it into useful information. Tasks devoted to delivering services for a particular IoT community based on predefined business objectives also fall under the purview of business intelligence, which belongs to the application layer as well. Business intelligence usually overlaps with data analytics, and the two are often used together in this portion of the process to uncover insights and make predictions or suggestions for improving the end result or generating the best possible tactical business strategies. The application layer also includes the software that facilitates communication between the whole IoT infrastructure and the end users, whether they are regular residents, city managers, or factory managers. The IoT architecture's software components are customized to meet the specific requirements of each application. As examples, we can think of "smart cities," "smart healthcare," "smart transportation," "smart agriculture," and so forth.
A lot of research effort has been devoted to studying and categorizing IoT vulnerabilities and attacks. Taking the attacker's activity as a categorization criterion, IoT attacks can be broadly classified as passive and active attacks. However, this is a coarse-grained taxonomy, so the two classes of attacks can be further categorized according to the layers of a conventional IoT infrastructure defined in the previous section.
In passive attacks, the attacker observes and copies all communications. Such attackers are primarily concerned with keeping tabs on the transmissions and accumulating the desired data; the adversary makes no attempt to alter the material they have obtained. Although these attacks pose no threat to the system's operation, they might pose a severe threat to the privacy of users' data.
As opposed to active attacks, which alter data or information, passive attacks are more difficult to detect; as a result, the victim is typically unaware of the attack. Nevertheless, encryption techniques can be used to counter them, rendering the communication unintelligible to hackers at any point during transmission. This is why, for passive attacks, prevention is more important than detection. As shown in Table 1.1, an IoT system is prone to different passive attacks, including eavesdropping, node outage, node tampering, node malfunctioning, and traffic analysis. In multiple studies, node tampering, outage, or malfunctioning are regarded as active attacks. Here, however, they are grouped with the passive ones because they do not pose as significant a risk to the network as the active attacks do: they do not create a single point of failure in the IoT system, which can remain functional without the participation of the dead nodes.
The illegal activities performed during active attacks include attacks on the privacy as well as the integrity of data. Additionally, active attacks can target unauthorized access to and consumption of resources, as well as the disruption of the victim's communication channels. During an active attack, the attacker emits a radio signal or performs an action that can be detected by the IoT components. A denial-of-service (DoS) attack on the network or physical layers, for instance, could cause network nodes to lose data packets.
Generally speaking, DoS is a popular attack on the availability of networking facilities. In general, DoS is defined as any circumstance that exhausts network resources and reduces networking capacity, distracting the network from carrying out its functions appropriately or within a reasonable timeframe. In other words, a DoS attack can be considered an attempt to inhibit authorized users from gaining access to some facility. A traditional method of accomplishing this is flooding packets toward a centralized network resource, such as an access point, so that the resource becomes inaccessible to the other nodes of the network and, as a result, the network ceases to function as expected and may fail to provide the promised services to end users. A serious variant of DoS is the distributed DoS (DDoS) attack, which occurs when adversaries fool or take control of a large number of IoT devices and turn them into zombies for the purpose of creating a botnet [1].
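As a rough illustration of the flooding idea, the following sketch flags sources whose packet counts in a monitoring window exceed a threshold; the function name, the threshold value, and the traffic data are all hypothetical and not taken from any particular intrusion detection system:

```python
from collections import Counter

def detect_flood(packets, threshold=100):
    """Flag source addresses whose packet count in a monitoring
    window exceeds a (hypothetical) flooding threshold."""
    counts = Counter(src for src, _ in packets)
    return sorted(src for src, n in counts.items() if n > threshold)

# One source floods the gateway while another behaves normally.
traffic = ([("node-a", t) for t in range(500)]
           + [("node-b", t) for t in range(20)])
print(detect_flood(traffic))  # → ['node-a']
```

Real DoS defenses are far more involved (rate limiting, traffic shaping, upstream filtering), but the sketch also hints at why a DDoS is harder to counter: with thousands of zombie sources, no single per-source counter crosses the threshold.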
Table 1.1 A summary of common passive IoT attacks.
Attack
Description
Eavesdropping
Communication lines can be tapped to gather sensitive information, and eavesdropping is easier on wireless networks. Compared to long-range wireless technologies, short-range connections require an adversary to be in the near vicinity to acquire valuable information, so short-range IoT devices are less vulnerable to eavesdropping. The actual location of individual nodes may be revealed by intercepting IoT messages.
Node outage
This attack occurs when a node fails to perform as expected. Network protocols must be as robust as necessary to eliminate the undesirable consequences of node failures, for example by selecting new cluster heads after a cluster-head failure and/or offering alternative network paths.
Node tampering
A node can be physically destroyed, using physical force, an electrical surge, or bullets, such that it is no longer operational.
Node malfunctioning
Such attacks can result from a variety of causes, such as overloaded sensors, other denial-of-service (DoS) attacks, sensory defects, or energy shortages.
Traffic analysis
An adversary extracts insightful information from traffic patterns, which are as important as the network packets themselves. Traffic patterns can be studied to learn about the network's structure, and other sensitive information, such as behaviors and intents, might also be revealed. When it comes to strategic communications, silence can signify an attack, a tactical decision, or an operative's infiltration; likewise, a rapid surge in traffic can be an indication of an attack or raid.
In active attacks, hackers directly impact the functions of the underlying IoT system; this impact could be the goal of the attack, and it is also how the attack can be discovered and defended against. In the case of communication networks, for instance, these attacks may cause services to be degraded or even shut down. Occasionally, the adversary strives to remain hidden, seeking to acquire illegal access to system resources or threatening the privacy and/or integrity of the network's packets [6]. A common set of active IoT attacks is categorized into three major groups according to the three layers of the IoT architecture, as presented in Table 1.2.
Table 1.2 A summary of common active IoT attacks.
Layer
Attack
Description
Physical
Node tampering
An attacker controls a sensory node by physically accessing it to alter its content and functioning.
Jamming DoS
By transmitting on the same frequency, a rogue device can block a signal from being sent. The jamming signal adds noise, reducing the signal-to-noise ratio below the level required by the nodes using that channel. A zone can be continually jammed, preventing all the nodes in that zone from communicating with each other, or communications can be momentarily disrupted by jamming for short periods at random intervals.
Network
Collision
An attacker broadcasts on the same channel once a legitimate node of the network begins transmission. The two broadcasts collide, the receiving node becomes unable to interpret the arriving data, and the same packet must therefore be retransmitted.
Denial of sleep
This attack is performed by executing repetitive handshaking and collision attacks to prevent the node from moving into the sleep phase. This can lead to energy depletion in battery-powered IoT devices.
De‐synchronization
An adversary transmits information in the timeslots devoted to another client, causing packets to be lost or to collide.
Exhaustion
An attack that is executed by continuing the collision attack until the energy of the targeted node is depleted.
Unfairness
Irregular use of exhaustion attacks or misuse of cooperative MAC protocols, leading to network unfairness.
Spoofing
A malevolent node spoofs the MAC address of a target node, creates several legitimate-looking identities derived from the victim node, and uses these identities elsewhere in the network.
6LoWPAN Exploit
An adversary injects their own fragments into the fragmentation chain.
Flooding
An adversary broadcasts advertisement information to the entire network, persuading the neighboring nodes to accept it as a legitimate neighbor or route.
Clone
An adversary deliberately places duplicates of a compromised node at many sites in the network to cause inconsistency.
Application
Path‐based DoS
An adversary overloads remote nodes by flooding an end-to-end communication route with fake or replayed packets.
False data injection
An adversary intentionally injects false data into the measurements of a node to change its overall outcome or reading.
Software attacks
A software attack is carried out by means of malicious software, such as Trojan horses, worms, or viruses, which is introduced into a target computer system and then proceeds to do harm, such as draining resources, destroying data, stealing money, or clogging up networks.
Privacy leakage
Private user data that is stored or sent by IoT smart objects is at risk of being leaked. An assault on a wearable device can yield a wealth of personal information, including heart rate, GPS location, phone calls and messages received and sent, and so on.
To answer the question "Could computers think?," artificial intelligence (AI) was originally proposed in 1950. AI was subsequently defined as a branch of computer science that focuses on addressing problems that are difficult for humans but easy for computers: problems that can be described by a formal, mathematical set of rules. Intrinsically, AI is a very wide-ranging field that includes both learning-based and non-learning-based techniques. For a long time, computer science experts assumed that human-level AI could be achieved simply by defining a sufficiently large set of explicit rules for handling and fusing knowledge.
Synonymous with "symbolic AI," this paradigm dominated the AI community between 1950 and 1980. By 1980, symbolic AI had reached the zenith of its fame, thanks to the success of expert systems. Although symbolic AI proved to be a good fit for solving logical, formally specified challenges (such as playing chess), it was not able to discover explicit rules for more complex and ambiguous situations.
Providing answers to problems that can be solved quickly by a human but are hard to articulate in a formalized setting, such as speech analysis and fraud detection, is the real challenge for AI systems. IoT security can be evaluated and formulated as such a problem, necessitating a smart approach to safeguard and secure IoT systems against possible attacks.
The large volumes of IoT data generated from the daily interactions among different IoT entities (i.e. humans, software, and hardware devices) can be explored and learned from using AI algorithms. IoT risks or malicious behaviors can be detected early in the IoT system's lifecycle by aggregating and analyzing data provided by various IoT sectors, which can then be studied by one or more AI approaches to distinguish regular behaviors from malicious ones. AI can also be critical in predicting new IoT risks, which are almost always variations of previous threats, because it can intelligently anticipate potential unknown attacks based on the knowledge learned from the existing IoT data. It is also possible that AI might help the IoT system automatically identify the most suitable defense mechanism or solution for diverse threats. There should thus be a shift in IoT-based systems toward security-based intelligence rather than just secure communication across IoT parties.
Machine learning (ML) is regarded as a widely applied AI paradigm, and it has achieved notable success across a broad range of application domains, including intrusion detection, attack defense mechanisms, authentication, etc. ML is an AI subarea that was first demonstrated as a computer system (playing checkers) that could learn from a significant quantity of historical data by utilizing self-adapting computing methods. In reality, the complexity of conventional computational algorithms prevents them from adapting to constantly changing system conditions and dynamically changing requirements. Rather than being fully preprogrammed, ML algorithms learn from past experience, allowing them to assist in prediction as well as decision-making operations; this is accomplished by creating a data-driven, self-adjusting system [7].
ML is a subfield of AI that emphasizes the programming of computational machines in such a way that they could learn from data.
ML has a slightly more general definition as follows:
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
–Arthur Samuel, 1959
ML has a more industrial‐tailored meaning as follows:
A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
–Tom Mitchell, 1997
Why use ML for IoT security?
Think about developing intrusion detection using conventional programming language:
First, you need to know how an intrusion is performed and what an attack looks like. You might notice that attackers generally look for vulnerabilities in order to gain access without being identified, typically by performing actions that follow certain patterns.
Next, you implement the detection program to identify these malicious patterns and warn you if something is detected.
Then, you test your program and repeat the previous steps many times until it becomes good enough.
Given the complexity of the problem, you can expect your program to become a long list of complicated rules that is extremely time-consuming to maintain. On the other hand, ML algorithms enable the development of a solution that automatically learns the normal patterns of system behavior and, likewise, the patterns related to intrusions. The resulting program is shorter, simpler to maintain, and usually more accurate than the rule-based version.
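As a minimal sketch of the learned alternative, the snippet below fits a statistical profile of benign behavior (here, the mean and standard deviation of a single hypothetical feature, such as packet size) and flags values that deviate too far from it; the function names, the feature, and the three-sigma threshold are illustrative assumptions rather than a prescribed design:

```python
def fit_normal_profile(samples):
    """Learn the mean and standard deviation of benign feature values."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var ** 0.5

def is_intrusion(value, mean, std, k=3.0):
    """Flag values more than k standard deviations from the learned mean."""
    return abs(value - mean) > k * std

# Fit on benign packet sizes, then score new observations.
mean, std = fit_normal_profile([98, 100, 102, 99, 101, 100, 97, 103])
print(is_intrusion(100, mean, std))   # typical size → False
print(is_intrusion(1500, mean, std))  # oversized packet → True
```

No attack rules are written by hand: updating the detector means refitting the profile on fresh benign data rather than editing a rule list.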
Deep learning (DL) is an ML technology that creates deeper versions of neural networks (NNs), imitating the composition and functionality of the human brain. Hierarchical learning and deep structured learning are alternative names for DL, which implies stacking a large number of hidden layers that nonlinearly process the data, transforming it into various stages of abstraction in order to extract the important features and representations [8]. In other words, DL offers a mathematical model that learns to map a set of input data to a particular output by learning the relationship between them.
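The stacking of nonlinear layers can be illustrated with a toy forward pass; the layer sizes, weights, and sigmoid activation below are arbitrary choices for illustration, not a trained model:

```python
import math

def dense(x, w, b):
    """One fully connected layer: weighted sums of the inputs passed
    through a nonlinear activation (the logistic sigmoid)."""
    return [1 / (1 + math.exp(-(sum(wi * xi for wi, xi in zip(row, x)) + bi)))
            for row, bi in zip(w, b)]

# Two stacked layers map a 3-feature input to a single output score;
# each layer re-represents its input at a new level of abstraction.
x = [0.5, -1.0, 2.0]
h1 = dense(x, w=[[0.2, -0.1, 0.4], [0.3, 0.2, -0.5]], b=[0.0, 0.1])
out = dense(h1, w=[[0.6, -0.4]], b=[0.0])
```

Training would adjust `w` and `b` to minimize a loss, which is where the learned mapping from input to output comes from.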
Although ML and DL share many similarities, they are not interchangeable. When the dataset is small and well-curated, for example, classical ML can be beneficial because the data has been meticulously prepared. Such data preparation necessitates human involvement, however, and classical ML algorithms tend to be unable to extract information from huge and complicated datasets, where they would underfit. ML is sometimes referred to as "shallow learning" because of its ability to learn from minimal datasets. DL, by contrast, shows robust performance even when the dataset is huge: it is capable of deducing precise conclusions on its own from a set of data, no matter how complicated the pattern is, and it is powerful enough to handle unstructured data that is not sufficiently organized.
DL solutions can be classified into many different categories according to multiple categorization criteria:
Supervision criterion: It specifies the need for human supervision during the training (supervised, unsupervised, semi‐supervised, and reinforcement learning).
Incrementality criterion: It specifies whether the learning is performed incrementally on arriving data (online) or in batch on the full dataset (offline).
Centralization criterion: It specifies whether the learning is centralized or distributed in nature.
Base criterion: It specifies whether the learning is performed in a model‐based or instance‐based strategy.
You can use any of these criteria exclusively or combine several of them. For example, an intrusion detection system might train a deep network in a distributed way using normal and attack traffic data instances. This makes it a distributed, supervised, and model-based system.
DL solutions can be categorized according to the degree and the kind of supervision they use during training. This results in four major categories of methods, namely supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. In the following sections, a broad view of each of these categories will be discussed and explained.
Supervised DL simulates the theory of learning in human beings, where the learner is trained and taught under the guidance of a teacher. The distinction is that the student in this case is a deep NN, while labeled training data acts as the supervisor that teaches the network to correctly predict the output. Supervised DL simply receives, as input, the data samples x and the corresponding actual outputs y (sometimes called the ground truth), then trains the underlying network to automatically learn a mapping function from input x to output y. In other words, the network is trained to learn and discover the inherent patterns and relationships between the input samples and the ground-truth labels, enabling it to correctly calculate the outputs for never-before-seen data. Upon completion of the training procedure, the network is evaluated on an unseen subset of data (the test set) by predicting the output of the test samples. If the network calculates the correct outputs, the network is effective. Therefore, datasets comprising inputs and ground-truth labels turn out to be important because they help the deep network learn efficiently.
Many advantages can be gained from supervised learning. It does extremely well at optimizing performance in well-characterized tasks with a lot of ground-truth labels. For instance, think about a big dataset of images, where each image is labeled. When the dataset is as large as necessary, and the training is performed using appropriate DL models on sufficiently powerful computers, it is easy to build a very good supervised image classifier. As supervised DL learns from labeled data, its performance can be evaluated via a loss function by contrasting the predicted label with the actual image label. The DL model will seek to reduce this loss function such that the error on never-before-seen images from a holdout set is as low as possible [9]. A visual illustration of supervised learning can be found in Figure 1.2.
Generally speaking, a common task for supervised DL is classification. Intrusion detection is an ideal instance of this: the model is trained with many IoT traffic flows together with their class (normal or attack), and it should learn how to classify IoT traffic flows. Another typical task is to forecast the numerical value of a target variable, such as a stock price, given a set of features (country, currency, brand, economy, quality, etc.) known as predictors. This kind of task is known as regression. To train a deep network for it, you provide many historical examples together with the actual values of the target variable (i.e. the previous stock prices).
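The classification task can be sketched with the simplest possible supervised learner, a single perceptron trained on a handful of made-up traffic features; the feature choice (packets per second and mean packet size) and the labels are hypothetical, and a real system would use a deep network and far more data:

```python
def train_perceptron(data, labels, epochs=20, lr=0.1):
    """Learn weights mapping feature vectors x to labels y in {0, 1}."""
    w, b = [0.0] * len(data[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred  # supervised signal: ground truth minus prediction
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Hypothetical flows: [packets/sec, mean packet size]; 1 = attack, 0 = normal.
X = [[1.0, 0.2], [0.9, 0.1], [0.2, 0.9], [0.1, 1.0]]
y = [1, 1, 0, 0]
w, b = train_perceptron(X, y)
print([predict(w, b, x) for x in X])  # → [1, 1, 0, 0]
```

The ground-truth labels drive every weight update, which is exactly what distinguishes supervised learning from the unsupervised case discussed next.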
An attribute is an aspect of an instance (e.g. "brand"). Attributes are also known as features, although "feature" can have several meanings depending on the context; many people use the terms "feature" and "attribute" interchangeably. In supervised learning, the class label is a special attribute that defines the class to which a specific instance belongs.
Figure 1.2 Illustration of supervised deep learning.
As the name implies, unsupervised DL denotes the category of deep networks that are not supervised by means of labeled training data. Instead, they seek to discover hidden patterns and insights in the underlying unlabeled data, much as the human brain does when learning new things. Unsupervised learning cannot be applied directly to a regression or classification problem since, unlike in supervised learning, the model receives input data with no ground-truth labels. The objective of unsupervised learning is to discover the inherent composition of the dataset, group the data in accordance with relationships, and characterize the dataset in a compact form. In other words, the algorithms can operate autonomously to learn more about the data and discover remarkable or unanticipated patterns that human beings were not searching for.
Categories: As with ML, two common categories of unsupervised DL exist, namely clustering and association. The former covers DL models designed to group the data instances into clusters such that instances with the greatest similarity belong to the same cluster, while instances with little or no similarity belong to different clusters. The latter category covers DL models designed to discover the interactions between variables in a big dataset by determining the sets of data instances that occur together [10]. A visual illustration of unsupervised learning can be found in Figure 1.3.
Figure 1.3 Illustration of the workflow of deep unsupervised learning.
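Clustering can be illustrated with a minimal one-dimensional k-means, the classic (shallow) clustering algorithm; the sensor readings and the choice of two clusters are hypothetical, and deep clustering models would first learn a representation before grouping:

```python
def kmeans_1d(points, k=2, iters=10):
    """Minimal 1-D k-means: group scalar readings into k clusters."""
    centers = sorted(points)[::max(1, len(points) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center.
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Move each center to the mean of its assigned points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled sensor readings separate into two regimes with no labels given.
readings = [1.0, 1.2, 0.9, 1.1, 8.0, 8.2, 7.9]
centers, clusters = kmeans_1d(readings)
```

No ground truth is consulted at any point: the structure (two regimes of readings) is discovered from the data alone, which is the defining property of the clustering category above.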
Semi-supervised learning is a training strategy that combines unsupervised and supervised learning by training the deep network using a small amount of labeled data together with a larger amount of unlabeled data. It is usually tailored to the many practical problems where data labels are hard to obtain because they require immense effort from human annotators, expensive devices, or time-consuming experiments [11]. A common pseudo-labeling workflow proceeds in four steps:
Step 1: Train the deep network using the labeled (supervised) part of the data until it obtains optimal results.
Step 2: Use the supervised trained model (from the previous step) to predict the labels of the unlabeled data samples. These generated labels are known as pseudo labels.
Step 3: Combine the pseudo labels with the supervised labels from the earlier steps, and likewise combine the corresponding unlabeled samples with the labeled samples.
Step 4: Train the deep network on this new combination of data samples until it reaches optimal performance.
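The steps above can be sketched end to end; for brevity the "trained network" of Step 1 is stood in for by a nearest-neighbor predictor, and the readings and labels are invented for illustration:

```python
def nearest_label(x, labeled):
    """Stand-in for the model of Step 1: predict the label of the
    nearest labeled point."""
    return min(labeled, key=lambda pair: abs(pair[0] - x))[1]

def pseudo_label(labeled, unlabeled):
    """Steps 2-3: generate pseudo labels for the unlabeled samples and
    merge them with the labeled set."""
    return labeled + [(x, nearest_label(x, labeled)) for x in unlabeled]

labeled = [(0.1, "normal"), (0.2, "normal"), (5.0, "attack")]
unlabeled = [0.15, 4.8, 5.2]
combined = pseudo_label(labeled, unlabeled)
# Step 4 would retrain the deep network on `combined`.
print([lbl for _, lbl in combined[3:]])  # → ['normal', 'attack', 'attack']
```

The value of the scheme is that the expensive human effort is limited to the three labeled points, while the training set grows to six.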
Reinforcement learning (RL) is a branch of ML that allows for automated, goal-oriented learning through interaction with the environment: the learner (agent) learns from the consequences of its actions rather than from being explicitly told which actions to take. The agent does not have to be a software entity, such as you might see in video games; it could instead be embedded in any IoT device. RL is perhaps the most effective method of genuinely capturing and utilizing real-world experience because it involves actual interaction with the real world and the receipt of feedback [12].
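Learning from the consequences of actions can be sketched with tabular Q-learning on a toy "corridor" environment; the corridor, the reward of 1 at the final state, and all hyperparameters are illustrative choices, and real RL agents for IoT would face far richer state and action spaces:

```python
import random

def q_learning(n_states=5, episodes=300, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning: the agent starts at state 0 and is rewarded
    only upon reaching the last state; actions are 0=left, 1=right."""
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy: mostly exploit, occasionally explore.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[s][i])
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0  # consequence of the action
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

random.seed(0)
q = q_learning()
# The learned policy moves right (toward the reward) in every state.
policy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
```

No one tells the agent which action is correct; the preference for moving right emerges purely from the rewards its actions produce.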