An exploration of connected intelligent edge, artificial intelligence, and machine learning for B5G/6G architecture
Artificial Intelligence for Future Networks illuminates how artificial intelligence (AI) and machine learning (ML) influence the general architecture and improve the usability of future networks like B5G and 6G through increased system capacity, low latency, high reliability, greater spectrum efficiency, and support of massive internet of things (mIoT).
The book reviews network design and management, offering an in-depth treatment of AI-oriented future network infrastructure. Providing up-to-date material on AI-empowered resource management and an extensive discussion of energy-efficient communications, this book incorporates a thorough analysis of recent advances and potential applications of ML and AI in future networks.
Each chapter is written by an expert at the forefront of AI and ML research, highlighting current design and engineering practices and emphasizing challenging issues related to future wireless applications.
Some of the topics include:
Identifying technical roadblocks and sharing cutting-edge research on developing methodologies, Artificial Intelligence for Future Networks is an essential reference on the subject for professionals and researchers involved in the field of wireless communications and networks, along with graduate and PhD students in electrical and computer engineering programs of study.
Page count: 669
Publication year: 2024
Cover
Table of Contents
Title Page
Copyright
Dedication
About the Editors
List of Contributors
Acknowledgments
1 Intelligent Beam Prediction and Tracking
1.1 Introduction
1.2 Challenge of Beam Prediction Modeling in Wireless Communications
1.3 Prior Identification – Perspective of Function Space
1.4 Methodology from Stochastic Process
1.5 Stochastic Continuity – Beam Index Difference
1.6 Stochastic Smoothness – Hybrid Data-induced Kalman Filtering
1.7 Beam Width Optimization
1.8 Numerical Results
1.9 Conclusion
References
Notes
2 Signal Detection with Machine Learning
2.1 Introduction
2.2 Symbol Detection
2.3 Modulation Detection
2.4 Source Detection
2.5 Conclusion
References
3 AI-Aided Channel Prediction
Acronyms
3.1 Introduction
3.2 Preliminaries
3.3 Previous Work
3.4 Experimental Evaluations
3.5 Discussion
3.6 Summary
References
4 Semantic Communications
4.1 Introduction
4.2 Semantic Information and Semantic-Native Communication
4.3 Interplay of AI and Semantic Communication
4.4 Conclusion
References
5 Federated Learning for Wireless Communications
5.1 Introduction
5.2 Channel Models
5.3 Federated Learning for Channel Estimation
5.4 FL For Hybrid Beamforming
5.5 Conclusions
Acknowledgment
References
6 Federated Learning in Mesh Networks
6.1 Introduction
6.2 Decentralized Federated Learning
6.3 Mesh Networks
6.4 The Intersection: Decentralized Federated Learning over Mesh Networks
6.5 Solutions
6.6 State-of-the-Art and Noteworthy Implementations
6.7 Future Directions and Open Research Challenges
6.8 Concluding Remarks
References
7 Antenna Design Using Artificial Intelligence
7.1 Introduction
7.2 Evolutionary Algorithms
7.3 Machine Learning
7.4 Knowledge Representation
7.5 Conclusion
References
8 AI-Driven Approaches for Solving Electromagnetic Inverse Problems
8.1 Introduction
8.2 Mathematical Formulation
8.3 AI-Based EM–IP Solution Strategies
8.4 Applications
8.5 Conclusions
Acknowledgments
References
Notes
9 RA-Based RIS-1 Design Using Support Vector Machines to Enhance mmWave 5G Coverage
9.1 Introduction
9.2 RIS-1 Unit-Cell Characterization Using SVR
9.3 RIS-1: Analysis and Optimization
9.4 SVR-Based Design of RIS-1 to Enhance 5G mmWave NF Coverage
9.5 Conclusions and Road Map
References
Note
10 AI at the Physical Layer for Wireless Network Security and Privacy
10.1 Introduction
10.2 Network Security and Privacy Threats and Vulnerabilities
10.3 Fundamentals of AI for Network Security and Privacy
10.4 AI-Driven Physical Layer Security Solutions
10.5 Case Study: UAV-Assisted PLS for Terrestrial Wireless Communications Networks
10.6 Practical Considerations and Challenges of Implementing AI-Based Security Solutions
10.7 Conclusions and Outlook
References
Note
Index
End User License Agreement
Chapter 1
Table 1.1 BPT framework via stochastic bandit.
Chapter 3
Table 3.1 Summary of contributions for the surveyed papers.
Table 3.2 Selected parameters for the simulator.
Table 3.3 System parameters for the training procedure.
Table 3.4 An empirical evaluation of the complexity for each prediction mode...
Chapter 7
Table 7.1 Comparative results of all algorithms for .
Table 7.2 Comparative results of all algorithms for .
Table 7.3 Radiation pattern features for the pattern.
Table 7.4 Positions and amplitudes of the best reconstructed patterns found ...
Chapter 9
Table 9.1 Relative error over the test set and model selection time of the b...
Table 9.2 Relative error over the test set and model selection time of the b...
Table 9.3 Comparison of the compliance for different ripple requirements for...
Table 9.4 Comparison of the compliance for different ripple requirements of ...
Chapter 10
Table 10.1 Identifying and assessing communication network security and priv...
Table 10.2 UAV-enabled prevention, detection, and recovery strategies agains...
Table 10.3 AI applications for UAV-assisted PLS.
Chapter 1
Figure 1.1 Main challenges of BPT in wireless communications.
Figure 1.2 (a) Typical communication scenarios – lines with arrows in part (...
Figure 1.3 Beam direction functions of city block and indoor environments. (...
Figure 1.4 Network structure tailored for beam prediction oriented GP learni...
Figure 1.5 A visualization of SP inference that incorporates the nonparametr...
Figure 1.6 The actions are defined via beam subsets, which may overlap.
Figure 1.7 The principle of the interactive learning design framework.
Figure 1.8 The principle of hybrid data-induced Kalman filtering approach.
Figure 1.9 An illustration of beam direction trajectories of different vehic...
Figure 1.10 The principle of novel data transmission and beam sounding schem...
Figure 1.11 The comparison between the classical beam sounding scheme and th...
Figure 1.12 MUs move within an annulus and the BS is located at the center...
Figure 1.13 The AEASR performance of ESBT (averaged over 300 realizations). ...
Figure 1.14 The AEASR performance of different BPT algorithms – and beam s...
Figure 1.15 The AEASR performance of ESBT and ExSeBT for the two beam switch...
Figure 1.16 The road condition of a typical flyover chosen from the real env...
Figure 1.17 The average EAR performance of different BPT algorithms: Road 1 ...
Figure 1.18 The PSA performance of different BPT algorithms: and Road 1.
Figure 1.19 The PSA performance varying with the beam sounding period for HD...
Figure 1.20 The average accumulative run‐time of different BPT algorithms:
Figure 1.21 The average EAR performance achieved by HDIKF–TS1 models trained...
Chapter 2
Figure 2.1 Trellis diagram of a sequence of five observations, where the hyp...
Figure 2.2 Two toy examples of binary symbol detection where the channel has...
Figure 2.3 Structure of the DNN likelihood estimation.
Figure 2.4 Symbol error rate as a function of the SNR in a channel with inte...
Figure 2.5 Time domain visualization of the modulated signals, starting from...
Figure 2.6 CNN architecture showing the convolution and pooling operations f...
Figure 2.7 Visualization of pooling operation using convolutional kernels.
Figure 2.8 (a) Training and validation losses and (b) training and validatio...
Figure 2.9 Confusion matrix is computed on the test set for the base CNN mod...
Figure 2.10 (a) Loss as a function of epochs and the number of convolution l...
Figure 2.11 Confusion matrix is computed on the test set for the base CNN mo...
Figure 2.12 Music spectrum generated for an eight element array with two sig...
Figure 2.13 Architecture of CNN detector.
Figure 2.14 (a) Mean accuracy as a function of an increasing SINR for the ex...
Figure 2.15 Confusion matrix showing the predictions of CNN detector in an e...
Figure 2.16 Architecture of a ResNet block demonstrating residual learning....
Figure 2.17 Accuracy vs. SINR for the experiment involving 0–5 correlated so...
Chapter 3
Figure 3.1 An illustration of how the propagation environment generates the ...
Figure 3.2 Architecture of the channel predictive feedback system to the ada...
Figure 3.3 A visualization of the mapping from input to output in a channel ...
Figure 3.4 Architecture of an MLP, from input to output.
Figure 3.5 Calculation of the elements for each feature map at layer , usin...
Figure 3.6 Architecture of a complete CNN. In this particular illustrative e...
Figure 3.7 An illustration of how the LSTM is designed. The small rectangles...
Figure 3.8 An illustration of how the LSTM is designed. The small rectangles...
Figure 3.9 A flowchart of the transformer model.
Figure 3.10 A sequence of the simulated channel in the complex plane for the...
Figure 3.11 An example of how the abrupt changes of the noise variance were ...
Figure 3.12 The MSE of the GRU predictor as a function of the previous numbe...
Figure 3.13 The prediction error of the simulated noise-free channel, as a f...
Figure 3.14 Prediction error of the channel is distorted by additive Gaussia...
Figure 3.15 An example of how the predicted channel relates to the true simu...
Figure 3.16 The performance of the predictive models was measured by the MSE...
Chapter 4
Figure 4.1 Semantic communication paradigm.
Figure 4.2 Application of semantic communication.
Figure 4.3 An illustrating schematic of a semantic source and its loss compr...
Figure 4.4 Example of common and emergent knowledge for a single entity.
Figure 4.5 Example of semantics on simplicial complex.
Figure 4.6 Example of machine-learning-powered semantic communication for re...
Figure 4.7 Example of semantic-native collective intelligence.
Chapter 5
Figure 5.1 The training and testing phases for (a) centralized learning (CL)...
Figure 5.2 The outline of the chapter.
Figure 5.3 For a single user (, ) situated in the far-field (left) and t...
Figure 5.4 (a) Training data collection and (b) channel estimation with the ...
Figure 5.5 Validation RMSE (a) and channel estimation NMSE (b) with respect ...
Figure 5.6 Validation RMSE (a) and channel estimation NMSE (b) with respect ...
Figure 5.7 Channel estimation NMSE for different algorithms in massive MIMO ...
Figure 5.8 RIS-assisted mmWave massive MIMO scenario.
Figure 5.9 Validation RMSE (a) and channel estimation NMSE (b) with respect ...
Figure 5.10 THz wideband channel estimation NMSE vs. SNR. , GHz, and ...
Figure 5.11 FL-based beamforming: Validation accuracy with respect to .
Figure 5.12 Spectral efficiency vs. .
Chapter 6
Figure 6.1 Outline of this chapter.
Figure 6.2 Traditional federated learning and decentralized federated learni...
Figure 6.3 A diagrammatic representation of a mesh network.
Figure 6.4 A diagrammatic representation of DFL over mesh networks.
Chapter 7
Figure 7.1 A classification of AI techniques in the antennas domain.
Figure 7.2 A population of 15 frogs distributed in 3 memeplexes.
Figure 7.3 Symmetrically placed linear array geometry.
Figure 7.4 Synthesis of a -element pattern with SLL dB with . Boxplot of...
Figure 7.5 Synthesis of a -element pattern with SLL dB with . Boxplot of...
Figure 7.6 Synthesis of a -element pattern with SLL dB with . Convergenc...
Figure 7.7 Synthesis of a -element pattern with SLL dB with . Convergenc...
Figure 7.8 Synthesis of a -element pattern with SLL dB with and . Radi...
Figure 7.9 Use of an ML algorithm in antenna synthesis.
Figure 7.10 Geometry of the proposed modified E-shaped patch antenna.
Figure 7.11 Statistical distribution of the difference between estimated and...
Figure 7.12 3D radiation patterns of the best obtained result at (a) GHz, ...
Figure 7.13 Surface current distribution of the best obtained result at (a)
Figure 7.14 Reflection coefficient ( magnitude) versus frequency of the bes...
Chapter 8
Figure 8.1 Classification of EM imaging types/steps.
Figure 8.2 Geometry of the 3D MI problem.
Figure 8.3 Schematic representation of the training set generation exploitin...
Figure 8.4 Block scheme of the SbD-driven solution framework for EM imaging ...
Figure 8.5 Possible LBE-based strategies for brain stroke imaging exploiting...
Chapter 9
Figure 9.1 Schematic representation of a printed RA.
Figure 9.2 Application scenarios for RA-based RIS illuminated from a nearby ...
Figure 9.3 RA cells used for passive RIS: (a) rectangular patches.
Figure 9.4 Example of an RA unit cell based on parallel and coplanar dipoles...
Figure 9.5 Black-box system model of SVR applied on an RA unit cell.
Figure 9.6 1D geometrical interpretation of SVR parameter and its relation...
Figure 9.7 Illustration of the -insensitive loss function and the correspon...
Figure 9.8 Relative error (dB) in estimating for an incident angle over ...
Figure 9.9 Training time in logarithmic units for each point of the grid s...
Figure 9.10 Relative error (dB) in estimating for oblique incidence () ov...
Figure 9.11 (a) Sketch of a RIS used in an outdoor scenario to enhance cover...
Figure 9.12 Element of the RIS modeled as a four-port network.
Figure 9.13 Scheme of an arbitrary aperture.
Figure 9.14 Reference system to compute the contribution of a single subaper...
Figure 9.15 Flowchart of the classical intersection approach.
Figure 9.16 Example of a RIS element based on a patch topology considering t...
Figure 9.17 Sketch of the generalized intersection approach integrating the ...
Figure 9.18 Projection of the field (set ) onto the set of valid radiated...
Figure 9.19 Sketch of the proposed deployment scenario: (a) azimuth and (b) ...
Figure 9.20 (a) Upper view showing the periodic structure of the RA-based RI...
Figure 9.21 Simulated phase and amplitude of the reflection coefficient at s...
Figure 9.22 Evolution of the model selection time, (s), and of the relativ...
Figure 9.23 Comparison of the magnitude (dB) and phase (°) of and over 3...
Figure 9.24 Comparison of the magnitude (dB) and phase (°) of and over 3...
Figure 9.25 Comparison of the magnitude (dB) and phase (°) of and over 3...
Figure 9.26 Phase-shift distribution on the RIS surface obtained with the an...
Figure 9.27 Near-field pattern (dB) (normalized to its maximum) radiated by ...
Figure 9.28 Phase-shift distribution on the RIS surface obtained after the m...
Figure 9.29 Near-field pattern (dB) (normalized to its maximum) radiated by ...
Figure 9.30 Comparison of main cuts of the NF pattern at in (a) azimuth (b...
Figure 9.31 Length distributions (in mm) obtained after the design process o...
Figure 9.32 Comparison of main cuts of the NF pattern at in (a) azimuth (b...
Figure 9.33 ML-based RIS development road map.
Chapter 10
Figure 10.1 The AI algorithms classifications with their network security an...
Figure 10.2 System model for UAV-assisted PLS simulation scenario.
Figure 10.3 The proposed DQN block diagram.
Figure 10.4 DQL algorithm for optimizing DL power for UAV-assisted PLS.
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854
IEEE Press Editorial Board
Sarah Spurgeon, Editor in Chief
Moeness Amin
Jón Atli Benediktsson
Adam Drobot
James Duncan
Ekram Hossain
Brian Johnson
Hai Li
James Lyke
Joydeep Mitra
Desineni Subbaram Naidu
Tony Q. S. Quek
Behzad Razavi
Thomas Robertazzi
Diomidis Spinellis
Edited by
Mohammad A. Matin
Department of Electrical and Computer Engineering
North South University
Bangladesh
Sotirios K. Goudos
Department of Physics
Aristotle University of Thessaloniki
Greece and the Director of the ELEDIA@AUTH lab member of the ELEDIA
Research Center Network
George K. Karagiannidis
Department of Electrical and Computer Engineering
Aristotle University of Thessaloniki
Greece and the Head of the Wireless Communications and Information
Processing (WCIP) Group
Copyright © 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data Applied for:
Hardback ISBN: 9781394227921
Cover Design: Wiley
Cover Image: © zf L/Getty Images
Mohammad A. Matin dedicates the book to his wife Momtaz Begum who inspires him and their sons Zabeer Ahmed and Zawad Ahmed.
Sotirios K. Goudos dedicates the book to his wife Athena who inspires him and their daughters Mary and Mandy.
George K. Karagiannidis dedicates the book to Kostis.
Mohammad A. Matin (Senior Member, IEEE) is a professor in the Department of Electrical and Computer Engineering at North South University (NSU). He has published over 150 peer-reviewed journal and conference papers. He is the author/editor of 18 academic books and 22 book chapters. He has received several prizes and scholarships, including the Best Student Prize (Loughborough University), the Commonwealth Scholarship, and the Overseas Research Scholarship (ORS) conferred by the Committee of Vice Chancellors and Principals (CVCP) in the United Kingdom. He currently serves on the editorial boards of several international publications, including IEEE Communications Magazine and IET Wireless Sensor Systems.
Sotirios K. Goudos (Senior Member, IEEE) is a professor in the Department of Physics at the Aristotle University of Thessaloniki and the director of the ELEDIA@AUTH lab, a member of the ELEDIA Research Center Network. He has published and presented more than 350 technical papers in scientific journals and international conferences. He is the founding editor-in-chief of the Telecom open-access journal (MDPI publishing). He is currently serving as associate editor for IEEE Transactions on Antennas and Propagation, IEEE Access, and IEEE Open Journal of the Communications Society. He was honored as an IEEE Access Outstanding Associate Editor for 2019, 2020, 2021, 2022, and 2023.
George K. Karagiannidis (Fellow, IEEE) is currently a professor in the Department of Electrical and Computer Engineering of Aristotle University of Thessaloniki, Greece, and head of the Wireless Communications & Information Processing (WCIP) Group. He has published and presented more than 700 technical papers in scientific journals and international conferences. He is a past editor of several IEEE journals, and from 2012 to 2015, he was the editor-in-chief of IEEE Communications Letters. From September 2018 to June 2022, he served as associate editor-in-chief of the IEEE Open Journal of the Communications Society. Currently, he is the editor-in-chief of IEEE Transactions on Communications. He received three prestigious awards: the 2021 IEEE ComSoc RCC Technical Recognition Award, the 2018 IEEE ComSoc SPCE Technical Recognition Award, and the 2022 Humboldt Research Award from the Alexander von Humboldt Foundation. He is one of the highly cited authors across all areas of electrical engineering, recognized by Clarivate Analytics as a Highly Cited Researcher for nine consecutive years, 2015–2023.
Aly S. Abdalla
Department of Electrical and Computer Engineering
Mississippi State University
Mississippi State, MS
USA
Manuel Arrebola
Department of Electrical Engineering
Group of Signal Theory and Communications
Universidad de Oviedo
Gijon, Asturias
Spain
Mehdi Bennis
Centre for Wireless Communications
University of Oulu
Oulu
Finland
Yuanzhu Chen
School of Computing
Queen’s University
Kingston, Ontario
Canada
Christos Christodoulou
Department of Electrical and Computer Engineering
University of New Mexico
Albuquerque, New Mexico
USA
Merouane Debbah
KU 6G Research Center
Khalifa University of Science and Technology
Abu Dhabi
United Arab Emirates
Octavia A. Dobre
Faculty of Engineering and Applied Science
Memorial University, St. John’s
Newfoundland and Labrador
Canada
Ahmet M. Elbir
Department of Electrical and Electronics Engineering
Istinye University
Istanbul
Turkey
Carlo Fischione
School of Electrical Engineering and Computer Science
KTH Royal Institute of Technology
Stockholm
Sweden
Gábor Fodor
School of Electrical Engineering and Computer Science
KTH Royal Institute of Technology
Stockholm
Sweden
and
Radio Network Algorithms
Ericsson Research, Ericsson AB
Kista
Sweden
Sotirios K. Goudos
Department of Physics
Aristotle University of Thessaloniki
Greece
and
Director of the ELEDIA@AUTH
lab member of the ELEDIA Research Center Network
Arjun Gupta
Department of Electrical and Computer Engineering
University of New Mexico
Albuquerque, New Mexico
USA
Yongming Huang
National Mobile Communications Research Laboratory
Southeast University
Nanjing
China
George K. Karagiannidis
Department of Electrical and Computer Engineering
Aristotle University of Thessaloniki
Greece
and
Head of the Wireless Communications and Information Processing (WCIP) Group
Maokun Li
Department of Electronic Engineering
Beijing National Research Center for Information Science and Technology (BNRist)
Tsinghua University
Beijing
China
and
ELEDIA Research Center (ELEDIA@TSINGHUA – Tsinghua University)
Beijing
China
Jesús A. López-Fernández
Department of Electrical Engineering
Group of Signal Theory and Communications
Universidad de Oviedo
Gijon, Asturias
Spain
Vuk Marojevic
Department of Electrical and Computer Engineering
Mississippi State University
Mississippi State, MS
USA
Eduardo Martinez-de-Rioja
Department of Signal Theory and Communications and Telematic Systems and Computing
Universidad Rey Juan Carlos
Fuenlabrada, Madrid
Spain
Manel Martínez-Ramón
Department of Electrical and Computer Engineering
University of New Mexico
Albuquerque, New Mexico
USA
Christos Masouros
Department of Electronic & Electrical Engineering
University College London
London
UK
Andrea Massa
ELEDIA Research Center (ELEDIA@UniTN – University of Trento)
DICAM – Department of Civil, Environmental, and Mechanical Engineering
Trento
Italy
and
ELEDIA Research Center (ELEDIA@TSINGHUA – Tsinghua University)
Beijing
China
and
ELEDIA Research Center (ELEDIA@UESTC – UESTC)
School of Electronic Science and Engineering
Chengdu
China
and
School of Electrical Engineering
Tel Aviv University
Tel Aviv
Israel
Mohammad A. Matin
Department of Electrical and Computer Engineering
North South University
Dhaka
Bangladesh
Marco Salucci
ELEDIA Research Center (ELEDIA@UniTN – University of Trento), DICAM – Department of Civil, Environmental, and Mechanical Engineering
Trento
Italy
Wei Shi
School of Information Technology
Carleton University
Ottawa, Ontario
Canada
Oscar Stenhammar
School of Electrical Engineering and Computer Science
KTH Royal Institute of Technology
Stockholm
Sweden
and
Radio Network Algorithms
Ericsson Research, Ericsson AB
Kista
Sweden
Bo Tang
Department of Electrical and Computer Engineering
Worcester Polytechnic Institute
Worcester, MA
USA
Álvaro F. Vaquero
Department of Mathematics
Universidad de Oviedo
Oviedo, Asturias
Spain
Jayakrishnan Vijayamohanan
Department of Electrical and Computer Engineering
University of New Mexico
Albuquerque, New Mexico
USA
Xu Wang
School of Computing
Queen’s University
Kingston, Ontario
Canada
Jianjun Zhang
College of Computer Science and Technology
Nanjing University of Aeronautics and Astronautics
Nanjing
China
Qiyang Zhao
Technology Innovation Institute
Abu Dhabi
United Arab Emirates
Hang Zou
Technology Innovation Institute
Abu Dhabi
United Arab Emirates
We express our sincere thanks to the authors of different chapters, whose invaluable contributions helped us in completing this book. The editors are grateful to numerous reviewers for their helpful recommendations and comments during the review process. We are also grateful to the Wiley team, especially Sandra Grayson, Becky Cowan, Kavipriya Ramachandran, Sathishwaran Pathmanabhan, and Kavin Shanmughasundaram, for their kind assistance during this endeavor.
Best regards,
Mohammad A. Matin
North South University, Bangladesh
Sotirios K. Goudos
Aristotle University of Thessaloniki, Greece
George K. Karagiannidis
Aristotle University of Thessaloniki, Greece
Christos Masouros¹, Jianjun Zhang², and Yongming Huang³
¹ Department of Electronic & Electrical Engineering, University College London, London, UK
² College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China
³ National Mobile Communications Research Laboratory, Southeast University, Nanjing, China
Introduction
Challenge of Beam Prediction Modeling in Wireless Communications
Prior Identification – Perspective of Function Space
Methodology from Stochastic Process
Stochastic Continuity – Beam Index Difference
Stochastic Smoothness – Hybrid Data-induced Kalman Filtering
Beam Width Optimization
Numerical Results
Conclusion
Because the abundant spectrum resources in high-frequency bands enable ultrahigh-speed data transmission (DT), high-frequency communications, e.g., millimeter-wave or even terahertz communications, have attracted extensive interest from academia, industry, and government [1]. For high-frequency communications, the transmitter and/or receiver are often equipped with large-scale antenna arrays, i.e., massive multiple-input multiple-output (MIMO), to achieve high array gains that compensate for the signal attenuation in high-frequency bands. However, the use of pencil-like, highly directional beams makes channel state information (CSI) acquisition via beam alignment (BA) very challenging. First, acquiring CSI in a mobile network is particularly difficult since the wireless channel often varies rapidly. Second, in contrast to the fully digital transceiver, where a pilot transmission scheme can be used to acquire CSI [2, 3], channel estimation is more complicated in the hybrid antenna array architecture, which is widely used in practice, because the actual received signals on all antennas cannot be extracted simultaneously. Last but not least, the large dimension of massive MIMO inevitably results in a large and even unaffordable pilot overhead, even if the pilot transmission-based method can be used.
To tackle this challenging issue, the two-step precoding- and combining-based scheme is widely used in practical systems, e.g., standardized in IEEE 802.11ad/802.15.3c [4, 5]. Let $\mathbf{F}$, $\mathbf{H}$, and $\mathbf{W}$ represent the precoding matrix, channel matrix, and combining matrix, respectively. The precoding (combining) matrix is assumed to be decomposed as $\mathbf{F} = \mathbf{F}_{\mathrm{RF}} \mathbf{F}_{\mathrm{BB}}$ ($\mathbf{W} = \mathbf{W}_{\mathrm{RF}} \mathbf{W}_{\mathrm{BB}}$), where $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{F}_{\mathrm{BB}}$ ($\mathbf{W}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{BB}}$) denote the analog and digital parts, respectively. To reduce the dimension of the CSI estimation problem, beam training is first performed between the transmitter and receiver to obtain the analog precoder $\mathbf{F}_{\mathrm{RF}}$ and combiner $\mathbf{W}_{\mathrm{RF}}$. Then, the effective or equivalent channel $\mathbf{H}_{\mathrm{eff}} = \mathbf{W}_{\mathrm{RF}}^{H} \mathbf{H} \mathbf{F}_{\mathrm{RF}}$ can be estimated, based on which the digital parts, i.e., the precoder $\mathbf{F}_{\mathrm{BB}}$ and combiner $\mathbf{W}_{\mathrm{BB}}$, can be designed in the digital domain via a variety of methods, e.g., heuristic methods or optimization-based algorithms [6–8]. Note that since the size of the effective channel matrix $\mathbf{H}_{\mathrm{eff}}$ is much smaller than that of the original channel matrix $\mathbf{H}$, the pilot overhead in the second step is relatively low. The remaining difficulty therefore lies in designing an efficient beam training scheme to find the optimal $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}$.¹
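To make the dimension reduction concrete, the following worked example uses an assumed notation: $N_t$ and $N_r$ are the numbers of transmit and receive antennas, $N_t^{\mathrm{RF}}$ and $N_r^{\mathrm{RF}}$ are the numbers of radio-frequency chains, and $\mathbf{F}_{\mathrm{RF}}$ and $\mathbf{W}_{\mathrm{RF}}$ are the analog precoder and combiner. The antenna and RF-chain counts are illustrative values, not figures taken from the chapter.

```latex
\begin{align*}
\mathbf{H} &\in \mathbb{C}^{N_r \times N_t}, \quad
\mathbf{F}_{\mathrm{RF}} \in \mathbb{C}^{N_t \times N_t^{\mathrm{RF}}}, \quad
\mathbf{W}_{\mathrm{RF}} \in \mathbb{C}^{N_r \times N_r^{\mathrm{RF}}}, \\
\mathbf{H}_{\mathrm{eff}} &= \mathbf{W}_{\mathrm{RF}}^{H}\,\mathbf{H}\,\mathbf{F}_{\mathrm{RF}}
\in \mathbb{C}^{N_r^{\mathrm{RF}} \times N_t^{\mathrm{RF}}}, \\
N_t &= 256,\; N_r = 64,\; N_t^{\mathrm{RF}} = N_r^{\mathrm{RF}} = 4
\;\Longrightarrow\;
N_r N_t = 16384 \text{ entries in } \mathbf{H}
\quad\text{vs.}\quad
N_r^{\mathrm{RF}} N_t^{\mathrm{RF}} = 16 \text{ entries in } \mathbf{H}_{\mathrm{eff}}.
\end{align*}
```

Under these assumed sizes, the second step estimates a matrix with roughly a thousand times fewer entries, which is why its pilot overhead is comparatively low.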
Initially, beam training (also referred to as beam sounding) was implemented via exhaustive and hierarchical search [5, 9, 10]. Compared to the exhaustive search, whose sounding overhead is $O(N)$ with $N$ denoting the size of the training codebook, the sounding overhead of the hierarchical search is $O(\log N)$ for the typical binary tree search-based implementation, which is much smaller. For this reason, along with the advantage of easy implementation, the hierarchical search-based scheme has been adopted in several IEEE standards, such as IEEE 802.15.3c and IEEE 802.11aj. Note that the performance of hierarchical search-based algorithms heavily depends on the codebook used. In fact, besides the demand for multiresolution, namely, various widths of main lobes, other properties, such as a flat main lobe and side lobes, a narrow transition band, and high power efficiency of the power amplifier, are also very important and should be well addressed [9, 11]. In general, the research on hierarchical search often boils down to sounding codebook design [5, 9–14].
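The difference in sounding overhead between the two search strategies can be illustrated with a minimal Python sketch. It assumes an idealized, noise-free gain model in which a beam covering a contiguous range of narrow-beam indices returns gain 1 if the optimal beam lies inside the range and 0 otherwise; the helper names (`make_ideal_gain`, `exhaustive_search`, `hierarchical_search`) are hypothetical and do not correspond to any standardized procedure.

```python
def make_ideal_gain(true_beam):
    """Hypothetical noise-free gain model: 1.0 if the sector [lo, hi) covers the optimal beam."""
    def gain(lo, hi):
        return 1.0 if lo <= true_beam < hi else 0.0
    return gain

def exhaustive_search(gain, n_beams):
    """Sound every narrow beam once and keep the strongest one: n_beams soundings."""
    best = max(range(n_beams), key=lambda b: gain(b, b + 1))
    return best, n_beams

def hierarchical_search(gain, n_beams):
    """Binary-tree search: sound both halves of the current sector and descend into the stronger."""
    lo, hi, soundings = 0, n_beams, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left, right = gain(lo, mid), gain(mid, hi)
        soundings += 2  # two wide beams are sounded per tree level
        if left >= right:
            hi = mid
        else:
            lo = mid
    return lo, soundings

n_beams = 64
gain = make_ideal_gain(true_beam=37)
print(exhaustive_search(gain, n_beams))    # (37, 64)
print(hierarchical_search(gain, n_beams))  # (37, 12)
```

With 64 narrow beams, the binary tree has 6 levels and two soundings per level, so the hierarchical search needs 12 soundings instead of 64. In practice the wide beams are imperfect and noisy, which is one reason the codebook design matters so much.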
The advantage of the exhaustive or hierarchical search-based methods is that they can be applied to an arbitrary scenario because they are nonadaptive methods and thus independent of external environments. However, the beam sounding overhead is almost always very large, especially for a large-scale antenna array and/or a rapidly changing environment. On the one hand, as the scale of the antenna array increases, the beam width decreases accordingly, which increases the sounding overhead. On the other hand, the coherence time becomes shorter in a rapidly fluctuating environment. Hence, much of the precious time resource is spent on beam sounding, and the proportion of time resources used for DT is very small. This phenomenon is particularly pronounced in highly varying communication scenarios, e.g., unmanned aerial vehicle (UAV) communication.
To avoid frequent searches, beam tracking is invoked to reduce the sounding overhead. The complete BA operation over a relatively long time consists of two phases. In the first phase, initial BA is performed to find the optimal beam or beam pair via the exhaustive or hierarchical search, which involves a large beam sounding overhead, as mentioned before. In the second phase, the beam tracking technique is invoked to enable efficient search. Compared to the initial BA, the number of beams used for tracking is relatively small, e.g., possibly only a single beam is used for sounding. Note that if the tracking fails, which is inevitable, the initial BA operation is invoked again to reinitialize the beam tracking.
The key to beam tracking is beam prediction, i.e., to predict a beam subspace that contains the real beam. In practice, two types of metrics are closely related to beam prediction. The first one is the success rate and prediction efficiency, i.e., the beam subspace predicted should contain the real optimal beam, and meanwhile, the beam subspace should be as small as possible. The second one is the complexities of beam prediction, including both sample complexity and inference complexity. To balance these indicators, various methods have been proposed, the core of which is to exploit temporal and spatial correlations of wireless channels. The most important step toward beam prediction is to construct an appropriate prediction model. Overall, there are mainly two ways to construct a prediction model, i.e., the traditional manual fashion and the recent automatic fashion. The classical and representative manual method to construct a prediction model is the Kalman filtering- or Bayesian filtering-based Beam prediction and tracking (BPT) algorithms [15–22]. Machine learning (ML) methods are used to automatically construct prediction models, typically, in the data-driven manner [23–29].
The Kalman filtering and Bayesian filtering methods address the issue of prediction model construction by building a dynamical model that characterizes the underlying physical system. Specifically, two stochastic differential equations (SDEs), referred to as the state-space and measurement equations in the literature, are first established. Once the two SDEs are available, the well-known Kalman filter or Bayesian filter can be invoked to perform real-time inference or prediction. For example, both extended Kalman filter- and Bayesian filter-based beam tracking algorithms are proposed in Liu et al. [15] and Yuan et al. [18] for dual-functional radar and communication systems. For the distributed millimeter-wave massive MIMO problem, a monopulse beam tracking method based on the unscented Kalman filter is designed in [21], which is shown to achieve good robustness as well as generalization ability.
An important and appealing advantage of the Kalman filtering-based methods is that they have low computational complexity. In particular, the computational complexity of the Kalman filter scales linearly, i.e., as $O(n)$ (where $n$ is the number of samples), as opposed to the cubic scaling, $O(n^3)$, of Gaussian process (GP) regression-based BPT algorithms. Note that this advantage is attributed to the fact that the underlying state-space system model is exploited. However, since the prediction model is obtained via manual derivation, it may fail in complicated scenarios or environments. To tackle this issue, a novel hybrid model- and data-driven approach, referred to as hybrid data-induced Kalman filtering (HDIKF), has recently been proposed by Zhang et al. [30, 31].
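The linear scaling can be seen from the recursive structure of the filter: each new sample triggers one constant-cost predict/update step. The sketch below is a scalar Kalman filter under an assumed random-walk state model for the beam direction; the model, the noise variances `q` and `r`, and the function name are illustrative assumptions, not the state-space models used in [15–22].

```python
def kalman_track(measurements, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state (illustrative assumption).

    State model:       x_k = x_{k-1} + w_k,  w_k ~ N(0, q)
    Measurement model: z_k = x_k + v_k,      v_k ~ N(0, r)
    Each sample costs O(1), so n samples cost O(n) in total.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the mean is unchanged, the variance grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Usage: a constant true direction of 1.0 observed without noise.
estimates = kalman_track([1.0] * 50)
print(estimates[0], estimates[-1])
```

The first estimate already moves most of the way toward the measurement because the initial variance `p0` is large relative to `r`; later updates use a smaller, steady-state gain.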
In contrast to the Kalman filtering-based designs, ML typically addresses the issue of prediction modeling by employing the data-driven mode. In fact, it is well known that a powerful ability of ML is that it can automatically extract meaningful patterns and further derive directly an appropriate model from the observed data. According to the underlying ML theory and methods, the ML-based beam prediction methods fall into two categories, i.e., the (deep) reinforcement learning (RL)-based algorithms [26–29, 32, 33] and supervised learning (SL)-based algorithms [13, 23–25, 34–38].
The RL-based BPT solutions further fall into two subcategories: state-free RL-based solutions [26–29] and RL-based solutions incorporating state design [32, 33]. The state-free algorithms, which are mainly implemented via stochastic bandits, cannot characterize complex environment information and may therefore fail to exploit useful spatial or other channel correlations. In contrast, the RL-based solutions that characterize the channel correlation, variation, and other information via dedicated state designs, when formulating the BPT problem as an RL counterpart, can achieve better performance in complicated scenarios. Compared to other ML-based methods, a salient advantage of the RL-based methods is that they can collect training samples by interacting with the environment, which facilitates an online implementation [33]. However, since the mathematical foundation of RL is the Markov decision process, these methods converge slowly and may fail to achieve good performance in the short term.
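As a rough illustration of the state-free approach, the sketch below treats each beam as an arm of a stochastic bandit and selects beams with the standard UCB1 rule. The Bernoulli reward model, the success probabilities, and all names are assumptions made for this example; they are not taken from [26–29].

```python
import math
import random


def ucb1_beam_selection(reward_fn, num_beams, num_rounds, seed=0):
    """UCB1 over a fixed set of beams; assumes num_rounds >= num_beams.

    reward_fn(beam, rng) is a hypothetical callback returning a reward in
    [0, 1]. Returns per-beam pull counts and empirical mean rewards.
    """
    rng = random.Random(seed)
    counts = [0] * num_beams
    means = [0.0] * num_beams
    for t in range(1, num_rounds + 1):
        if t <= num_beams:
            beam = t - 1  # sound every beam once to initialize
        else:
            beam = max(
                range(num_beams),
                key=lambda b: means[b] + math.sqrt(2.0 * math.log(t) / counts[b]),
            )
        reward = reward_fn(beam, rng)
        counts[beam] += 1
        means[beam] += (reward - means[beam]) / counts[beam]  # running mean
    return counts, means


# Toy environment (assumption): beam 2 succeeds most often.
SUCCESS_PROB = [0.2, 0.5, 0.9, 0.4]

def bernoulli_reward(beam, rng):
    return 1.0 if rng.random() < SUCCESS_PROB[beam] else 0.0

counts, means = ucb1_beam_selection(bernoulli_reward, len(SUCCESS_PROB), 2000)
print(counts)
```

Note that the rule keeps no state about the environment: adjacent beams are treated as unrelated arms, which is exactly why spatial correlation goes unexploited.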
Another major category of ML-based beam prediction solutions is the SL-based algorithms [13, 23–25, 34–38]. The SL-based BPT solutions circumvent the difficulty of manually building models, thanks to the data-driven design paradigm. However, one drawback of the SL-based solutions is that a very large number of training samples is often required to achieve satisfactory performance. Moreover, rapidly fluctuating wireless environments can also invalidate the deterministic prediction models established. Among the SL-based algorithms, GP regression-based algorithms have attracted considerable attention [35–38], thanks to the advantages of nonparametric modeling and uncertainty calibration. Unfortunately, their computational complexity typically scales as $O(n^3)$, where $n$ is the number of samples. Since the wireless channels vary quickly and $n$ can be very large, this high complexity becomes an obstacle to real-time applications.
We have already briefly outlined the main BPT techniques and methods, from the conventional nonlearning solutions to the state-of-the-art learning-based methods. We have also briefly analyzed the pros and cons of different BPT methods. In the rest of this chapter, we will first highlight the difference between generic ML applications and prediction modeling in wireless communications. Then, from the perspective of the modern stochastic process (SP), we will propose a novel design methodology whose core is to identify and exploit the analytic properties (e.g., stochastic continuity and stochastic smoothness) of the sample functions of the underlying beam process. Based on the methodology, we further design several efficient BPT solutions, all of which enjoy good small sample performance, low computational complexity and robustness to environment variation. To further enhance system performance, we present an important technique closely related to BPT, namely, beam width optimization (BWO) and propose several efficient ML-based algorithms. Finally, the simulation results are provided to evaluate different BPT techniques.
Although a tremendous number of ML prediction models and algorithms have been developed in the past two decades in various fields, such as computer vision, natural language processing, and machine translation, this does not mean that the problem of beam prediction modeling can be easily addressed. In fact, because of the characteristics of wireless communications, the practicable ML models and algorithms are very limited, and this problem is far from being resolved. Next, we highlight and analyze the differences and challenges of beam prediction modeling in wireless communications by comparing it with typical ML applications or tasks. As illustrated in Figure 1.1, the main challenges are abbreviated as USMC, i.e., various types of uncertainties, small-sample performance, prediction modeling, and training/inference complexities.
Figure 1.1 Main challenges of BPT in wireless communications.
Uncertainty (U): Roughly speaking, the uncertainties in beam prediction stem from epistemic uncertainty and aleatoric uncertainty. On the one hand, both the formation and variation mechanisms of beam direction are very complex, which results in epistemic uncertainty. On the other hand, due to the rapid fluctuation of wireless channel environments, the available training samples are very limited, which further increases the epistemic uncertainty. The aleatoric uncertainty is derived from the natural randomness inherent in a beam prediction task. In practice, the training samples inevitably contain noise, which thus requires us to build a prediction model based on the noisy samples. For example, because the resolution of an antenna array is limited, it is impossible to obtain an accurate beam direction. Similarly, when we make predictions in the phase of inference, we also need to consider the uncertainty.
Small-sample (S): In contrast to many ML applications, e.g., image recognition, where an enormous number of training images are available with the help of the Internet, it is often expensive to collect training samples for a wireless communication ML task. As a result, only a few public datasets are available to researchers. What is harder is that the wireless communication system is time-varying, which implies that the collected training samples and the trained prediction model can easily become outdated. This is different from a typical image recognition task. Therefore, small-sample performance is highly desirable and important in wireless communications, which excludes many powerful but data-hungry prediction models and algorithms.
Modeling (M): The modeling challenge is three-fold. First, as mentioned earlier, the time-varying characteristic of wireless communications makes the ML model obtained outdated easily. Second, the time-varying feature also increases the difficulty of model training. Besides the challenge of sample collection, the task of training a new model has to be completed in a very short time, which requires more computing resources. Third, modeling in wireless communications must take into account the issue of scalability. In fact, scalability is a fundamental (and also essential and important) feature of wireless communications, which, however, does not exist in most other ML applications. Specifically, due to various factors and/or demands, e.g., the design goal of maximizing system transmission rate, the need of meeting the quality of service of some users or just caused by a system scheduling algorithm, the prediction model setting, e.g., the dimension of input or output layer of a prediction model, is time-varying, which greatly increases the challenge of ML modeling.
Complexity (C): As a data-driven paradigm, building ML models often requires huge computing and storage resources, such as graphics processing units (GPUs). However, in contrast to classical ML applications, where abundant computing and storage resources are available, most mobile communication systems are resource-constrained owing to power supply, device cost, etc. Moreover, because of the time-varying feature, the prediction model has to be updated at a high frequency, and thus, the real-time requirement is much higher than in other ML tasks. All these factors determine that the complexities of model learning and online inference, in terms of both sample complexity and computational complexity, should be sufficiently low, which may conflict with deep learning models having many trainable parameters. Accounting for uncertainty makes this even harder.
In a nutshell, due to the above reasons, it is nontrivial to design efficient prediction models and training and inference algorithms for a wireless communication system. We have to take into account at least a part of the above factors. Next, we address and alleviate the issues by identifying and exploiting an important type of nonparametric priors existing extensively in many practical communication systems.
To address these issues, first and foremost, we shall tackle the small-sample challenge, since the other challenges can be greatly alleviated as the number of required samples decreases. Note that the training samples critically depend on environments, and even a small number of them can be very expensive to obtain. Therefore, it is of great importance to identify and exploit the priors from both the communication system and the external environment. To this end, an effective and systematic approach is proposed to identify and exploit the priors. In this section, we elaborate on the theoretical part of this approach [39]. Later, specific examples are presented to show how to use this approach to design efficient algorithms.
The existing BPT solutions mainly exploit the probabilistic properties of beam directions or beam direction variations. For example, the RL-based algorithms essentially exploit the transition probability of beam directions. However, the beam directions or beam direction variations also have rich and useful analytic properties, which unfortunately have been ignored in the past BPT designs. If they can be well identified and exploited, the desired small-sample performance can be achieved.
To identify and exploit these analytic properties, we need to inspect the beam directions and beam direction variations from the perspective of function space. Specifically, we first adopt a dynamic view by considering the discrete beam directions as the sampling points of a continuous-time or continuous-space function, which is referred to as the beam direction function. Second, we analyze, characterize, and model the beam direction function. Note that the beam direction function is essentially random and can be regarded as an outcome sampled from a set of functions, i.e., a function space. For a practical scenario, due to various factors, e.g., specific deployments or surroundings of base stations or mobile users, the function space is endowed with a probability structure that characterizes the randomness. Finally, we attach equal importance to both the analytic properties and probabilistic properties.
Mathematically, a function space endowed with a probability structure in fact constitutes an SP. In the literature, an SP is described as a collection of random variables $\{X_t : t \in T\}$, all defined on a common probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra on $\Omega$, and $P$ is a probability measure on $\mathcal{F}$. The index set $T$ is often an interval of the real line, which makes it almost irresistible to think of the process as evolving in time. The random variable $X_t$ depends on both $t$ and the point $\omega$ in $\Omega$ at which it is evaluated. As a function of two variables, it is written as $X(t, \omega)$. For a fixed $t$, $X(t, \cdot)$ is a measurable mapping from $\Omega$ into $\mathbb{R}$.
The above definition and interpretation of SP, which underlie most BPT solutions, however, fail to identify and exploit important prior information. In contrast, we particularly emphasize the second perspective of SP, i.e., the perspective of the sample path of SP, which attracted attention and became popular due to Kolmogorov's work. Specifically, for a fixed $\omega$, the function $X(\cdot, \omega)$ is called a sample path of the SP. If all the sample paths lie in some fixed collection $\mathcal{X}$ of functions, the process can be thought of as a map from the sample space $\Omega$ into $\mathcal{X}$, i.e., a random element of $\mathcal{X}$. Based on this perspective, each sample path of $X$ is a single point in $\mathcal{X}$. For example, if a process indexed by $[0, 1]$ has continuous sample paths, it defines a random element of the classical function space $C[0, 1]$ of all continuous real-valued functions on $[0, 1]$.
The motivation for, and advantage of, using an SP to characterize and describe beam direction functions are twofold. First, many important priors, which cannot be exploited by conventional and even learning-based methods, can be identified. Second, efficient BPT algorithms can be derived based on this perspective. Before proceeding, several typical examples, together with the priors that cannot be exploited by existing methods, are provided to illustrate our perspective intuitively.
In practice, thanks to specific deployments (e.g., high-speed train and motorway), specific and/or limited movement speeds (e.g., the low speed of pedestrians versus the typical speeds on a motorway), surroundings of base stations and mobile users (e.g., buildings and trees), electromagnetic environments, radio spectrum (e.g., sub-6 GHz or terahertz), weather (which affects high-frequency communications), etc., there exist a variety of useful priors. Moreover, for a specific scenario (Figure 1.2), some factors are dominant, while the others are marginal, which can further strengthen and highlight a specific prior.
In practice, the base stations are deployed along the rails at regular distances, and the velocities of high-speed trains are within a reasonable range [40]. As a result, the beam directions observed by multiple base stations present an inexact periodicity [39].
A typical example is a mobile user moving randomly within a room. As a consequence, it is difficult to predict the beam for the next time slot accurately because of the randomness. But fortunately, the difference in beam directions at two adjacent time slots is small, as shown in Figure 1.3a.
A typical scenario is considered, where a user moves along a street and maybe changes to another one or occasionally stops for a moment. It can be observed from Figure 1.3b that the beam direction function is flat in some intervals. Because of the blockage, it may also be discontinuous.
As mentioned earlier, to identify these priors, we need to adopt Kolmogorov's perspective of the modern SP, i.e., beam directions within an interval are regarded as samples of a random function, and more importantly, each such function is further regarded as a random realization of an SP, which is referred to as the beam process and takes values in the (function) space of sample paths. We investigate and exploit two types of important properties or priors, i.e., the sample path property and the statistical regularity property.
Figure 1.2 (a) Typical communication scenarios – lines with arrows in part (b) represent motion trajectories.
Figure 1.3 Beam direction functions of city block and indoor environments. (a) Beam directions of indoor users and (b) beam directions of pedestrians.
Sample path property (or prior): A sample path property refers to an analytic property of a random sample path, and it arises for the following reasons. For example, owing to the limited movement speed and the slowly varying environment, the sample path is a continuous function, although it may be fairly random, e.g., it may behave like Brownian motion. This sample-path property is referred to as beam change continuity. Another property or prior is that, since the movement speed of a mobile user is limited and beam sounding is performed at an appropriate frequency, the jump or variation of the sample path is bounded, although the path itself can be very complex, e.g., discontinuous due to the blockage effect. This sample-path property is referred to as the bounded jump or variation property.
Statistical regularity property (or prior): Although a single sample path may be too random for its behavior to be predicted, a large number of sample paths will present some regularity. This property is referred to as statistical regularity (SR). In fact, for a specific scenario, e.g., high-speed train communication, a set of sample paths can be regarded as multiple realizations of a probability distribution, i.e., each sample path is drawn from a common distribution. The property related to this probability distribution is, in fact, the SR property or prior. SR exists because the surrounding environment is relatively stable.
The two types of properties are also observed on different timescales. Specifically, since the SR property can only be obtained from multiple sample paths or long-term observations, it is a long-term property or prior. In contrast, as the sample path property can be identified from a single sample path or short-term observations, it is a short-term property or prior. For the above scenarios, the beam direction functions have the aforementioned properties. For example, the beam direction function of an indoor environment has the property of beam continuity. In some cases, no obvious SR can be seen from a small number of sample paths, but significant SR emerges when many sample paths are available.
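To illustrate how a short-term sample path prior could reduce sounding overhead, the sketch below uses the bounded-jump property to restrict sounding to beam indices near the previously selected beam. The helper names, the `max_jump` bound, and the fallback to a full re-search on a miss are assumptions made for this example rather than the algorithms developed later in the chapter.

```python
def candidate_beams(prev_index, max_jump, num_beams):
    """Beam indices within max_jump of the previous beam, clipped to the codebook."""
    lo = max(0, prev_index - max_jump)
    hi = min(num_beams - 1, prev_index + max_jump)
    return list(range(lo, hi + 1))


def track_beams(true_beams, max_jump, num_beams):
    """Track a sequence of true best-beam indices by sounding only candidates.

    The first entry is assumed to come from an initial full search. A miss
    triggers an exhaustive re-search costing num_beams extra soundings.
    Returns (success_rate, average_soundings_per_slot).
    """
    prev = true_beams[0]
    hits, soundings = 0, 0
    for beam in true_beams[1:]:
        candidates = candidate_beams(prev, max_jump, num_beams)
        soundings += len(candidates)
        if beam in candidates:
            hits += 1
        else:
            soundings += num_beams  # fall back to exhaustive re-alignment
        prev = beam
    slots = len(true_beams) - 1
    return hits / slots, soundings / slots


# Usage: a slowly drifting beam with one abrupt jump (e.g., due to blockage).
rate, avg_soundings = track_beams([10, 11, 11, 12, 20, 21], max_jump=1, num_beams=64)
print(rate, avg_soundings)
```

The two returned quantities correspond to the two sides of the trade-off: a larger `max_jump` raises the success rate but enlarges the predicted beam subspace.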
It is not difficult to understand that although these priors are useful and exist extensively in practice, it is challenging to incorporate them into the existing frameworks or models. The reason for this is twofold. On the one hand, they involve considerable degrees of randomness. On the other hand, the dimension of function space underlying these priors is infinite. The two factors invalidate many deterministic and/or parametric techniques. Compared to the prior identification, we are more concerned with the use of these priors to improve efficiency, e.g., to reduce the sounding overhead.
The perspective of function space not only provides useful insights but also offers efficient methodologies to model and characterize the beam process and predict the beam direction. As mentioned before, a mathematical object incorporating both function (or analytic) structure and probabilistic structure is the SP. Next, we further provide two methodologies to model the beam process and predict beam direction, i.e., SP theory and SDE. We first investigate the beam modeling methodology based on the SP theory, which is both natural and intuitive as per the function space view.
There are various SPs, such as the Poisson process, Markov chain, and Brownian motion. As an example, we take the GP to illustrate the methodology. In practice, the GP is also preferred by researchers and engineers, for two reasons. First, compared with many SPs, the GP has an explicit and analytical conditional density expression, which avoids inference intractability. Second, the GP is very flexible and can be further extended and enhanced via kernel design and deep neural networks (DNNs). A GP describing beam directions is denoted by $f(t) \sim \mathcal{GP}(m(t), k(t, t'))$. Since the mean function $m(t)$ of a GP is often assumed to be the zero function, a GP is characterized completely by the covariance function (also referred to as the kernel), which is defined as
$$k(t, t') = \mathbb{E}\big[(f(t) - m(t))(f(t') - m(t'))\big].$$
Refer to Rasmussen and Williams [41] for more details of GP.
The key to GP modeling is that the nonparametric prior is encoded into the GP kernel. We still take high-speed train communication as an example to illustrate how to realize this goal [39]. Given that (i) the beam direction of the high-speed train has the properties of inexact periodicity and continuity, and (ii) the spectral mixture kernel $k_{\mathrm{SM}}$ and squared exponential kernel $k_{\mathrm{SE}}$ encode periodicity [42] and continuity [41], respectively, the GP kernel is designed as a linear combination of the two kernels (along with a noise term), i.e.,
$$k(t, t') = w_1 k_{\mathrm{SM}}(t, t') + w_2 k_{\mathrm{SE}}(t, t') + \sigma_n^2 \delta_{t t'},$$
where $w_1$ and $w_2$ are the combination weights, $\sigma_n^2$ is the noise variance, and $\delta_{t t'}$ is the Kronecker delta.
Since the representative ability of a simple kernel is limited and the selection or design of the kernel is heuristic, it may fail to encode complex priors into the GP kernel, which may lead to performance loss in complex scenarios. Hence, it is essential to strengthen the representative ability of the GP kernel and optimize the parameters of the kernel automatically. To this end, the (deep) neural network can be incorporated into a simple kernel so as to yield a more powerful kernel. A more flexible learning model is shown in Figure 1.4. To bring the advantages of the structure into full play, it is vital to tailor an efficient training method. Refer to Zhang et al. [39] for more details.
Figure 1.4 Network structure tailored for beam prediction oriented GP learning.
Figure 1.5 A visualization of SP inference that incorporates the nonparametric prior (e.g., smoothness). (a) Prior dataset and (b) inference instance.
We further tackle the inference issue, which includes qualitative inference and quantitative inference. First, to obtain an intuitive understanding, we use a simple inference example to demonstrate qualitatively the essence of incorporating a nonparametric prior into SP inference. As illustrated in Figure 1.5, we are given a prior dataset (i.e., three sample paths) in Figure 1.5a and three data points (labeled by “+”) in Figure 1.5b. The nonparametric SP inference paradigm is based on the fact that the prior sample paths are smooth and their variations are small, whereas the three dotted curves are nonsmooth. Hence, it is reasonable to infer that the solid curves in Figure 1.5b have a higher probability of being the real sample path than the dotted ones. Next, we consider quantitative inference.
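The basic principle of predicting or inferring the beam direction $f(t_*)$ at an unknown time $t_*$ based on a set of observations or samples $\mathcal{D} = \{(t_i, y_i)\}_{i=1}^{n}$ is as follows. In contrast to other ML models or methods, GP regression is based on Bayesian inference, which generates a probability distribution for the quantity of interest rather than a simple point estimate. Hence, it can provide a quantitative characterization of prediction uncertainty. Given the above kernel $k(\cdot, \cdot)$, the conditional distribution of $f(t_*)$ is given by
$$f(t_*) \mid \mathcal{D} \sim \mathcal{N}\big(\mathbf{k}_*^{T}\mathbf{K}^{-1}\mathbf{y},\; k_{**} - \mathbf{k}_*^{T}\mathbf{K}^{-1}\mathbf{k}_*\big),$$
where $\mathbf{y} = [y_1, \ldots, y_n]^{T}$, and the $i$th element of $\mathbf{k}_*$ and the $(i, j)$th element of $\mathbf{K}$ are calculated as $k(t_i, t_*)$ and $k(t_i, t_j)$, respectively. $k_{**}$ is calculated as $k(t_*, t_*)$.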
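The quantitative inference step can be sketched in a few lines of plain Python. The example below assumes a composite kernel made of a one-component spectral mixture term and a squared exponential term, with the noise variance added on the diagonal of the Gram matrix; all hyperparameter values, function names, and the small Gaussian-elimination solver are illustrative assumptions, not the trained model of [39].

```python
import math


def k_se(tau, variance=1.0, length=1.0):
    """Squared exponential kernel (encodes continuity/smoothness)."""
    return variance * math.exp(-0.5 * (tau / length) ** 2)


def k_sm(tau, weight=1.0, mu=0.5, v=0.01):
    """One-component spectral mixture kernel (encodes approximate periodicity)."""
    return weight * math.exp(-2.0 * math.pi ** 2 * tau ** 2 * v) * math.cos(2.0 * math.pi * mu * tau)


def kernel(t1, t2):
    tau = t1 - t2
    return k_sm(tau) + k_se(tau)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (O(n^3))."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def gp_predict(ts, ys, t_star, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at t_star."""
    n = len(ts)
    gram = [[kernel(ts[i], ts[j]) + (noise_var if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    k_star = [kernel(t, t_star) for t in ts]
    alpha = solve(gram, ys)      # K^{-1} y
    beta = solve(gram, k_star)   # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(t_star, t_star) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var


# Usage: predictive uncertainty is small near the data and large far from it.
ts = [0.5 * i for i in range(8)]
ys = [math.sin(0.6 * t) for t in ts]
print(gp_predict(ts, ys, 1.5))
print(gp_predict(ts, ys, 100.0))
```

The linear solves are the source of the cubic complexity mentioned earlier, and the returned variance is what distinguishes GP regression from a plain point estimate.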
Next, we proceed to use SDE to model beam direction. Although the choice of SP, such as GP, to model the beam direction is very natural, it may not be intuitive to do this via SDE. In fact, the SDE theory has shown that the solution of an SDE is an SP. Moreover, SDE has a close relationship with dynamical system theory, which enriches our research means, e.g., it can better describe the variation of beam direction. The relationship between SP and SDE, as well as further extension of SDE, will be discussed in Section 1.6.
In practice, the system dynamics is almost always characterized by a system of SDEs, which can be written as
$$\mathrm{d}\mathbf{x}_t = f(\mathbf{x}_t, t)\,\mathrm{d}t + g(\mathbf{x}_t, t)\,\mathrm{d}\mathbf{W}_t,$$
where $\mathbf{x}_t$ and $\mathbf{W}_t$ represent the system state and Brownian motion, respectively. The drift term $f$ and the diffusion term $g$ (modulated by $\mathbf{W}_t$) characterize the deterministic and stochastic parts of system evolution, respectively. From a control point of view, the drift term controls the system to achieve a good predictive accuracy, while the diffusion term characterizes model uncertainty in a stochastic environment [43].
A prominent advantage of the SDE-based prediction model is that it is good at quantifying uncertainty, which is often an urgent need for many real-life applications, especially BPT. As mentioned earlier, there are two types of uncertainties, i.e., aleatoric uncertainty (natural randomness inherent in a prediction task) and epistemic uncertainty (model uncertainty caused by lack of observation data). In many tasks, it is important to separate the two sources of uncertainties, while many existing methods suffer from the drawback of conflating the two types of uncertainties. Compared to existing models, the SDE-based models are able to separate the two sources of uncertainties in prediction, by explicitly modeling aleatoric uncertainty and epistemic uncertainty.
Similar to other ML models, modeling and learning are of great importance and have to be well addressed. There are mainly two methods to address these issues, including the direct method and the indirect method. The direct method models the drift term and diffusion term via parametric models directly, typically via the DNN. As for the training of an SDE-based model, it can be addressed via expectation maximization, variational Bayes, and other methods. As SDE can also be solved via various numerical or simulation methods, such as the Euler–Maruyama, Taylor series approximation, and Runge–Kutta methods, numerical- or simulation-based training and inference algorithms have been developed [43, 44].
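As a small illustration of the simulation-based route, the following sketch integrates a scalar SDE with a drift term and a diffusion term using the Euler–Maruyama scheme. The drift and diffusion functions below are placeholders chosen for the example (a mean-reverting drift with constant diffusion); in the direct method they would be replaced by learned parametric models such as DNNs.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_end, num_steps, rng):
    """Simulate one sample path of dx = drift(x, t) dt + diffusion(x, t) dW.

    Returns the list of states at the num_steps + 1 grid points on [0, t_end].
    """
    dt = t_end / num_steps
    x, t = x0, 0.0
    path = [x]
    for _ in range(num_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + drift(x, t) * dt + diffusion(x, t) * dw
        t += dt
        path.append(x)
    return path


# Placeholder model (assumption): mean-reverting drift, constant diffusion.
def example_drift(x, t):
    return -x

def example_diffusion(x, t):
    return 0.1

path = euler_maruyama(example_drift, example_diffusion, 1.0, 1.0, 1000, random.Random(0))
print(path[-1])
```

Running the simulation many times with different seeds yields an empirical distribution over future states, which is one simple way to expose the uncertainty carried by the diffusion term.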
The indirect method exploits