Understand the cutting-edge technology of semantic communications and its growing applications
Semantic communications constitute a revolution in wireless technology, combining semantic theory with wireless communication. In a semantic communication, essential information is encoded at the source, drastically reducing the required data usage, and then decoded at the destination in such a way that all key information is recovered, even if transmission is damaged or incomplete. Enhancing the correspondence between background knowledge at source and destination can drive the data usage requirement even lower, producing ultra-efficient information exchanges with ultra-low semantic ambiguity.
Wireless Semantic Communications offers a comprehensive overview of this groundbreaking field, its development, and its future application. Beginning with an introduction to semantic communications and its foundational principles, the book then proceeds to cover transceiver design and methods, before discussing use cases and future developments. The result is an indispensable resource for understanding the future of wireless communication.
Wireless Semantic Communications is ideal for electrical and computing engineers and researchers, as well as industry professionals working in wireless communications.
Page count: 398
Year of publication: 2024
Cover
Table of Contents
Title Page
Copyright
List of Contributions
Preface
1 Intelligent Transceiver Design for Semantic Communication
1.1 Knowledge Base
1.2 Source and Channel Coding
1.3 Multiuser SC
1.4 Transceiver Design for Single‐Modal and Multimodal Data
1.5 Challenges and Future Directions
References
2 Joint Cell Association and Spectrum Allocation in Semantic Communication Networks
2.1 Introduction
2.2 Semantic Communication Model
2.3 Optimal CA and SA Solution in the PKM‐Based SC‐Net
2.4 Optimal CA and SA Solution in the IKM‐Based SC‐Net
2.5 Numerical Results and Discussions
2.6 Conclusions
References
Notes
3 An End‐to‐End Semantic Communication Framework for Image Transmission
3.1 Introduction
3.2 The End‐to‐End Image Semantic Communication Framework Driven by Knowledge Graph
3.3 Semantic Similarity Measurement
3.4 Simulation
3.5 Conclusion
References
Note
4 Robust Semantic Communications and Privacy Protection
4.1 Motivation and Introduction
4.2 Robust Semantic Communication
4.3 Knowledge Discrepancy‐Oriented Privacy Protection for Semantic Communication
4.4 Conclusion
References
5 Interplay of Semantic Communication and Knowledge Learning
5.1 Introduction
5.2 Basic Concepts and Related Works
5.3 A KG‐enhanced SemCom System
5.4 A KG Evolving‐based SemCom System
5.5 LLM‐assisted Data Augmentation for the KG Evolving‐Based SemCom System
5.6 Conclusion
References
6 VISTA: A Semantic Communication Approach for Video Transmission
6.1 Introduction
6.2 Video Transmission Framework in VISTA
6.3 SLG‐Based Transceiver Design in VISTA
6.4 Simulation Results and Discussions
6.5 Conclusions
References
7 Content‐Aware Robust Semantic Transmission of Images over Wireless Channels with GANs
7.1 Introduction
7.2 System Model
7.3 System Architecture
7.4 Experimental Results
7.5 Conclusion
References
8 Semantic Communication in the Metaverse
8.1 Introduction
8.2 Related Work
8.3 Unified Framework for SemCom in the Metaverse
8.4 Zero‐Knowledge Proof‐Based Semantic Verification
8.5 Diffusion Model‐Based Resource Allocation
8.6 Simulation Results
8.7 Future Directions
8.8 Conclusion
References
9 Large Language Model‐Assisted Semantic Communication Systems
9.1 Introduction
9.2 SSSC Using Pretrained LLMs
9.3 SIAC Using Pretrained LLMs
9.4 Future Direction of Using LLMs: Semantic Correction
9.5 Conclusion
Acronyms
References
Notes
10 RIS‐Enhanced Semantic Communication
10.1 RIS‐Empowered Communications
10.2 Beamforming Design for RISs Enhanced Semantic Communications
10.3 Privacy Protection in RIS‐Assisted Semantic Communication System
10.4 AI for RIS‐Assisted Semantic Communications
10.5 Conclusion
Acronyms
References
Index
End User License Agreement
Chapter 3
Table 3.1 Performance evaluation results of the semantic similarity.
Chapter 4
Table 4.1 Details about the datasets.
Chapter 5
Table 5.1 Mainly used notations in this chapter.
Table 5.2 Experimental settings.
Table 5.3 Example of semantic decoding process based on knowledge enhancemen...
Table 5.4 The comparison of the BLEU score between a fixed extractor model a...
Chapter 9
Table 9.1 Important (“1”) and non‐important (“0”) words in “It is an importa...
Table 9.2 Semantic loss of each word in the sentence “It is an important ste...
Table 9.3 Utilizing ChatGPT for semantic correction of missing word in sente...
Table 9.4 Utilizing BERT for semantic correction of missing word in sentence...
Chapter 1
Figure 1.1 Two examples of KG systems based on KB. (a) DBpedia KB and (b) Ox...
Figure 1.2 An SC system without a KB. (a) Training phase of an SC system wit...
Figure 1.3 An SC system with KBs at both transmitter and receiver.
Figure 1.4 A multi‐modal SC system assisted by KBs.
Figure 1.5 The transceiver architecture of a conventional communication syst...
Figure 1.6 The transceiver architecture of an SC system.
Figure 1.7 The transceiver architecture of an SC with independent source and...
Figure 1.8 The transceiver architecture of an SC with intelligent joint sour...
Figure 1.9 The architecture of an OFDMA‐assisted SC system.
Figure 1.10 The architecture of an SDMA‐assisted SC system.
Figure 1.11 The architecture of a NOMA‐assisted SC system.
Figure 1.12 The architecture of an RSMA‐assisted SC system.
Figure 1.13 The architecture of an MDMA‐assisted SC system.
Figure 1.14 The architecture of a DeepMA‐assisted SC system.
Figure 1.15 An example of a multimodal SC system.
Figure 1.16 The challenges and future directions of SC system.
Chapter 2
Figure 2.1 An overview of SC‐Net.
Figure 2.2 The PKM‐based SC‐Net and the IKM‐based SC‐Net for a single SemCom...
Figure 2.3 The SemCom diagram with the information source and destination.
Figure 2.4 The BLEU score (1‐gram) versus bit rates.
Figure 2.5 Demonstration of B2M transformation in the PKM‐based SC‐Net.
Figure 2.6 Comparison of the PKM‐based STM performance under different numbe...
Figure 2.7 Comparison of the PKM‐based STM performance under different numbe...
Figure 2.8 The IKM‐based STM performance against varying number of MUs.
Figure 2.9 The IKM‐based STM performance against varying number of MUs.
Figure 2.10 The IKM‐based STM performance against different numbers of BSs....
Figure 2.11 The IKM‐based STM performance against different numbers of BSs....
Chapter 3
Figure 3.1 The end‐to‐end semantic communication framework with the knowledg...
Figure 3.2 Relational tensor given entity categories and possible pred...
Figure 3.3 Pixel of an grayscale image can be formed to the corresponding pi...
Figure 3.4 The semantic information based on the knowledge graph can be conv...
Figure 3.5 The new 3D tensor transmitted through the physical channel can be...
Figure 3.6 The Wasserstein distance and Gromov–Wasserstein distance in graph...
Figure 3.7 The impact of wireless channel fading on perceived visual perform...
Figure 3.8 Examples of reconstructed images produced by our proposed scheme,...
Chapter 4
Figure 4.1 A semantic communication systems for image transmission with sema...
Figure 4.2 A robust semantic communication system.
Figure 4.3 Classification accuracy versus signal‐to‐noise ratio in AWGN chan...
Figure 4.4 Classification accuracy versus signal‐to‐noise ratio in Rician ch...
Figure 4.5 Classification accuracy versus signal‐to‐noise ratio in Rayleigh ...
Figure 4.6 A semantic communication system for speech transmission with sema...
Figure 4.7 Semantic communication system with KDPP.
Figure 4.8 An example of proposed knowledge inference method with knowledge ...
Figure 4.9 Schematic diagram of the path cutting‐off module. It receives the...
Figure 4.10 The effect of factor on accuracy (, ).
Figure 4.11 The effect of factor on accuracy (). (a) Synthetic network an...
Figure 4.12 Percentage of different precision in semantic label generation r...
Figure 4.13 The information loss (a) and privacy risk (b) of four protection...
Chapter 5
Figure 5.1 The framework of the SemCom system.
Figure 5.2 The KG‐enhanced semantic decoder.
Figure 5.3 The BLEU and sentence‐BERT score versus SNR for the KG‐enhanced S...
Figure 5.4 The BLEU and sentence‐BERT score versus SNR for the KG‐enhanced S...
Figure 5.5 The precision and recall rate versus SNR for the KG‐enhanced SemC...
Figure 5.6 The comparison of the BLEU score for knowledge extractor with dif...
Figure 5.7 The structure of the KG evolving‐based SemCom receiver.
Figure 5.8 The schematic diagram of the training process of unified semantic...
Figure 5.9 The comparison of precision and recall rate between the newly dev...
Figure 5.10 Examples of spatial visualization of unified semantic representa...
Figure 5.11 An example of LLM‐assisted data augmentation solution for knowle...
Figure 5.12 The performance evaluation of the proposed LLM‐enhanced system....
Chapter 6
Figure 6.1 The diagram of transceiver in VISTA.
Figure 6.2 The examples of SLGs in original video.
Figure 6.3 The frame samples recovered by VISTA under varying SNRs from to...
Figure 6.4 Visual comparison on frame samples for original video, recovered ...
Figure 6.5 PSNR performance of recovered video frames versus varying SNRs fr...
Figure 6.6 Total processing time (a) and transmission bits (b) for consecu...
Chapter 7
Figure 7.1 The proposed content‐aware robust semantic transmission systems,
Figure 7.2 The network architecture of our proposed system. It is denoted th...
Figure 7.3 The visual comparison between images with different quantization ...
Figure 7.4 Performance under different quantization bits of RONI over Raylei...
Figure 7.5 Robustness performance under Rayleigh channels. [0, 10] dB repres...
Chapter 8
Figure 8.1 A unified framework bridging meaning in the metaverse.
Figure 8.2 SemCom in metaverse.
Figure 8.3 Dishonest semantic transformation.
Figure 8.4 Zero‐knowledge proof‐based verification.
Figure 8.5 Strategic resource allocation mechanism.
Figure 8.6 Performance loss of attacks.
Figure 8.7 Semantic similarity without the verification mechanism. (a) and...
Figure 8.8 Semantic similarity comparison.
Figure 8.9 Semantic similarity performance with the verification mechanism. ...
Figure 8.10 ZKP computation time.
Figure 8.11 Training curves of the joint resource allocation.
Figure 8.12 Generated utility of diffusion compared with PPO.
Chapter 9
Figure 9.1 Demonstration of signal‐shaping semantic communication systems, w...
Figure 9.2 An example to show the semantic similarity computed by the pretra...
Figure 9.3 Convergence property of Algorithm 9.1 with different randomly g...
Figure 9.4 Signal constellation designs for semantic communication systems w...
Figure 9.5 Semantic loss of different signal‐shaping methods.
Figure 9.6 Demonstration of SIAC systems, which target to reliably transmit ...
Figure 9.7 SIAC cross‐layer structure with pretrained LLMs (e.g., ChatGPT an...
Figure 9.8 Expected important words errors versus the total power, where the...
Figure 9.9 Expected semantic loss versus the total power, where the semantic...
Figure 9.10 Word generation speed of ChatGPT with and without highlighting t...
Figure 9.11 Frame importance quantification time of self‐hosted BERT and Cha...
Chapter 10
Figure 10.1 Average EE using either RIS or AF relay versus for bps/Hz an...
Figure 10.2 A wireless communication system with distributed RISs, one BS, a...
Figure 10.3 The compute‐then‐transmit protocol for semantic communication sy...
Figure 10.4 Sum rate versus sum transmit power under user distribution 1.
Figure 10.5 Sum rate versus sum transmit power under user distribution 2.
Figure 10.6 Task success rate versus compressed rate.
Figure 10.7 Proposed encryption and obfuscation methods.
Figure 10.8 Comparison between traditional semantic communications and RIS‐b...
Figure 10.9 PSNR of the proposed scheme on CIFAR‐10 test images with respect...
Figure 10.10 The framework of the proposed inverse semantic‐aware wireless s...
Figure 10.11 Proposed encryption and obfuscation methods.
Edited by
Yao Sun, University of Glasgow, Glasgow, UK
Lan Zhang, Michigan Technological University, Houghton, USA
Dusit Niyato, Nanyang Technological University, Singapore
Muhammad Ali Imran, University of Glasgow, Glasgow, UK
This edition first published 2025
© 2025 John Wiley & Sons Ltd.
All rights reserved, including rights for text and data mining and training of artificial intelligence technologies or similar technologies. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Yao Sun, Lan Zhang, Dusit Niyato and Muhammad Ali Imran to be identified as the authors of the editorial material in this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging‐in‐Publication Data applied for:
Hardback: 9781394223305
Cover Design: Wiley
Cover Image: © Yuichiro Chino/Getty Images
Xuyang Chen
College of Electronics and Information Engineering
Shenzhen University
Shenzhen
China
Runze Cheng
James Watt School of Engineering
University of Glasgow
Glasgow
Lanarkshire
UK
Xiangyi Deng
Glasgow College
University of Electronic Science and Technology of China
Chengdu
Sichuan
China
Hongyang Du
School of Computer Science and Engineering
Nanyang Technological University
Singapore
Daquan Feng
College of Electronics and Information Engineering
Shenzhen University
Shenzhen
China
Lei Feng
The State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications
Beijing
China
Biqian Feng
Department of Electronic Engineering
Shanghai Jiao Tong University
China
Chenyuan Feng
Eurecom
France
Zhipeng Gao
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications
Beijing
China
Shuaishuai Guo
School of Control Science and Engineering
Shandong University
Jinan
China
and
Shandong Key Laboratory of Wireless Communication Technologies
Shandong University
Jinan
China
Qi He
National Key Laboratory of Science and Technology on Communications
University of Electronic Science and Technology of China
Chengdu
China
Chongwen Huang
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
Xihu Region
China
Muhammad Ali Imran
James Watt School of Engineering
University of Glasgow
Glasgow
Lanarkshire
UK
Rongpeng Li
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
China
Wenjing Li
The State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications
Beijing
China
Chengsi Liang
James Watt School of Engineering
University of Glasgow
Glasgow
Lanarkshire
UK
Yijing Lin
State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications
Beijing
China
Yijie Mao
School of Information Science and Technology
ShanghaiTech University
Shanghai
China
Fei Ni
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
China
Dusit Niyato
School of Computer Science and Engineering
Nanyang Technological University
Singapore
Yao Sun
James Watt School of Engineering
University of Glasgow
Glasgow
Lanarkshire
UK
Bingyan Wang
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
China
Bohao Wang
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
Xihu Region
China
Jiacheng Wang
School of Computer Science and Engineering
Nanyang Technological University
Singapore
Yanhu Wang
School of Control Science and Engineering
Shandong University
Jinan
China
and
Shandong Key Laboratory of Wireless Communication Technologies
Shandong University
Jinan
China
Yiwen Wang
School of Information Science and Technology
ShanghaiTech University
Shanghai
China
Le Xia
James Watt School of Engineering
University of Glasgow
Glasgow
Lanarkshire
UK
Xiang‐Gen Xia
Department of Electrical and Computer Engineering
University of Delaware
Newark, DE
USA
Ruopeng Xu
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
Xihu Region
China
Zhaohui Yang
College of Information Science & Electronic Engineering
Zhejiang University
Hangzhou
Xihu Region
China
Honggang Zhang
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
China
and
Zhejiang Lab
Hangzhou
China
Xuefei Zhang
School of Information and Communication Engineering
Beijing University of Posts and Telecommunications
Beijing
China
Zhifeng Zhao
College of Information Science and Electronic Engineering
Zhejiang University
Hangzhou
China
and
Zhejiang Lab
Hangzhou
China
Jiakang Zheng
School of Electronic and Information Engineering
Beijing Jiaotong University
Beijing
China
Yu Zhou
The State Key Laboratory of Networking and Switching Technology
Beijing University of Posts and Telecommunications
Beijing
China
Traffic demands in current wireless networks have increased tremendously to accommodate the upcoming pervasive network intelligence and its variety of advanced smart applications. In response to ever‐increasing data rates and stringent requirements for low latency and high reliability, it is foreseeable that available communication resources such as spectrum and energy will gradually become scarce in the coming years. Combined with the almost insurmountable Shannon limit, these bottlenecks motivate bold changes in the design of future networks, i.e., a paradigm shift from traditional bit‐based communication to context‐based semantic communication.
The concept of semantic communication was first introduced by Weaver in his landmark paper, which explicitly categorizes communication problems into three levels, including the technical problem at the bit level, the semantic problem at the semantic level, and the effectiveness problem at the information exchange level. Nowadays, the technical problem has been thoroughly investigated in the light of classical Shannon information theory, while the evolution toward semantic communication is just beginning to take shape, with the core focus on meaning delivery rather than traditional bit transmission.
Concretely, semantic communication first refines semantic features and filters out irrelevant content by encoding the semantic information at the source (i.e., semantic encoding), which can greatly reduce the number of required bits while preserving the original meaning. Then, powerful semantic decoders deployed at the destination accurately recover the source meaning from the received bits (i.e., semantic decoding), even if there are intolerable bit errors at the syntactic level. Most importantly, by further leveraging background knowledge about the observable messages that is matched between source and destination, users can exchange the desired information efficiently and with ultralow semantic ambiguity while transmitting fewer bits.
While semantic communication offers these attractive and valuable benefits, it also faces many challenges. For example, designing the semantic encoder and decoder is challenging once the computing limitations of terminal devices, personalized background knowledge, and unstable wireless channel conditions are taken into account. Moreover, at the networking layer, it is nontrivial to find the wireless resource management strategy that optimizes overall network performance in a semantics‐aware manner. Therefore, before the advantages of semantic communications can be fully enjoyed, this book explores the following fundamental issues: (i) How much benefit can be achieved by using semantic communication? (ii) How much cost (mainly consumed resources) is incurred to guarantee the required performance, such as semantic ambiguity? and (iii) How can we maximize the benefits of semantic communications applied to different wireless networks under cost constraints?
This book will explore recent advances in the theory and practice of semantic communication. In detail, the book covers the following aspects:
(1) Principles and fundamentals of semantic communication.
(2) Transceiver design of semantic communications.
(3) Resource management in semantic communication networks.
(4) Semantic communication applications to vertical industries and some typical communication scenarios.
Chapter 1 delves into the transceiver design for semantic communications. Specifically, we first summarize established designs for key components in semantic communications, with a key focus on the knowledge base and semantic encoders/decoders crucial for single‐user semantic communications. Our discussion then extends to multiuser SC, specifically emphasizing the synergy between various multiple‐access schemes and semantic communications, including orthogonal multiple access (OMA), space‐division multiple access (SDMA), non‐orthogonal multiple access (NOMA), rate‐splitting multiple access (RSMA), and model‐division multiple access (MDMA). In the end, we explore various applications of semantic communications and analyze the potential alterations in transceiver design required for these applications.
Chapter 2 studies semantic communications from a networking perspective, particularly focusing on the upper layer. Our primary objective is to investigate optimal wireless resource management strategies within the semantic communications‐enabled network (SC‐Net) to enhance overall network performance in a semantics‐aware manner. This entails addressing the unique challenge of ensuring background knowledge alignment between multiple mobile users (MUs) and multitier base stations (BSs). Efficient resource management remains paramount within the SC‐Net, offering numerous benefits such as guaranteeing high‐quality SemCom services and enhancing spectrum utilization. By devising effective resource allocation strategies, we aim to optimize network performance and facilitate seamless communication within the SC‐Net ecosystem.
Chapter 3 innovatively proposes the transformation theory of the semantic domain–spatial domain, projecting the knowledge graph onto a three‐dimensional tensor in the spatial domain. By mapping the entity's semantic ambiguity with the intensity of discrete points, the knowledge graph is reconstructed from background knowledge libraries and three‐dimensional tensors at the receiving end. Additionally, this chapter proposes a graph‐to‐graph semantic similarity (GGSS) metric based on graph optimal transport theory to evaluate the similarity of semantic information before and after transmission, as well as a semantic‐level image‐to‐image semantic similarity (IISS) metric that aligns with human perception. Finally, we demonstrate the effectiveness and rationality of the framework through simulations.
Chapter 4 first introduces the problem that neural networks in semantic communication are very vulnerable to adversarial attacks, then proposes robust semantic communication systems for image and speech transmission. Meanwhile, this chapter discusses the privacy issue caused by the difference of knowledge bases between transmitter and receiver in semantic communication and proposes a knowledge discrepancy‐oriented privacy protection (KDPP) method for semantic communication to reduce the risk of privacy leakage while retaining high data utility.
Chapter 5 investigates the means of knowledge learning in semantic communication with a particular focus on the utilization of knowledge graphs (KGs). Specifically, we first review existing efforts that combine semantic communication with knowledge learning. Subsequently, we introduce a KG‐enhanced semantic communication system, wherein the receiver is carefully calibrated to leverage knowledge from its static knowledge base to improve the decoding performance. Building on this framework, we further explore potential approaches that enable the system to operate more effectively with an evolving knowledge base. Furthermore, we investigate the possibility of integration with large language models (LLMs) for data augmentation, offering an additional perspective on how semantic communication might be implemented. Extensive numerical results demonstrate that the proposed framework yields superior performance on top of the KG‐enhanced decoding and manifests its versatility under different scenarios.
Chapter 6 presents a novel framework called VISTA (VIdeo transmission over Semantic communicaTion Approach) for video transmission by exploiting semantic communications. VISTA comprises three key modules: the semantic segmentation module and the frame interpolation module, responsible for semantic encoding and decoding, respectively, and the joint source‐channel coding (JSCC) module, designed for SNR‐adaptive wireless transmission.
Chapter 7 proposes a content‐aware robust semantic communication framework for image transmission based on generative adversarial networks (GANs). Specifically, the accurate semantics of the image are extracted by the semantic encoder and divided into two parts for different downstream tasks: regions of interest (ROI) and regions of non‐interest (RONI). By reducing the quantization accuracy of RONI, the transmitted data volume is reduced significantly. During the transmission of semantics, a signal‐to‐noise ratio (SNR) is randomly initialized, enabling the model to learn the average noise distribution. The experimental results demonstrate that by reducing the quantization level of RONI, the transmitted data volume is reduced by up to 60.53% compared to using globally consistent quantization while maintaining comparable performance to existing methods in downstream semantic segmentation tasks. Moreover, our model exhibits increased robustness with variable SNRs.
Chapter 8 first proposes an integrated framework for bridging meanings of semantic information in the Metaverse to achieve efficient interaction between physical and virtual worlds. This chapter then presents a Zero Knowledge Proof‐based verification mechanism to secure the authenticity of the extracted information. This chapter also introduces a diffusion model‐based resource allocation mechanism to maximize the utility of resources. Simulation results are presented to validate the authenticity and efficiency of the proposed mechanisms. Additionally, this chapter discusses future directions to further advance SemCom in the metaverse.
Chapter 9 discusses how large language models (LLMs) can be leveraged to assist semantic communication systems. Specifically, this chapter first discusses leveraging LLMs to define the semantic loss of communications, based on which a signal‐shaping method is proposed to minimize the semantic loss for semantic communications with a few message candidates. Then, this chapter proposes a more generalized method to quantify the semantic importance of a word/frame using LLMs and investigates semantic importance‐aware communications (SIAC) to reliably convey the semantics with limited communication and network resources. Finally, this chapter points out the future direction of using LLMs for semantic correction. Experiments are conducted to verify the effectiveness of leveraging LLMs to assist semantic communications.
Chapter 10 explores cutting‐edge advancements in reconfigurable intelligent surfaces (RISs) for semantic communication. It delves into three pivotal areas: optimizing beamforming in RIS‐aided systems for enhanced communication in complex digital environments like the Metaverse; employing physical layer strategies for robust privacy protection in semantic communication systems; and leveraging deep learning to interpret and prioritize data and to encode and decode semantic transmissions in wireless communications. These topics collectively highlight the potential of RIS in transforming communication paradigms, emphasizing efficiency, security, and intelligent data processing in semantic communication.
Yao Sun
Lan Zhang
Dusit Niyato
Muhammad Ali Imran
April 2024
Yiwen Wang¹, Yijie Mao¹, and Zhaohui Yang²
¹ School of Information Science and Technology, ShanghaiTech University, Shanghai, China
² College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, Xihu Region, China
In this section, we first introduce the basic structure of a knowledge base (KB) and then describe the development of KB‐assisted semantic communication (SC) systems. After that, we introduce multimodal SC systems that are assisted by KB.
The KB plays a vital role in supporting the process of SC. It involves real‐world information such as facts, relationships, and possible reasoning methods that can be understood, recognized, and learned by all participants in the communication [Shi et al., 2021].
One typical KB is the knowledge graph (KG), which contains entities as nodes and relationships as edges [Strinati and Barbarossa, 2020]. The knowledge hidden in the KG is embedded into a continuous vector space [Wang et al., 2017]. One popular KG design describes the relationship between two entities using triples, typically structured as (head entity, relation, tail entity) or (entity, relation, attribute). Figure 1.1 shows two well‐known KG‐based KBs, namely DBpedia and Oxford Art. Both Figure 1.1a and b show detailed information about the same artist, Eugenio Bonivento. Entities connected by edges are tagged with their relationship. In the DBpedia‐based KB shown in Figure 1.1a, information about Eugenio Bonivento is stored using text triples. For example, the triple (Eugenio Bonivento, Date of birth, 1880‐06‐08) in Figure 1.1a means that Eugenio Bonivento's date of birth is 1880‐06‐08. Moreover, an entity can be connected with more than one entity or attribute. In the Oxford Art KG shown in Figure 1.1b, the artist is represented by the id “OOUNzqQ2.” Unlike DBpedia, which uses text triples to store information about Eugenio Bonivento, the Oxford Art KG incorporates entities with multimodal data, including images and texts. Such a multimodal KB facilitates multimodal SC and provides ways to understand information from different modalities.
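The triple structure described above can be illustrated with a minimal Python sketch. The date‐of‐birth triple is taken from the text; the second triple, the relation name "Type", and the helper names `build_graph` and `query` are assumptions introduced only for illustration and do not reflect the actual DBpedia schema.

```python
from collections import defaultdict

# Triples in (head entity, relation, tail entity or attribute) form.
# The first triple appears in the text; the second is an illustrative
# assumption based on the text calling Eugenio Bonivento an artist.
triples = [
    ("Eugenio Bonivento", "Date of birth", "1880-06-08"),
    ("Eugenio Bonivento", "Type", "Artist"),
]


def build_graph(triples):
    """Index triples by head entity so each node lists its outgoing edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def query(graph, head, relation):
    """Return every tail linked to `head` through `relation`."""
    return [tail for rel, tail in graph.get(head, []) if rel == relation]


graph = build_graph(triples)
birth = query(graph, "Eugenio Bonivento", "Date of birth")
print(birth)
```

Indexing by head entity reflects the remark that one entity can be connected with more than one entity or attribute: each node simply carries a list of labeled edges.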
A KB can assist SC systems in many ways. However, it is worth mentioning that some simple transmission frameworks in SC do not involve a KB. Instead, they may take advantage of the background knowledge during the training process, but they do not construct a KB to assist encoding and decoding when the trained SC system is applied [Xie et al., 2021b; Hu et al., 2022; Jiang et al., 2022a; Zhang et al., 2023a]. The structures of both the training and inference phases for an SC system without a KB are shown in Figure 1.2. In this example, neural networks are deployed in the semantic encoding and semantic decoding blocks, and the background knowledge is used in the training phase only to get a fine‐tuned neural network. Once the training phase is complete, the background knowledge is no longer used, and only the pretrained neural networks work in the inference phase.
Figure 1.1 Two examples of KG systems based on KB. (a) DBpedia KB and (b) Oxford Art KB.
Source: Adapted from [Collarana et al., 2017].
As mentioned previously, a widely recognized definition of a KB is an underlying set of facts, assumptions, and rules that helps a computer system solve a problem. In general, both the transmitter and receiver have their own KBs, called the “source KB” and the “destination KB.” These two KBs typically share a part of common knowledge while each keeps a part of private knowledge. With the help of the common knowledge in the KBs, the reasoning ability of the semantic encoder and decoder is enhanced. However, the difference between the two KBs inevitably causes semantic noise, since the same message can be understood differently with different background knowledge. This semantic noise can be eliminated by sharing the knowledge between the source and destination KBs [Wang et al., 2017].
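The split between common and private knowledge can be pictured with plain Python sets. This is only a toy sketch: the facts, the function names `split_knowledge` and `share_knowledge`, and the idea of modeling a KB as a set of triples are all assumptions made for illustration.

```python
def split_knowledge(source_kb, destination_kb):
    """Return (common, source-only, destination-only) knowledge."""
    common = source_kb & destination_kb
    return common, source_kb - common, destination_kb - common


def share_knowledge(source_kb, destination_kb):
    """Model knowledge sharing as both sides adopting the union of the KBs."""
    merged = source_kb | destination_kb
    return set(merged), set(merged)


# Hypothetical facts: the source knows two senses of "apple",
# the destination knows only one, which is a possible cause of semantic noise.
source_kb = {("apple", "is a", "fruit"), ("apple", "is a", "company")}
destination_kb = {("apple", "is a", "fruit"), ("pear", "is a", "fruit")}

common, source_private, destination_private = split_knowledge(
    source_kb, destination_kb
)

shared_source, shared_destination = share_knowledge(source_kb, destination_kb)
_, private_after_source, private_after_destination = split_knowledge(
    shared_source, shared_destination
)
print(common, source_private, destination_private)
```

After sharing, neither side holds private knowledge in this toy model, which mirrors the statement that sharing knowledge removes the mismatch between the two KBs.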
As shown in Figure 1.3, a structured KB assists an SC system during both the training and inference phases. At the transmitter, the information is transformed into triples, and the triples are subsequently transmitted through conventional communication approaches and reconstructed using the KB at the decoder.
Typically, different modalities of information require different types of KB. As mentioned earlier, various types of modalities exist, such as text, image, speech, or video [Luo et al., 2022a]. With the development of SC, it becomes imperative to design KBs capable of bridging different modalities.
Li et al. [2022] designed and introduced a novel KB that is adaptable across various modalities. To extract semantic knowledge and build a cross‐modal KG (CKG), different knowledge extraction models are designed for different modalities after clustering multimodal samples. Then, knowledge entities, relations, and attributes are identified through similarity measurement, using the pairwise similarity scores obtained in the previous step, to construct the CKG. Through semantic extraction, the original message is mapped into multiple triples, which are then transmitted using traditional communication methods and restored at the receiver's side [Zhou et al., 2022]. Note that the restored message is not exactly the same as the original message, but the two are semantically equivalent. Moreover, this work proposed a CKG fusion method to aggregate the multi‐source, heterogeneous, and diverse knowledge obtained from new knowledge extraction and existing CKGs. Based on those CKGs, the semantic encoder can acquire essential signal patches, thereby improving the quality of the recovered signals.
Figure 1.2 An SC system without a KB. (a) Training phase of an SC system without a KB and (b) inference phase of an SC system without a KB.
Figure 1.3 An SC system with KBs at both transmitter and receiver.
Figure 1.4 A multimodal SC system assisted by KBs.
Jiang et al. [2023a] proposed a personalized large language model (LLM)–based KB (LKB) to extract semantics from the data and reconstruct the data. It should be noted that this LKB is dedicated to processing text data. The proposed large artificial intelligence (AI) model‐based multimodal SC (LAM‐MSC) framework is capable of handling multimodal data such as image, audio, video, and text. The key point to achieve multimodal data processing is to transform different modal data into text data before semantic extraction based on LKB and data transmission. After data transmission, the data are first recovered based on LKB, and then further recovered based on multimodal alignment (MMA).
Both Li et al. [2022] and Jiang et al. [2023a] proposed KB‐assisted multimodal SC systems following the process shown in Figure 1.4, which handle the multimodal data by extracting the hidden semantics. The primary difference between them lies in the semantic extraction part of the semantic encoder, where Li et al. [2022] employed the CKG to get the implicit semantics, and Jiang et al. [2023a] first transformed other non‐text modal data into text data before extracting semantics using the LKB.
In this section, we first introduce the components of the source and channel encoders and decoders in SC. Then, two different transceiver architectures with different levels of integration between source and channel coding are delineated.
In the traditional communication process, there are two essential components at the transceiver: source coding and channel coding. Source coding, typically implemented with a source encoder at the transmitter, takes the source message and compresses it so as to minimize the number of bits needed to represent the information while maintaining the effectiveness of communication. Channel coding, implemented with a channel encoder, adds redundancy to the data before transmission so that the system can correct errors and resist interference. This greatly reduces the likelihood of uncorrected errors during transmission. Corresponding to the source and channel encoders at the transmitter, there are source and channel decoders at the receiver that perform the reverse decoding processes, as illustrated in Figure 1.5.
Figure 1.5 The transceiver architecture of a conventional communication system.
Source: Adapted from Lan et al. [2021].
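The conventional chain of source coding followed by channel coding can be sketched in a few lines of Python. The choices below are assumptions made for illustration only: `zlib` stands in for the source encoder, a rate‐1/3 repetition code with majority voting stands in for the channel code, and the channel is modeled by flipping one bit in every tenth codeword.

```python
import zlib


def source_encode(message):
    """Source coding: compress the message to remove redundancy."""
    return zlib.compress(message.encode("utf-8"))


def source_decode(data):
    return zlib.decompress(data).decode("utf-8")


def bytes_to_bits(data):
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def bits_to_bytes(bits):
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )


def channel_encode(bits, repeat=3):
    """Channel coding: add redundancy by repeating every bit."""
    return [bit for bit in bits for _ in range(repeat)]


def channel_decode(coded, repeat=3):
    """Majority vote over each group of repeated bits."""
    return [
        int(sum(coded[k:k + repeat]) > repeat // 2)
        for k in range(0, len(coded), repeat)
    ]


message = "semantic communication " * 4
compressed = source_encode(message)
tx_bits = channel_encode(bytes_to_bits(compressed))

# Toy channel: flip one bit in every tenth 3-bit codeword.
rx_bits = list(tx_bits)
for position in range(0, len(rx_bits), 30):
    rx_bits[position] ^= 1

recovered = source_decode(bits_to_bytes(channel_decode(rx_bits)))
print(recovered == message)
```

Because each corrupted codeword contains a single flipped bit, the majority vote corrects every error and the source decoder receives the compressed bytes intact. Practical systems use far stronger codes; the repetition code is used here only because it fits in a few lines.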
In the SC system, as shown in Figure 1.6, the source encoder of the traditional communication system is typically replaced by a semantic encoder, and the source decoder by a semantic decoder. A semantic encoder is able to infer the meaning of the source message and make sure the meaning can be successfully delivered. This contrasts with a conventional source encoder, which typically ignores the meaning of the message and encodes every bit of the source message in order to make sure all message bits can be recovered at the receiver [Shi et al., 2021]. With this alteration, semantic encoding and decoding in the SC system not only extract the semantic information but also enhance the effectiveness of communication by reducing the number of transmitted bits.
Figure 1.6 The transceiver architecture of an SC system.
Source: Adapted from Lan et al. [2021].
Empowered by emerging deep‐learning (DL) technologies, the semantic encoder, semantic decoder, channel encoder, and channel decoder in the SC system can be implemented by deep neural networks (DNNs). Compared to traditional communication systems, SC enables manipulation of messages at the semantic level. The design of semantic and channel encoders/decoders exploits the semantics of the messages, leading to a deeper and more accurate understanding of the intended message.
Based on the design of semantic and channel encoders as well as semantic and channel decoders, the SC system can be divided into two categories, as illustrated in Figures 1.7 and 1.8. The first type of SC considers the semantic encoder/decoder and channel encoder/decoder as two separate modules, which are independently designed. Typically, the design of semantic encoders and decoders relies on DNNs, and the channel encoders and decoders are implemented in a traditional way. Extracted knowledge from the semantic extraction is processed and transmitted in a more traditional manner. The second type of SC enables joint source–channel coding by designing semantic and channel encoders and decoders using DNNs. These modules are jointly trained in an end‐to‐end fashion so as to achieve a unified design purpose.
Figure 1.7 The transceiver architecture of an SC with independent source and channel coding.
Figure 1.8 The transceiver architecture of an SC with intelligent joint source and channel coding.
Source: Adapted from Lan et al. [2021].
In the first type of SC, as shown in Figure 1.7, the semantic information is first extracted from the original messages and stored as the directional possibility graph (DPG) or another KG. It should be noted that the KG mentioned here keeps the same structure as the KG mentioned in Section 1.1 but has different content. The KG for the KB in Section 1.1 stores all background information that may help to improve the quality of SC, while the KG mentioned in this section stores the semantic information extracted from the messages that need to be transmitted. The transformation from the message to the KG or its counterpart is typically based on a neural network, which can be regarded as the semantic encoder and decoder. After that, the semantic information stored in the DPG or other KG is encoded into a bitstream and then transmitted. Many SC systems of this type have been proposed, e.g., the probability graph‐based semantic information compression system proposed by Zhao et al. [2023], the KG‐based SC system proposed by Jiang et al. [2022b], and the KG‐based SC system assisted by rate‐splitting multiple access (RSMA) proposed by Yang et al. [2023b].
In the second type of SC, as shown in Figure 1.8, the transmitter and receiver are typically implemented as two separate DNNs. The DNN at the transmitter carries out the roles of both semantic and channel encoders, while the DNN at the receiver carries out the roles of both semantic and channel decoders. Some existing works also consider separate DNNs for the channel encoder/decoder and semantic encoder/decoder, which are connected to each other. Typically, all DNNs in the same SC system are trained jointly with the separated but connected KBs. Again, many SC systems of this type have been proposed, e.g., the DeepSC system for text transmission proposed by Xie et al. [2021b], the MR DeepSC system for one‐to‐many communication proposed by Hu et al. [2022], and the DeepWiVe system, an end‐to‐end video transmission scheme proposed by Tung and Gündüz [2022]. Different neural network designs are adopted in these works; however, the basic architectures of these SC systems remain the same.
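To make the second type of SC more concrete, the following NumPy sketch runs one forward pass through a toy transmitter network, an additive white Gaussian noise (AWGN) channel, and a toy receiver network, and evaluates the mean squared error that joint end‐to‐end training would minimize. It is not the architecture of DeepSC, MR DeepSC, or DeepWiVe: the single‐layer networks, the dimensions, the AWGN channel model, and the loss are all assumptions chosen for brevity, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)


def power_normalize(symbols):
    """Scale each transmitted vector to unit average power."""
    power = np.mean(symbols ** 2, axis=1, keepdims=True)
    return symbols / np.sqrt(power)


def awgn(symbols, snr_db, rng):
    """Add white Gaussian noise for a unit-power signal at the given SNR."""
    noise_std = np.sqrt(10 ** (-snr_db / 10))
    return symbols + noise_std * rng.standard_normal(symbols.shape)


# Hypothetical sizes: 16 source features mapped onto 8 channel uses.
feature_dim, channel_uses = 16, 8
tx_weight = 0.1 * rng.standard_normal((feature_dim, channel_uses))
tx_bias = np.zeros(channel_uses)
rx_weight = 0.1 * rng.standard_normal((channel_uses, feature_dim))
rx_bias = np.zeros(feature_dim)


def forward(source, snr_db):
    """Transmitter network -> channel -> receiver network -> loss."""
    # One network plays the role of semantic and channel encoder together.
    symbols = power_normalize(np.tanh(source @ tx_weight + tx_bias))
    received = awgn(symbols, snr_db, rng)
    # One network plays the role of channel and semantic decoder together.
    estimate = received @ rx_weight + rx_bias
    loss = np.mean((estimate - source) ** 2)
    return symbols, estimate, loss


source = rng.standard_normal((4, feature_dim))
symbols, estimate, loss = forward(source, snr_db=10.0)
print(symbols.shape, estimate.shape)
```

In a trained system, the gradient of the loss would flow back through the channel model into both networks, which is what makes the source and channel coding joint rather than separate.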
Single‐user communication refers to a point‐to‐point transmission mode, where a single transmitter establishes direct communication with a single receiver. Conversely, multiuser communication involves a scenario where multiple users or devices engage in communication within a shared system or network. The majority of studies on SC focus on resolving issues within a single‐user SC system, where there is no interference from other users [Bourtsoulatze et al., 2018; Farsad et al., 2018; Xie et al., 2021b; Jiang et al., 2022a; Tung and Gündüz, 2022; Weng and Qin, 2021]. In these works, the transceiver design is centered on communication between a single transmitter and receiver. However, communication systems nowadays usually involve multiple users, demanding an efficient way to distinguish messages from different users. In this section, we focus on multiuser SC, introducing the state‐of‐the‐art approaches to address the interference issue.
In contrast to the traditional communication systems, SC introduces the concepts of semantics and takes advantage of them to better manage inter‐user interference.
In general, there are several approaches to managing inter‐user interference in SC. The first approach is employing personalized KBs for individual users. This ensures the absence of shared knowledge among users, allowing each individual to understand only their own messages and thereby facilitating interference‐free communication. However, in real multiuser SC systems, the overlap of the KBs among users is inevitable, leading to interference from other users [Zhou et al., 2023].
To manage inter‐user interference, some recent works introduced novel models at the transceiver in order to better utilize the semantic information for different users. Hu et al. [2022] introduced a semantic recognizer built upon the pretrained DistilBERT model. It is designed to distinguish messages from different users based on the emotions of those sentences. Concurrently, Luo et al. [2022b] proposed a method for channel‐level information fusion, where signals from different users are merged in wireless channels. The signal arriving at the receiver is therefore fused single‐modal information, which eliminates multiuser interference.
Besides the aforementioned two approaches, various multiple access (MA) techniques have been introduced in multiuser SC systems to manage inter‐user interference. In Section 1.3.2, we will introduce different MA‐assisted SCs.
Various MA schemes have been proposed to manage inter‐user interference and enhance SC performance, such as orthogonal frequency‐division multiple access (OFDMA), space‐division multiple access (SDMA), non‐orthogonal multiple access (NOMA), RSMA, model division multiple access (MDMA), and deep multiple access (DeepMA). The transceiver design for these MA‐assisted SCs is delineated in this subsection.
OFDMA divides the spectrum into multiple subchannels/subcarriers, allowing multiple users to communicate simultaneously on different subcarriers, thus improving spectral efficiency. Shao and Gunduz [2022] designed a passband transceiver for the OFDMA SC system, in which a differentiable clipping operation is incorporated into the training process.
The OFDMA‐assisted SC proposed by Shao and Gunduz [2022] is shown in Figure 1.9. After joint source channel coding (JSCC) encoding, the source signal is encoded into symbols, which are then normalized and transformed into a power‐normalized real vector. Then, for each block, the signal vector is mapped onto subcarriers. The inverse discrete Fourier transform (IDFT) is applied to each orthogonal frequency‐division multiplexing (OFDM) symbol, and then the cyclic prefix (CP) is appended. After pulse shaping, we have the baseband continuous‐time signal, with which we can construct a passband signal. At the receiver, the received signal is converted to baseband, passed through a matched filter, and sampled. After that, the signal is recovered by OFDM demodulation, in‐phase and quadrature (IQ) remapping, and JSCC decoding.
Figure 1.9 The architecture of an OFDMA‐assisted SC system.
Source: Adapted from Shao and Gunduz [2022].
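The discrete‐time core of the processing chain in Figure 1.9 (IQ mapping, IDFT, CP insertion, and their inverses) can be sketched with NumPy as follows. This is a simplified illustration rather than the transceiver of Shao and Gunduz [2022]: pulse shaping, passband conversion, the clipping operation, and the channel are omitted, the JSCC output is replaced by random numbers, and the IQ mapping convention (first half in‐phase, second half quadrature) is an assumption.

```python
import numpy as np


def ofdm_modulate(symbols, cp_length):
    """Apply the IDFT to each OFDM symbol and prepend its cyclic prefix."""
    time_signal = np.fft.ifft(symbols, axis=-1)
    cyclic_prefix = time_signal[..., -cp_length:]
    return np.concatenate([cyclic_prefix, time_signal], axis=-1)


def ofdm_demodulate(received, cp_length):
    """Drop the cyclic prefix and return to the subcarrier domain."""
    return np.fft.fft(received[..., cp_length:], axis=-1)


rng = np.random.default_rng(0)
num_blocks, num_subcarriers, cp_length = 2, 8, 2

# Stand-in for the JSCC encoder output: a power-normalized real vector.
real_vector = rng.standard_normal((num_blocks, 2 * num_subcarriers))
real_vector /= np.sqrt(np.mean(real_vector ** 2))

# IQ mapping (assumed convention): first half in-phase, second half quadrature.
symbols = (
    real_vector[:, :num_subcarriers] + 1j * real_vector[:, num_subcarriers:]
)

tx = ofdm_modulate(symbols, cp_length)

# Ideal channel: the receiver sees exactly what was sent.
rx_symbols = ofdm_demodulate(tx, cp_length)

# IQ remapping back to the real vector expected by the JSCC decoder.
recovered = np.concatenate([rx_symbols.real, rx_symbols.imag], axis=1)
print(np.allclose(recovered, real_vector))
```

Over an ideal channel the round trip is lossless, which makes it easy to check that the modulation and demodulation steps are true inverses before any channel impairments are added.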
SDMA is capable of utilizing the spatial domain to serve multiple users at the same time–frequency resource. It introduces the concept of precoding, which adjusts the phase, amplitude, and spatial properties of the signal to improve the overall signal quality. Guo and Yang [2022] proposed an SC system that designed the precoding strategy with the help of semantic information. Convolutional neural networks (CNNs) are used to learn the semantic/channel encoder/decoder and modulator/demodulator, while a graph neural network (GNN) is adopted to learn the precoding policy managing the interference. All those CNNs and a GNN are jointly trained to achieve the expected classification accuracy in a system that delivers images to multiple users. The SDMA‐assisted SC system proposed by Guo and Yang [2022] is illustrated in Figure 1.10.
Figure 1.10 The architecture of an SDMA‐assisted SC system.
Source: Adapted from Guo and Yang [2022].
Simulation results provided by Guo and Yang [2022] show that the learned precoding policy requires much less bandwidth than the fixed low‐complexity precoding schemes for achieving the same expected classification accuracy.
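For readers unfamiliar with fixed precoding, the sketch below shows zero‐forcing (ZF) precoding, one familiar low‐complexity scheme. It is offered only as background on what precoding does: it is an assumption that ZF is representative of such schemes, it is not necessarily the baseline used by Guo and Yang [2022], and it is not their learned GNN policy. The channel matrix is random and the dimensions are hypothetical.

```python
import numpy as np


def zero_forcing_precoder(channel):
    """Return a ZF precoder for a (num_users, num_antennas) channel matrix.

    Each column is the beamforming vector of one user, scaled to unit norm.
    """
    hermitian = channel.conj().T
    pseudo_inverse = hermitian @ np.linalg.inv(channel @ hermitian)
    norms = np.linalg.norm(pseudo_inverse, axis=0, keepdims=True)
    return pseudo_inverse / norms


rng = np.random.default_rng(0)
num_users, num_antennas = 3, 4

# Hypothetical Rayleigh-fading channel from 4 antennas to 3 single-antenna users.
channel = (
    rng.standard_normal((num_users, num_antennas))
    + 1j * rng.standard_normal((num_users, num_antennas))
) / np.sqrt(2)

precoder = zero_forcing_precoder(channel)

# Effective channel: entry (k, j) is what user k receives of user j's stream.
effective = channel @ precoder
interference = effective - np.diag(np.diag(effective))
print(np.max(np.abs(interference)))
```

The off‐diagonal entries of the effective channel vanish up to numerical precision, so each user receives only its own stream. A learned policy such as the one described above can instead trade residual interference against the task objective, here classification accuracy.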
A semi‐NOMA‐assisted SC system, designed by Mu and Liu [2022], is shown in Figure 1.11