An important text that offers an in-depth guide to how information theory sets the boundaries for data communication
In an accessible and practical style, Information and Communication Theory explores information theory and includes concrete tools appropriate for real-life communication systems. The text investigates the connection between theoretical and practical applications through a wide variety of topics, including an introduction to the basics of probability theory, information, (lossless) source coding, typical sequences as a central concept, channel coding, continuous random variables, Gaussian channels, discrete input continuous channels, and a brief look at rate distortion theory.
The author explains the fundamental theory together with typical compression algorithms and how they are used in practice. He moves on to review source coding and how much a source can be compressed, and explains algorithms such as the LZ family with applications in, for example, zip and png. In addition to exploring the channel coding theorem, the book includes illustrative examples of codes.
Written for graduate and undergraduate students studying information theory, as well as professional engineers and master's students, Information and Communication Theory offers an introduction to how information theory sets the boundaries for data communication.
Page count: 551
Year of publication: 2019
IEEE Press, 445 Hoes Lane, Piscataway, NJ 08854
IEEE Press Editorial Board
Ekram Hossain, Editor in Chief
Giancarlo Fortino
Andreas Molisch
Linda Shafer
David Alan Grier
Saeid Nahavandi
Mohammad Shahidehpour
Donald Heirman
Ray Perez
Sarah Spurgeon
Xiaoou Li
Jeffrey Reed
Ahmet Murat Tekalp
Copyright © 2019 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN 978-1-119-43378-1
Cover
IEEE Press Editorial Board
Preface
CHAPTER 1 Introduction
CHAPTER 2 Probability Theory
2.1 Probabilities
2.2 Random Variable
2.3 Expectation and Variance
2.4 The Law of Large Numbers
2.5 Jensen’s Inequality
2.6 Random Processes
2.7 Markov Process
Problems
Notes
CHAPTER 3 Information Measures
3.1 Information
3.2 Entropy
3.3 Mutual Information
3.4 Entropy of Sequences
Problems
Note
CHAPTER 4 Optimal Source Coding
4.1 Source Coding
4.2 Kraft Inequality
4.3 Optimal Codeword Length
4.4 Huffman Coding
4.5 Arithmetic Coding
Problems
Notes
CHAPTER 5 Adaptive Source Coding
5.1 The Problem with Unknown Source Statistics
5.2 Adaptive Huffman Coding
5.3 The Lempel–Ziv Algorithms
5.4 Applications of Source Coding
Problems
Notes
CHAPTER 6 Asymptotic Equipartition Property and Channel Capacity
6.1 Asymptotic Equipartition Property
6.2 Source Coding Theorem
6.3 Channel Coding
6.4 Channel Coding Theorem
6.5 Derivation of Channel Capacity for DMC
Problems
Notes
CHAPTER 7 Channel Coding
7.1 Error-Correcting Block Codes
7.2 Convolutional Code
7.3 Error-Detecting Codes
Problems
Notes
CHAPTER 8 Information Measures for Continuous Variables
8.1 Differential Entropy and Mutual Information
8.2 Gaussian Distribution
Problems
Notes
CHAPTER 9 Gaussian Channel
9.1 Gaussian Channel
9.2 Parallel Gaussian Channels
9.3 Fundamental Shannon Limit
Problems
Notes
CHAPTER 10 Discrete Input Gaussian Channel
10.1 M-PAM Signaling
10.2 A Note on Dimensionality
10.3 Shaping Gain
10.4 SNR Gap
Problems
Bibliography
CHAPTER 11 Information Theory and Distortion
11.1 Rate-Distortion Function
11.2 Limit for Fixed P_b
11.3 Quantization
11.4 Transform Coding
Problems
Notes
Bibliography
Appendix A Probability Distributions
A.1 Discrete Distributions
A.2 Continuous Distributions
Appendix B Sampling Theorem
B.1 The Sampling Theorem
Bibliography
Index
IEEE Press Series On Digital and Mobile Communication
End User License Agreement
Information theory started as a topic in 1948 when Claude E. Shannon published the paper “A Mathematical Theory of Communication.” As the name reveals, it is a theory about what communication and information are in a mathematical sense. Shannon first built a theory to quantify and measure the information of a source. Then, by viewing communication as the reproduction of information, a theory of communication followed. Without information, or choices, there cannot be any communication, and the two parts go hand in hand: the information measure is based on the amount of data needed for reconstruction, and communication is based on the amount of data that must be transmitted to transfer a certain amount of information. In a mathematical sense, choices mean probabilities, and the theory is therefore built on probability theory.
Humans have always strived to simplify the process of communicating and spreading knowledge. When Gutenberg got his printing press into operation in the 1450s, it simplified the spreading of the written word. Since then, we have seen many technologies that spread information, from the telegraph, via the telephone and television, to the Internet as we know it today. What will come next we can only speculate about, but all existing and forthcoming technologies must comply with the theory that Shannon stated. Information theory is therefore the foundation for everyone working with communication systems, past, present, and future.
This book is intended for a first course in information theory for communication engineering students, typically at the upper undergraduate or lower graduate level. Since the theory is based on mathematics and probability theory, a certain level of maturity in these subjects is expected. Students should have a couple of mathematics courses under their belt, such as basic calculus and probability theory. It is also recommended, but not required, that they have some understanding of digital communication as a concept. Together, this gives a solid ground for understanding the ideas of information theory. The requirement of mathematical maturity stems from the fact that information theory sets up a mathematical model of information in very general terms, in the sense that it is valid for all kinds of communication. The theory can therefore appear rather abstract to students the first time they engage with it, and some prior understanding of communication at the physical layer may help.
This book, as well as its content, has grown out of lecturing a course on information and communication. When the course started, I used one of the most recommended books in the field, but it lacked the engineering part of the course, such as the intuitive understanding and a description of what the theory means in practice. The theory sets up bounds on how much a source can be compressed and how much data can be transmitted over a noisy channel; this is the pure information theory. Understanding what this means when designing a communication system is one of the most important results of the theory.
Information theory has grown since its origin in 1948, particularly in the areas of compressing a source and achieving a high data rate on the communication link. I therefore wanted to include the basics of data compression and error-correcting codes in the course that I lectured. Since I did not find a book covering these topics, I started writing about data compression and the Lempel–Ziv algorithms as complementary material for the students. Since not all students have up-to-date knowledge of probability theory, I also wrote a couple of pages about that. With this at hand, and the slides I used during the lectures, I started working on material with a somewhat wider scope. After some time working with the material as lecture notes, I realized that I had almost everything needed for a course. I then decided to use it as the main course literature, and the text has continued to develop over the years. I now feel that the text is mature enough to take the next step and is worth publishing to a broader audience.
The work on this book has taken about 10 years, and over the years I have had a lot of help and many discussions that taught me the subject. Especially, I would like to thank all the students who have passed the course and patiently endured my attempts to find intuitive explanations for the theory; they have not always been the best at the first attempt. It is interesting that, even after working with the material for a long time, I still get new inputs and questions from the students that challenge my understanding and help improve the description. Second, I would like to thank all my colleagues and co-workers who have enlightened me through discussions and explanations over the years, to name a few: Rolf Johannesson, Viktor Zyablov, Michael Lentmaier, John B. Andersson, Fredrik Rusek, Per-Erik Eriksson, Miguel Berg, Boris Dortchy, and Per Ola Börjesson. A special thanks goes to John B. Andersson; without his encouragement this book would never have been completed. I am also grateful to the reviewers for their many relevant comments on the manuscript. I would also like to thank the teaching assistants associated with the course over the years, who have detected and corrected many mistakes: Adnan Prlja, Eduardo Medeiros, Yezi Huang, and Umar Farooq.
Finally, I want to thank my wife and children, Camilla, Karla, and Filip, for all their support and understanding throughout the years of working on the manuscript.
STEFAN HÖST
Lund, Sweden
At certain points in scientific history, new revolutionary ideas serve as starting points of new topics. Information theory started in 1948 when Claude Shannon’s paper “A Mathematical Theory of Communication” was published in the Bell System Technical Journal [1]. The ideas in the article were in many ways pioneering, but the first thoughts in this direction had started more than 20 years earlier.
In 1924 and 1928, Nyquist [2, 3] showed that a signal with a bandwidth of W Hz and a duration of T seconds cannot contain more than 2WT distinguishable pulses. This was later reformulated into what is today known as the sampling theorem, an important concept for all systems converting between discrete and analog representations of signals, widely used in signal processing and communication theory. In 1928, Hartley was the first to propose a measure of information, in his paper “Transmission of Information” [4]. He also realized that the limiting factor in communication is the noise; without noise there would not be any problem. Then, 20 years later, Claude Shannon further developed these concepts and concluded that a measure of information must be based on probability theory.
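For reference, the standard formulation of these results, stated here in common notation rather than necessarily the exact form used later in Appendix B, is the following. A signal x(t) band-limited to W Hz is completely determined by samples taken 1/(2W) seconds apart,

x(t) = \sum_{n=-\infty}^{\infty} x\!\left(\frac{n}{2W}\right) \operatorname{sinc}(2Wt - n),

so over a duration of T seconds the signal has at most 2WT independent values, or degrees of freedom.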
The idea is that without choices there is no information. If a variable can have only one outcome, it does not give any information to an observer. If there are multiple outcomes, which one occurs is governed by their probabilities, and the information measure is therefore derived from the distribution according to which the symbols are generated. That means the developed information measures are purely probabilistic functions. At first glance, it might be surprising that the information measure, on which communication systems still rely, does not have anything to do with the content of a sequence. The measure can instead be interpreted as the amount of information required to determine, or reproduce, the sequence. In that perspective, a completely random sequence also contains a lot of information, even though it has no meaning to us. The strength of this view is that the theory is valid for all types of information sources and thus all types of communication.
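As a small illustration of this probabilistic view, using the standard definitions that are formalized in Chapter 3, the information in an outcome x with probability p(x) and the entropy of a variable X are

I(x) = \log_2 \frac{1}{p(x)} \quad \text{[bits]}, \qquad H(X) = \sum_x p(x) \log_2 \frac{1}{p(x)} \quad \text{[bits]}.

A variable with a single certain outcome has p(x) = 1 and hence I(x) = 0: no choice, no information. A fair coin flip has H(X) = \tfrac{1}{2}\log_2 2 + \tfrac{1}{2}\log_2 2 = 1 bit, regardless of what the two outcomes mean.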
Information theory deals with two closely related terms: information and communication. As stated above, the information measure depends on the amount of information needed to reproduce the source sequence. If the aim is to transport this sequence, or message, from a source to a destination, then only this amount of information needs to be transported. In his paper, Shannon expressed that the “fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” In this rather general statement, a point can refer to both place and time, meaning that a message selected at one place and one time should be reproduced at another place and another time.
In Figure 1.1, an example of communication is shown. Someone records a video with a computer and uploads it to a server, e.g., YouTube. Then, at a later time, someone else downloads the video and displays it on a screen. In this case, the “another point” refers to the time and place where the video is displayed. The uploaded video is typically compressed using a video compression format, e.g., H.264 or MPEG. Videos are typically compressed quite hard, using lossy compression. When the video is downloaded and decompressed, it differs quite a lot from the raw video recorded at the source, but to the human eye and perception it should look approximately the same. If the transmitted message is instead a text or a computer program, the same principle applies: the file is saved on a server and downloaded at a later time. But in the case of a program, the reconstruction needs to be an exact copy of the uploaded file. This illustrates the difference between exactly and approximately in the statement quoted above.
Figure 1.1 A communication system.
In the world of information theory and communication theory, Figure 1.1, and most other communication situations, is better represented by Figure 1.2. There the source is the recording of the (raw) video, which gives the sequence X. To save disk space as well as uploading and downloading time, the source sequence X is compressed by a source encoder. The aim of the source encoder is to extract the pure information; in the case of lossy compression, a representation with less information gives approximately the same experience as the raw video. Since pure information is very sensitive to errors in transmission and storage, it is protected by a channel code. The purpose of this code is to introduce redundancy such that the channel decoder can correct errors. In the case of the video uploaded to a data center, encoding and decoding occur on several occasions. First, when the file is uploaded, there is encoding and decoding in the transmission protocols. At the data center, error-correcting codes often protect against hard drive failure. When the file is downloaded, the transmission protocols again use error-correcting codes. Finally, at the destination, the compressed sequence is decompressed and sent to the screen.
Figure 1.2 Shannon’s model for a communication system.
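To make the chain in Figure 1.2 concrete, the following minimal Python sketch (an illustration only, not code from the book) strings together a lossless source encoder, a simple rate-1/3 repetition channel code, and a binary symmetric channel. Here zlib stands in for the Lempel–Ziv-style compression discussed in Chapter 5, and majority voting acts as the channel decoder.

import random
import zlib

# Source coding: remove redundancy (zlib uses an LZ77-based scheme, cf. Chapter 5)
def source_encode(data: bytes) -> bytes:
    return zlib.compress(data)

def source_decode(data: bytes) -> bytes:
    return zlib.decompress(data)

# Helpers to move between bytes and bit lists (least significant bit first)
def to_bits(data: bytes) -> list:
    return [(byte >> i) & 1 for byte in data for i in range(8)]

def from_bits(bits: list) -> bytes:
    return bytes(sum(bits[k + i] << i for i in range(8)) for k in range(0, len(bits), 8))

# Channel coding: add controlled redundancy (rate-1/3 repetition code)
def channel_encode(bits: list, n: int = 3) -> list:
    return [b for b in bits for _ in range(n)]

def channel_decode(bits: list, n: int = 3) -> list:
    # Majority vote over each group of n received bits
    return [int(sum(bits[k:k + n]) > n // 2) for k in range(0, len(bits), n)]

# Binary symmetric channel: each bit is flipped with probability p
def bsc(bits: list, p: float) -> list:
    return [b ^ (random.random() < p) for b in bits]

message = b"information theory sets the limits for data communication " * 20
x = source_encode(message)                 # source encoder
c = channel_encode(to_bits(x))             # channel encoder
r = bsc(c, p=0.01)                         # noisy channel
try:
    y = source_decode(from_bits(channel_decode(r)))
    print("recovered exactly:", y == message)
except zlib.error:
    print("residual channel errors broke the compressed stream")

With a small crossover probability, the majority vote removes most channel errors and the original message is recovered; if too many errors remain, the decompression step fails, which illustrates how sensitive the pure (compressed) information is to transmission errors.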
The aim of this text is to give an introduction to information theory and communication theory. It should give an understanding of the information measures and the bounds they impose, e.g., bounds on source coding and channel coding. It also describes source coding and channel coding based on some well-known algorithms. Chapter 2 gives a short review of probability theory, where most of the basic theory needed in the text is presented. The intention is to provide a quick and handy reference for the theory of probabilities. Even though there are attempts to explain the theory intuitively, most of the mathematical proofs are omitted. For a more thorough treatment of the subject, the reader is referred to the literature, e.g., [5, 6]. The chapter is complemented by an appendix where some of the most common distributions, both discrete and continuous, are listed.
In Chapter 3, the information measures used in the theory are defined, together with their relations and how they can be interpreted. In Chapter 4, the measures are used to set up bounds on source coding. For the case of vectors of independent and identically distributed (i.i.d.) variables, a simple, yet optimal, code construction due to Huffman is given. Even though Huffman’s code construction gives optimal codes for the i.i.d. case, in reality the sources to be compressed are seldom independent and their distributions are seldom known. In Chapter 5, adaptive compression methods are described, including the widely used Lempel–Ziv algorithms, which appear in, for example, zip, png, and other well-known standards.
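As an illustration of the kind of construction Chapter 4 covers, the following short Python sketch builds a Huffman code from symbol frequencies. It is a generic textbook implementation, not code from the book, and the helper name huffman_code is ours.

import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to a binary (prefix-free) codeword."""
    freq = Counter(symbols)
    # Heap entries: (weight, tie-breaker, partial code table for that subtree)
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate case: only one symbol
        return {s: "0" for _, _, t in heap for s in t}
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)     # the two least probable subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

text = "this is an example of a huffman tree"
code = huffman_code(text)
encoded = "".join(code[s] for s in text)
print(code)
print("average codeword length:", len(encoded) / len(text), "bits/symbol")

Symbols that occur often get short codewords, and the average codeword length approaches the entropy of the frequency distribution, in line with the bounds derived in Chapter 4.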
In Chapter 6, the asymptotic equipartition property is presented, which is a consequence of the law of large numbers. It makes it possible to prove two of the most important theorems in information theory, the source coding theorem and the channel coding theorem. In this chapter, the channel capacity is also introduced as the maximum amount of information that can be carried in a transmission over a noisy channel. In Chapter 7, the concepts of error-correcting codes are discussed, introducing both block codes and convolutional codes.
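In standard notation (a summary, not the book’s derivation), the capacity of a discrete memoryless channel introduced in Chapter 6 is the maximum mutual information over all input distributions,

C = \max_{p(x)} I(X;Y) \quad \text{[bits per channel use]}.

For example, for the binary symmetric channel with crossover probability \varepsilon this evaluates to C = 1 - h(\varepsilon), where h(\cdot) is the binary entropy function; at \varepsilon = 0.11 the capacity is about 0.5 bits per channel use.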
In Chapter 8, the information measures defined in Chapter 3 for discrete variables are extended to the continuous case, and the relation between the interpretations of the measures in the discrete and continuous cases is discussed. Depending on the logical level of the channel model, the channel can be discrete or continuous. Discrete channels were treated in Chapter 6, whereas Chapter 9 discusses the continuous counterpart, with a special focus on the case of Gaussian noise. It turns out that to reach the channel capacity in this case, the transmitted variable should be Gaussian distributed, whereas in reality it is often both discrete and uniformly distributed. This case is treated in Chapter 10, where a closer connection is made to practically used modulation schemes such as pulse amplitude modulation and quadrature amplitude modulation.
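In its standard form (again a summary in common notation), the capacity of the band-limited additive white Gaussian noise channel treated in Chapter 9 is

C = W \log_2\!\left(1 + \frac{P}{N_0 W}\right) \quad \text{[bits per second]},

or, per real channel use, C = \tfrac{1}{2}\log_2(1 + \mathrm{SNR}). As a quick example, a bandwidth of W = 1 MHz and a signal-to-noise ratio of 15 (about 11.8 dB) give C = 10^6 \cdot \log_2 16 = 4 Mbit/s.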
In Chapter 11, the concept of distortion is introduced. In reality, a certain amount of distortion is often acceptable; e.g., in image and video coding the human eye will tolerate some distortion, and in a communication scenario a few errors in the transmission might be tolerated. This is incorporated into the theory through the rate-distortion function. The chapter is concluded with a description of lossy compression and an introduction to transform coding, used in, e.g., JPEG.
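In standard notation, the rate-distortion function that underlies Chapter 11 is

R(D) = \min_{p(\hat{x}|x)\,:\ \mathrm{E}[d(X,\hat{X})] \le D} I(X;\hat{X}),

the smallest rate at which the source can be described if an average distortion of at most D is accepted. For a Gaussian source with variance \sigma^2 and squared-error distortion, the well-known result is R(D) = \tfrac{1}{2}\log_2(\sigma^2/D) for 0 < D \le \sigma^2.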
Over the years, several books have been written on the subject of information theory, e.g., [11, 13, 67, 71, 80, 81, 84, 86, 92]. The aim of this book is to present the theory for both discrete and continuous variables. It also aims at applying the theory to communication, which is especially evident in Chapter 10, where discrete input Gaussian channels are considered.
