Distributed source coding is one of the key enablers of efficient cooperative communication. The potential applications range from wireless sensor networks, ad-hoc networks, and surveillance networks to robust low-complexity video coding, stereo/multiview video coding, HDTV, hyperspectral and multispectral imaging, and biometrics.
The book is divided into three parts: theory, algorithms, and applications. Part one covers the background of information theory with an emphasis on DSC; part two discusses algorithmic solutions to the three most important DSC problems, namely Slepian–Wolf, Wyner–Ziv, and MT source coding; and part three is dedicated to a variety of potential DSC applications.
The book provides fundamental knowledge for engineers and computer scientists to access the topic of distributed source coding. It is also suitable for senior undergraduate and first-year graduate students in electrical engineering, computer engineering, signal processing, image/video processing, and information theory and communications.
Cover
Title Page
Copyright
Preface
Acknowledgment
About the Companion Website
Chapter 1: Introduction
1.1 What is Distributed Source Coding?
1.2 Historical Overview and Background
1.3 Potential and Applications
1.4 Outline
Part I: Theory of Distributed Source Coding
Chapter 2: Lossless Compression of Correlated Sources
2.1 Slepian–Wolf Coding
2.2 Asymmetric and Symmetric SW Coding
2.3 SW Coding of Multiple Sources
Chapter 3: Wyner–Ziv Coding Theory
3.1 Forward Proof of WZ Coding
3.2 Converse Proof of WZ Coding
3.3 Examples
3.4 Rate Loss of the WZ Problem
Chapter 4: Lossy Distributed Source Coding
4.1 Berger–Tung Inner Bound
4.2 Indirect Multiterminal Source Coding
4.3 Direct Multiterminal Source Coding
Part II: Implementation
Chapter 5: Slepian–Wolf Code Designs Based on Channel Coding
5.1 Asymmetric SW Coding
5.2 Non-asymmetric SW Coding
5.3 Adaptive Slepian–Wolf Coding
5.4 Latest Developments and Trends
Chapter 6: Distributed Arithmetic Coding
6.1 Arithmetic Coding
6.2 Distributed Arithmetic Coding
6.3 Definition of the DAC Spectrum
6.4 Formulation of the Initial DAC Spectrum
6.5 Explicit Form of the Initial DAC Spectrum
6.6 Evolution of the DAC Spectrum
6.7 Numerical Calculation of the DAC Spectrum
6.8 Analyses on DAC Codes with Spectrum
6.9 Improved Binary DAC Codec
6.10 Implementation of the Improved BDAC Codec
6.11 Experimental Results
6.12 Conclusion
Chapter 7: Wyner–Ziv Code Design
7.1 Vector Quantization
7.2 Lattice Theory
7.3 Nested Lattice Quantization
7.4 WZ Coding Based on TCQ and LDPC Codes
Part III: Applications
Chapter 8: Wyner–Ziv Video Coding
8.1 Basic Principle
8.2 Benefits of WZ Video Coding
8.3 Key Components of WZ Video Decoding
8.4 Other Notable Features of Miscellaneous WZ Video Coders
Chapter 9: Correlation Estimation in DVC
9.1 Background to Correlation Parameter Estimation in DVC
9.2 Recap of Belief Propagation and Particle Filter Algorithms
9.3 Correlation Estimation in DVC with Particle Filtering
9.4 Low Complexity Correlation Estimation using Expectation Propagation
Chapter 10: DSC for Solar Image Compression
10.1 Background
10.2 Related Work
10.3 Distributed Multi-view Image Coding
10.4 Adaptive Joint Bit-plane WZ Decoding of Multi-view Images with Disparity Estimation
10.5 Results and Discussion
10.6 Summary
Chapter 11: Secure Distributed Image Coding
11.1 Background
11.2 System Architecture
11.3 Practical Implementation Issues
11.4 Experimental Results
11.5 Discussion
Chapter 12: Secure Biometric Authentication Using DSC
12.1 Background
12.2 Related Work
12.3 System Architecture
12.4 SeDSC Encrypter Design
12.5 SeDSC Decrypter Design
12.6 Experiments
12.7 Discussion
Appendix A: Basic Information Theory
A.1 Information Measures
A.2 Independence and Mutual Information
A.3 Venn Diagram Interpretation
A.4 Convexity and Jensen's Inequality
A.5 Differential Entropy
A.6 Typicality
A.7 Packing Lemmas and Covering Lemmas
A.8 Shannon's Source Coding Theorem
A.9 Lossy Source Coding—Rate-distortion Theorem
Appendix B: Background on Channel Coding
B.1 Linear Block Codes
B.2 Convolutional Codes
B.3 Shannon's Channel Coding Theorem
B.4 Low-density Parity-check Codes
Appendix C: Approximate Inference
C.1 Stochastic Approximation
C.2 Deterministic Approximation
Appendix D: Multivariate Gaussian Distribution
D.1 Introduction
D.2 Probability Density Function
D.3 Marginalization
D.4 Conditioning
D.5 Product of Gaussian pdfs
D.6 Division of Gaussian pdfs
D.7 Mixture of Gaussians
D.8 Summary
Appendix: Matrix Equations
Bibliography
Index
End User License Agreement
Chapter 1: Introduction
Figure 1.1 DSC concept with two separate encoders that do not talk to each other and one joint decoder. X and Y are discrete, correlated sources; R_X and R_Y are compression rates.
Chapter 2: Lossless Compression of Correlated Sources
Figure 2.1 Classical SW setup with two separately encoded sources and a joint decoder.
Figure 2.2 Four bins of a network of temperature sensors.
Figure 2.3 SW rate region.
Figure 2.4 SW network with multiple sources.
Chapter 3: Wyner–Ziv Coding Theory
Figure 3.1 Setup of the WZ problem.
Figure 3.2 Time sharing of two coder pairs results in a couple below the linear interpolation of the two end couples (shown as * in the figure). will be smaller than the time-sharing rate since is minimized over all possible distributions.
Figure 3.3 Rather than reconstructing immediately, we could first construct an intermediate auxiliary output at the decoder. If enough information is passed through , we can ensure that , , and always have the same statistics as sequences sampled from and consequently we will have as desired.
Figure 3.4 An auxiliary random variable is generated by passing input through a binary symmetric channel with crossover probability .
Figure 3.5 WZ limit of doubly binary symmetric source. is shown as the dashed-dotted curve, and the dark solid curve, , is the lowest convex hull of and . In particular, on curve is obtained through time sharing (linear combination) between on curve and (i.e., when no rate is transmitted). It can be shown that is the actual WZ limit. In contrast, the light solid curve is the theoretical limit when side information is available at both encoder and decoder. It is clear that there is rate loss in this case.
Figure 3.6 For joint Gaussian zero-mean sources and , we can express as , where is some constant and is a zero-mean Gaussian random variable independent of . To construct the auxiliary random variable used in the forward proof of WZ coding, we write as , where is another zero-mean Gaussian random variable independent of . It turns out that such construction is optimum, as shown in this subsection.
Chapter 4: Lossy Distributed Source Coding
Figure 4.1 The Berger–Tung scheme.
Figure 4.2 The CEO problem (indirect multiterminal source coding problem) with two managers (encoders).
Figure 4.3 . Note that the region appears to overlap with that of (or equivalently ). This strongly suggests that we should consider the case and separately.
Figure 4.4 Corner points of the Berger–Tung region of a fixed and fall right on the boundary of and .
Figure 4.5 Corner point of the boundary is achieved by picking and . Other boundary points can be achieved by picking such that .
Figure 4.6 . Corner points of the boundary are achieved by picking satisfying (4.16) and (4.17) (black dashed line boundary in the middle). Other boundary points can be achieved by picking some such that (darker gray dashed line boundary on the left) and some other such that (lighter gray dashed line boundary on the right).
Figure 4.7 The sum rate bounds for the Gaussian multiterminal source coding problems (, , ).
Chapter 5: Slepian–Wolf Code Designs Based on Channel Coding
Figure 5.1 The block diagram of SW coding. Side information can be treated as the source passing through a hypothetical channel.
Figure 5.2 Factor graphs for encoding and decoding using LDPC-based SW coding.
Figure 5.3 Non-asymmetric SW for two length-12 binary sources (a) and (b) using an IRA code, where the compressed bits, and , are circled.
Figure 5.4 Factor graph for non-asymmetric SW decoding.
Figure 5.5 Factor graph for adaptive SW decoding. Note that the filled variable nodes (circles) are known but the blank variable nodes are unknown and need to be estimated.
Figure 5.6 An example of systematic resampling with five samples.
Chapter 6: Distributed Arithmetic Coding
Figure 6.1 Representing a symbol using an interval. is partitioned into regions with lengths proportional to the probabilities of the symbols. Any interval (which in turn can be represented by a number) that lies within the region corresponding to a symbol can be used to generate the codeword of the symbol.
Figure 6.2 An example of encoding (left) and decoding (right) of AC. A sequence of symbols is compressed into a bitstream .
Figure 6.3 An example of encoding (left) and decoding (right) of distributed AC. A sequence of symbols is compressed into the bitstream . The symbols that cannot be determined immediately during the decoding process are shown in bold. Only the correct decoding “route” is shown.
Figure 6.4 Scores (log-probabilities) computed during distributed arithmetic decoding. The higher scores correspond to higher probabilities. A node is split when the symbol cannot be determined unambiguously.
Figure 6.5 An example of DAC decoding tree, where .
Figure 6.6 Illustrations of the relations between , , and . Both and have the same shape as . is obtained by scaling by times along the -axis. is obtained by shifting right by .
Figure 6.7 Illustration of DAC spectrum evolution. At stage- nodes, if , a 0-branch will be created and then , will be mapped onto , at stage- nodes; if , a 1-branch will be created and then , will be mapped onto , at stage- nodes. Thus, will be the normalized sum of and .
Figure 6.8 Examples of numerical approximates to . (a) Shape change of with respect to when = . (b) Three examples for . (c) Three examples for . (d) Three examples for .
Figure 6.9 Theoretical and experimental results of the expansion factor for , 0.7, and 0.8.
Figure 6.10 C++ implementation of the PBDAC encoder for equiprobable binary sources.
Figure 6.11 C++ implementation of the PBDAC+WB decoder for equiprobable binary sources.
Figure 6.12 C++ implementation of the branch function.
Figure 6.13 The -axis is the residual FER and the -axis is the crossover probability between source and side information, varying from 0.05 to 0.09 with increment 0.01. Each data point is the average over 1000 trials. (a) Effect of on the permutation technique. Other codec parameters are fixed as . (b) Effect of on the WB technique. Other codec parameters are fixed as . (c) Comparison of four DAC codecs with two LDPC codecs. The DAC codec parameters are fixed as . (d) Application of PBDAC to nonuniform sources. The codec parameters are fixed as .
Chapter 7: Wyner–Ziv Code Design
Figure 7.1 Systematic framework of WZ codec.
Figure 7.2 Simulated pairs out of a memoryless Gaussian source with zero mean and unit variance. One can see that the distribution of a pair of quantization indices will be very skewed. For example, a pair will be unlikely to quantize to the index pair corresponding to the quantization point .
Figure 7.3 Two lattices in . (a) Square 2D lattice , where and . (b) Hexagonal 2D lattice, where and . In this case, for all . An alternative basis is , where .
Figure 7.4 2D space division with the hexagonal 2D lattice, where and . Each point is an element of the hexagonal 2D lattice. Each hexagon around a point is a Voronoi cell and the shaded hexagon is the basic Voronoi cell .
Figure 7.5 Examples of packing radius , covering radius , and effective radius for hexagonal 2D lattice.
Figure 7.6 Example of 2D nested lattices based on a hexagonal 2D lattice.
Figure 7.7 Example of 1D nested lattice pair to illustrate quantization loss and binning loss.
Figure 7.8 Uniform codebook and partition for 2-bit TCQ. There are four sub-codebooks, each with two codewords. The source symbols to be quantized are phases. Due to the periodicity of phases, that is, , , we let and , thus , , and . It is easy to verify , , so , , , and are replaced by , , , and , respectively.
Figure 7.9 Diagram of a recursive, systematic convolutional encoder with memory 2 and rate 1/2, where each rectangle containing the letter D represents a delay component. For each input bit , two bits are output: .
Figure 7.10 Trellis generated from Figure 7.9, including four states. Starting from each state, there are two branches, each labeled by , where is the input bit, while are the output bits depending on the input bit and the preceding state. Solid arrow lines are for 0-branches and dashed arrow lines for 1-branches. Branches with output bits “00”, “01”, “10”, and “11” will use sub-codebooks , , , and , respectively.
Figure 7.11 Searching for the best path (i.e., thick black arrow lines) with the smallest total accumulated metric by the Viterbi algorithm. At each stage, for each state, only the incoming branch with smaller partial accumulated metric (black lines) is retained, while the other (gray arrow lines) is cut off. Each branch along the optimal path is labeled by its metric (Table 7.2). Solid arrow lines are for 0-branches and dashed arrow lines for 1-branches.
Figure 7.12 Block diagram of the SW coding–TCQ scheme.
Figure 7.13 bit-planes of TCQ index vector .
Figure 7.14 Partition of the real line into mini-cells.
Figure 7.15 An example of generated from a 2-bit TCQ with quantization stepsize 0.575 for . Dashed lines mark the centroids used in the quantizer decoder.
Figure 7.16 Multilevel SW coding.
Chapter 8: Wyner–Ziv Video Coding
Figure 8.1 A simplified block diagram of a typical conventional video coding scheme.
Figure 8.2 A typical WZ video coding scheme.
Figure 8.3 Comparison of different motion compensation approaches of the Carphone sequence. Upper left: target image; upper right: encoder motion search; lower left: extrapolation (using motion vectors from previously decoded frames); lower right: bidirectional motion compensation.
Figure 8.4 Prediction error of the Carphone sequence (left) and the Foreman sequence (right). The dotted lines are the corresponding fitted models with Laplace distributions.
Chapter 9: Correlation Estimation in DVC
Figure 9.1 Particle filtering algorithm.
Figure 9.2 The workflow of a WZ decoder with OTF correlation estimation.
Figure 9.3 The WZ decoder with OTF correlation estimation.
Figure 9.4 Residual histogram (DC band of Foreman at 15 Hz) in comparison with Laplacian distribution ().
Figure 9.5 PSNR comparison of the PBP joint bit-plane DVC for the QCIF Soccer sequence, compressed at 15 fps.
Figure 9.6 PSNR comparison of the PBP joint bit-plane DVC for the QCIF Coastguard sequence, compressed at 15 fps.
Figure 9.7 PSNR comparison of the PBP joint bit-plane DVC for the QCIF Foreman sequence, compressed at 15 fps.
Figure 9.8 PSNR comparison of the PBP joint bit-plane DVC for the QCIF Hall Monitor sequence, compressed at 15 fps.
Figure 9.9 PSNR comparison of the PBP joint bit-plane DVC for the QCIF Salesman sequence, compressed at 15 fps.
Figure 9.10 Frame-by-frame PSNR variance for Soccer sequence with quantization matrix .
Figure 9.11 Estimation accuracy of the OTF estimation method for the AC band of Soccer sequence.
Figure 9.12 Factor graph of joint bit-plane SW decoding with correlation estimation.
Figure 9.13 PSNR comparison of the EP-based OTF and pre-estimation DVC for the QCIF Carphone sequence, compressed at 15 fps.
Figure 9.14 PSNR comparison of the EP-based OTF and pre-estimation DVC for the QCIF Foreman sequence, compressed at 15 fps.
Figure 9.15 PSNR comparison of the EP-based OTF and pre-estimation DVC for the QCIF Soccer sequence, compressed at 15 fps.
Figure 9.16 Subframe-by-subframe rate variance for the Soccer sequence with quantization bits equal to 3.
Figure 9.17 Estimation accuracy of the EP-based OTF DVC for the correlation parameter of the Soccer sequence.
Chapter 10: DSC for Solar Image Compression
Figure 10.1 Lossy DMIC setup with disparity and correlation estimation.
Figure 10.2 Factor graph of a joint bit-plane decoder with disparity and correlation estimation.
Figure 10.3 Empirical residual histogram for solar images in SET 1 in comparison of Laplace distribution ().
Figure 10.4 Rate-distortion performance of the proposed adaptive DMIC scheme for solar images in SET 1.
Figure 10.5 Rate-distortion performance of the proposed adaptive DMIC scheme for solar images in SET 2.
Chapter 11: Secure Distributed Image Coding
Figure 11.1 The workflow of the proposed framework.
Figure 11.2 Illustration of DSC-based compression of an encrypted CT image slice. The original slice (a) is encrypted into the slice shown in (b) using a stream cipher. The encrypted slice is then compressed using the DSC method into that shown in (c) with much smaller size.
Figure 11.3 Factor graph for decompression of compressed encrypted data.
Figure 11.4 Residual histogram for slice sequences of a CT image, where the Laplace distribution with is shown as reference.
Figure 11.5 The first and the last slices in CT image set 1 ((a) and (b), respectively) and set 2 ((c) and (d), respectively).
Figure 11.6 Examples of (a) partitioned CT image slice, (b) encryption key, (c) encrypted CT image, and (d) code rates of each partition using the SUPERMICRO framework, where the grids in (a), (b), and (c) partition the whole slices into sub-slices and the average code rate in (d) is for the given slice.
Figure 11.7 Code rate versus the different number of encoded bits for both CT image set 1 (i.e., black dash-circle line) and set 2 (i.e., gray dash-dot line) using the SUPERMICRO system.
Figure 11.8 Code rate versus different CT image slices for image set 1, which compared three different setups: the SUPERMICRO on encrypted slices, JPEG 2000 lossless compression on both original slices, and encrypted slices.
Figure 11.9 Code rate versus different CT image slices for image set 2, which compared three different setups: the SUPERMICRO on encrypted slices, JPEG 2000 lossless compression on both original slices, and encrypted slices.
Figure 11.10 Code rate versus different GOS sizes of 6, 12, 25, 50, and 100 for image sets 1 and 2.
Chapter 12: Secure Biometric Authentication Using DSC
Figure 12.1 Example of several possible attack points of a fingerprint-based biometric system, as identified by Ratha et al. [132–134].
Figure 12.2 The workflow of the PROMISE framework. Four subsequent steps in PROMISE are: (1) alignment and feature extraction, (2) feature pre-encryption, (3) SeDSC encrypter/decrypter, and (4) privacy-preserving authentication.
Figure 12.3 Non-asymmetric SW for binary features and using an IRA code, where the encrypted (compressed) bits and are circled.
Figure 12.4 The mutual information of original feature and pre-encrypted feature (i.e., ), original feature and pre-encrypted feature (i.e., ), original feature pair (i.e., ), pre-encrypted feature pair (i.e., ), cipher key of feature and combined cipher key (i.e., ), and cipher key of feature and combined cipher key (i.e., ) in a box plot based on 100 samples.
Figure 12.5 Factor graph for non-asymmetric SW decoding.
Figure 12.6 Correlations in terms of crossover probability between enrollment () and verification () features versus different feature length: (a) intra-user correlation and (b) inter-user correlation.
Figure 12.7 ROC curves of non-asymmetric SeDSC and asymmetric SeDSC setups with the top 100 most discriminable features.
Figure 12.8 Authentication performances (i.e., GAR) of both non-asymmetric and asymmetric SW-based SeDSCs with different feature length and code rate, where GAR points with nonzero FARs have been specifically indicated.
Figure 12.9 Intra-user and inter-user correlation before non-asymmetric SW encryption (a) and after non-asymmetric SW encryption (b).
Figure 12.10 Comparison of complexities between non-asymmetric and asymmetric SW-based SeDSCs in terms of average execution time per match in milliseconds (ms).
Appendix A: Basic Information Theory
Figure A.1 The entropy of a binary source.
Figure A.2 Relationship between entropies and mutual information.
Figure A.3 Venn diagram with relationships of information measures among three variables.
Appendix B: Background on Channel Coding
Figure B.1 Packing of a channel code. Dark circles and light circles represent codeword and noncodeword vectors, respectively. Each cell includes code vectors that can be corrected to the codeword in the cell.
Figure B.2 Different representations of a convolutional code.
Figure B.3 Encoding of a convolutional code. The input is and the encoded message is .
Figure B.4 The main idea of Viterbi decoding. The thick black line is the shortest path, which passes through an intermediate state . The sub-path from state to the starting node also needs to be a shortest path. Therefore, only the shortest path to the current state needs to be stored and the rest can be safely discarded.
Figure B.5 The Tanner graph of the (7,4) Hamming code described in (B.2).
Figure B.6 The factor graph of the inference example and the direction of messages passed by the BP algorithm.
Figure B.7 LDPC decoding. The decoder estimates initial probabilities for each bit.
Figure B.8 Tanner graph for an IRA code.
Figure B.9 IRA decoding. Exactly the same BP decoding can be applied. Parity bits are treated the same as systematic bits.
Appendix D: Multivariate Gaussian Distribution
Figure D.1 The conditional pdf if and , where is independent of and is independent of .
Figure D.2 The pdf of a mixture of Gaussians.
Figure D.3 Likelihood functions.
Figure D.4 Approximate by discarding the smallest weight components and merging similar components. The latter approximation does so well that and essentially overlap each other.
Chapter 2: Lossless Compression of Correlated Sources
Table 2.1 Joint probability distribution.
Chapter 7: Wyner–Ziv Code Design
Table 7.1 Some of the best lattices
Table 7.2 Nearest codewords and corresponding metrics
Chapter 9: Correlation Estimation in DVC
Table 9.1 Message passing algorithm jointly updating inference on source and correlation variance.
Table 9.2 Average resulting Bjontegaard delta PSNR and corresponding bitrate for sequences Foreman, Soccer, Coastguard, Hall, and Salesman
Table 9.3 Execution time (full sequence in seconds) of joint bit-plane online, joint bit-plane PBP codec vs. DISCOVER codec
Table 9.4 Expectation propagation
Table 9.5 H.264/AVC quantization parameter for different video sequences
Chapter 12: Secure Biometric Authentication Using DSC
Table 12.1 List of symbols
Shuang Wang, Yong Fang, and Samuel Cheng
This edition first published 2017
© 2017 John Wiley & Sons Ltd
Registered office John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of Shuang Wang, Yong Fang, and Samuel Cheng to be identified as the authors of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication data applied for
ISBN: 9780470688991
A catalogue record for this book is available from the British Library.
Cover design by Wiley
Cover image: Top numbers image: a_Taiga/Gettyimages
Bottom molecular image: KTSDESIGN/SCIENCE PHOTO LIBRARY/Gettyimages
This book aims to provide fundamental knowledge for engineers and computer scientists to get into the topic of distributed source coding, and is targeted at senior undergraduate and first-year graduate students. It should also be accessible to engineers with a basic knowledge of calculus and applied probability. We have included short chapters on information theory, channel coding, and approximate inference in the appendix. The goal is to make the content as self-contained as possible.
Excluding the appendices, the book is divided into three parts. First-time readers who would like to get to the applications quickly should work through Chapters 1, 2, 3, 5, and 7, and then go straight to Part III, the application chapters. Chapters 4 and 6 are more advanced and may require more mathematical maturity from readers.
Distributed source coding is an active research area and the authors have not been able to cover every aspect of it. We have included materials that are geared more toward our own research interests. However, we have striven to include the most interesting applications in this book.
Book writing is a long endeavor. It is especially so in this case. This book project grew out of several distributed source coding tutorials presented in the US, Taiwan, and Switzerland around 2009 by Dr. Vladimir Stankovic, Dr. Lina Stankovic, and the third author. Much of the material in this book reflects the immense influence of these tutorials. Drs. Stankovic have been very kind in offering extensive, unconditional help and in allowing us to include many of their ideas and even their own words in this book. For this we wish to express our most sincere and deepest gratitude.
The third author would also like to express special thanks to his Master's and PhD advisor, Dr. Zixiang Xiong, who taught him not just distributed source coding but also everything relating to research and academia.
Over the years, the authors have received help from many colleagues, friends, and students, including Angelos Liveris, Costas Georghiades, Yang Yang, Zixin Liu, Zongmin Liu, Qian Xu, Shengjie Zhao, Nikolaos Deligiannis, Xiaoqian Jiang, Adrian Munteanu, Peter Schelkens, Andrei Sechelea, Feng Chen, and Lijuan Cui. In particular, we would like to offer special thanks to Liyan Chen and Xiaomin Wang for generating some of the figures. We could not possibly include everyone in the list above, and we ask for forgiveness for any omission. Finally, the authors would like to acknowledge support from the National Science Foundation, the National Institutes of Health (NHGRI K99HG008175), and the National Natural Science Foundation of China.
Don't forget to visit the companion website for this book:
www.wiley.com/go/cheng/dsctheoryandpractice
There you will find valuable material designed to enhance your learning, including:
Example codes
Scan this QR code to visit the companion website
Imagine a dense sensor network consisting of many tiny sensors deployed for information gathering. Readings from neighboring sensors will often be highly correlated. This correlation can be exploited to significantly reduce the amount of information that each sensor needs to send to the base station, thus reducing power consumption and prolonging the life of the nodes and the network. The obvious way of exploiting correlation is to enable neighboring sensors to exchange data with one another. However, communication among sensors is usually undesirable as it increases the complexity of the sensors, which in turn leads to additional cost and power consumption. How then is it possible to avoid information exchange among sensors but still be able to exploit the statistical dependency of the readings in different sensor nodes? The solution lies in distributed source coding.
Distributed source coding (DSC) enables correlation to be exploited efficiently without the need for communications among sensors. Moreover, in some specific cases it does not incur any loss compared to the case when the sensors communicate.
DSC has been receiving significant attention since the beginning of the 21st century from academic and industrial researchers in different fields of electrical and computer engineering, mathematics, and computer science. Indeed, special sessions and tutorials dedicated to DSC are given at major communications, signal processing, multimedia, and computer engineering conferences. This is no wonder, as DSC has potential applications ranging from wireless sensor networks, ad-hoc networks, and surveillance networks to robust low-complexity video coding, stereo/multiview video coding, high-definition television, and hyperspectral and multispectral imaging. This book is intended to act as a guidebook for engineers and researchers to grasp the basic concepts quickly in order to understand, contribute to, and apply this exciting field and its emerging applications.
DSC deals with the source coding, or compression, of correlated sources. The adjective distributed stresses that the compression occurs in a distributed or noncentralized fashion. We can, for example, assume that the sources to be compressed are distributed across different nodes in a network. The task is to compress these sources and communicate the compressed streams to a decoder for joint decompression. The basis of DSC is that the compressions take place separately, that is, the nodes do not exchange their information, whereas decompression is joint.
This is illustrated in Figure 1.1 for the simple case of two nodes. Each node has access only to its own source, and does not have information about the sources present at other nodes. Therefore, “distributed” in this context refers to separate compression at each node. Note that if decompression were also separate for each of the sources, then the problem would boil down to multiple conventional compressions. Throughout the book, distributed compression will always refer to separate encoding and joint decoding, whereas joint compression will refer to joint encoding and joint decoding, if not stated otherwise.
Figure 1.1 DSC concept with two separate encoders that do not talk to each other and one joint decoder. X and Y are discrete, correlated sources; R_X and R_Y are compression rates.
The search for the achievable rates of DSC is a major information-theoretic problem and lies within the framework of network information theory, a branch of information theory that tries to find the compression and communication limits of a network of nodes. In the case of discrete sources and perfect reconstruction at the decoder, DSC extends Shannon's Source Coding Theorem from the point-to-point to the multipoint scenario. This is referred to as lossless DSC. When we allow for some distortion in the reconstruction, the DSC problem becomes a rate-distortion problem and is referred to as lossy DSC.
DSC started as an information-theoretic problem in the seminal 1973 paper of Slepian and Wolf [1]. Slepian and Wolf considered lossless separate compression of two discrete sources, and showed that, roughly speaking, there is no performance loss compared to joint compression as long as joint decompression is performed.
This remarkable result triggered significant information-theoretical research resulting in solutions – in the form of achievable rate regions – for more complicated lossless setups. In 1976, Wyner and Ziv [2] considered a lossy version, with a distortion constraint, of a special case of the Slepian–Wolf (SW) problem, where one source is available at the decoder as side information. Wyner and Ziv showed that for a particular correlation where source and side information are jointly Gaussian, there is no performance loss due to the absence of side information at the encoder. The lossy case of the generalized SW setup, known as multiterminal (MT) source coding, was introduced by Berger and Tung in 1977 [3, 4].
A possible realization of DSC via the use of conventional linear channel codes to approach the SW bound was known as early as 1973, but due to the lack of any potential application of DSC, work on code designs, that is, how to code the sources to approach given bounds, started only at the end of the last century. The first practical design was reported in 1999 [5], followed by many improved solutions. One key insight of these designs is that conventional channel coding can be used for compression. Indeed, correlation between the sources is seen as a virtual communication channel, and as long as this virtual channel can be modeled by some standard communication channel, for example Gaussian, channel codes can be effectively employed. Capacity-approaching designs [6] based on quantization followed by advanced channel coding, for example with turbo codes [7] and low-density parity-check (LDPC) codes [8], come very close to the bounds for two jointly Gaussian sources.
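To make the idea of compression via channel coding concrete, here is a minimal sketch of the syndrome-based binning trick, using the (7,4) Hamming code in the asymmetric setting: the encoder of X transmits only the 3-bit syndrome of each 7-bit block, and the decoder, which holds side information Y, recovers X exactly. The correlation model (Y differs from X in at most one bit per block), the matrix layout, and the function names are our illustrative assumptions, not a design from the works cited above.

#include <array>
#include <iostream>

// Parity-check matrix H of the (7,4) Hamming code, one row per parity check.
// Column j is the binary representation of j+1, so a syndrome mismatch
// directly identifies the position of a single flipped bit.
const std::array<std::array<int, 7>, 3> H = {{
    {1, 0, 1, 0, 1, 0, 1},
    {0, 1, 1, 0, 0, 1, 1},
    {0, 0, 0, 1, 1, 1, 1}
}};

// "Compress" a 7-bit source block x to its 3-bit syndrome s = H x (mod 2).
std::array<int, 3> syndrome(const std::array<int, 7>& x) {
    std::array<int, 3> s{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 7; ++j)
            s[i] ^= H[i][j] & x[j];
    return s;
}

// Decoder: side information y is assumed to differ from x in at most one bit
// (the "virtual channel"). Flip the single position that explains the mismatch
// between the received syndrome of x and the syndrome of y.
std::array<int, 7> decode(std::array<int, 7> y, const std::array<int, 3>& sx) {
    std::array<int, 3> sy = syndrome(y);
    int pattern = 0;
    for (int i = 0; i < 3; ++i)
        pattern |= (sx[i] ^ sy[i]) << i;
    if (pattern != 0)
        y[pattern - 1] ^= 1;   // pattern equals (error position + 1) for this H
    return y;
}

int main() {
    std::array<int, 7> x = {1, 0, 1, 1, 0, 0, 1};   // source block (7 bits)
    std::array<int, 7> y = x;  y[4] ^= 1;           // side information: one bit flipped
    std::array<int, 7> xhat = decode(y, syndrome(x));
    for (int b : xhat) std::cout << b;              // prints 1011001, i.e., x recovered
    std::cout << std::endl;
    return 0;
}

Compression here is from 7 bits to 3 bits per block, and it is lossless only because the assumed virtual channel (at most one bit flip per block) is exactly what the Hamming code can correct, which is the correlation-as-channel argument made in the paragraph above.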
The emergence of wireless sensor networks (WSNs) ignited practical DSC considerations in the early years of this century, since WSNs naturally call for distributed processing. Closely located sensors are expected to have correlated measurements; thus, in theory, the DSC setup fulfills the requirement of power-efficient compression for distributed sensor networks. However, many practical problems remain to be solved before DSC can be used in mainstream commercial networks. The challenges include the complex correlation structure of real signals, non-Gaussian sources, the need for long codeword lengths, and the complexity of current designs.
Though WSNs triggered renewed interest in DSC, another application has emerged: low-complexity video coding, where the DSC paradigm is used to avoid the computationally expensive temporal prediction loop in video encoding. Indeed, loosely speaking, a conventional video encoder needs to find the best match for the current block by examining all possible candidates, for example in a previous frame. Then, the difference between the two blocks is encoded and motion vectors are sent to the decoder. This architecture imposes heavy encoding and light decoding, exactly what is needed for television broadcasting! However, the fact that compression performance is not degraded by the lack of side information at the encoder [2] brings the remarkable possibility of avoiding the most resource-consuming motion search at the encoder and shifting it to the decoder side. This idea lay dormant for more than 20 years in a US patent [9]. It was only recently, with the development of practical DSC designs, that Wyner–Ziv (WZ) or distributed video coding was revisited.
Distributed video coding (DVC) was rediscovered in 2002 [10, 11, 23, 150, 151] and taken forward by many researchers. DVC was proposed as a solution for unconventional setups, such as video surveillance with tiny cameras and cell-to-cell communications, where low encoding complexity is a must. However, other benefits (e.g., the inbuilt robustness of DVC coders to error impairments, as there is no error accumulation due to reference-frame mismatch,1 and the unique opportunity of using a single code for both compression and error protection) have influenced the design philosophy of DVC for broader applications such as robust scalable video transmission over wireless networks and video streaming from multiple sites. Yet another application is multiview and stereo video, where DSC is used to exploit the correlation between different camera views when the encodings must be independent.
DVC has also been used to exploit the correlation between different bands in multispectral and hyperspectral image coding, then in biometrics, and for compression in microphone array networks. Other applications are emerging, such as spectrum sensing in cognitive radio, compress-and-forward in cooperative communications, and more. These will be discussed in more detail later in the book.
While some might prefer to think that DSC is driven by applications, others are fascinated by the information-theoretic challenges it poses. This book attempts to reconcile both schools of thought by discussing practical code designs and algorithms together with the theoretical framework for DSC, as well as the potential applications that various DSC setups have impacted and will impact. The authors attempt to strike a balance between mathematical rigor and intuition. This book is intended as a guide for both senior/graduate students and engineers who are interested in the field. We have tried our best to ensure the manuscript is self-contained. While we assume no background knowledge in information and coding theory, we expect the reader to be familiar with calculus, probability theory, and linear algebra as taught in a first- or second-year undergraduate course. The book includes worked examples where possible and provides references to further reading material for more advanced or specialized topics where appropriate.
The book is organized into three parts: Theory, Algorithms, and Applications of DSC.
Part I, Theory, starts with a brief and essential background of information theory and data compression. The background is followed by a comprehensive review of the roots of DSC, namely Slepian and Wolf's theorem and its extension to multiple sources and lossless MT networks. Next, rate-distortion problems will be visited, that is, WZ and lossy MT source coding problems.
Part II, Algorithms, introduces and discusses designs of algorithmic solutions for DSC problems. Most of this part is dedicated to the evolution of code designs for the three most important DSC problems introduced in Part I, namely SW, WZ, and MT source coding. We start with a background chapter on coding theory, followed by designs for the SW coding problem, elaborating on developments from the early work of Wyner in 1974 to the most recent achievements with advanced channel codes, belief propagation, and arithmetic codes. Similarly, for the WZ problem we start with the 1999 work of Pradhan and Ramchandran [5] and describe subsequent developments.
Part III is dedicated to potential DSC applications, grouped into multimedia applications and wireless communications applications. We start by providing the necessary background on image/video processing. DVC is discussed first, followed by biometric applications and multispectral/hyperspectral image compression. The two final chapters are dedicated to wireless communications: data gathering in wireless sensor networks, spectrum sensing in cognitive radio, compress-and-forward in the cooperative communications scenario, and DSC's relationship to compressive sampling.
Part III will be of great interest to practitioners. Each topic is given an educational description, and the latest developments are discussed critically, noting their strengths and shortcomings and suggesting possible extensions and future work.
1. Because the encoder does not use a reference frame.
In this chapter we consider the setup in which the reconstructed signal after lossless compression is identical to the original input. Because the limit of compressing a source is governed by its entropy, lossless compression is often called entropy coding. One of the most popular entropy coding algorithms is Huffman coding, which can operate very close to the theoretical limit provided that we allow the codeword length to be arbitrarily long. Of course, one may apply entropy coding to each source independently, but any dependency or correlation among the sources is not captured in this case. In contrast, lossless DSC can offer much more efficient compression. We will develop the idea of lossless DSC by means of the following application example, which we will build upon throughout the chapter.
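For a discrete source X with probability mass function p(x), the compression limit referred to here is the Shannon entropy (see Appendix A):

\[ H(X) = -\sum_{x} p(x) \log_2 p(x) \quad \text{bits per symbol}. \]

No lossless code can compress X below H(X) bits per symbol on average, and this is the limit that the entropy coders mentioned above approach.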
Temperature sensors scattered over a geographically small area need to report their readings regularly. All readings must be compressed and sent to a central unit for processing. Since closely located sensors will have similar readings, the problem boils down to compression of correlated sources. Each sensor can, of course, compress its measurements using any entropy coding algorithm up to the source entropy limit, but this way it will not exploit redundancy across the neighbouring sensors, and thus it cannot achieve the highest possible compression.
Lossless DSC refers to any method that independently and losslessly compresses the sources and jointly decompresses them by exploiting the mutual correlation among them. This chapter will explain how this is done and outline the main information-theoretic results, including key proofs, following the notation and concepts introduced in the previous chapter. The chapter answers the question of how much we can independently compress correlated sources so that a joint decoder can still recover them with a very low probability of error. For lossless compression we assume that all sources are discrete-time random variables taking values from finite alphabets. The simplest lossless DSC case is SW coding [1], where only two sources are to be compressed separately and decompressed jointly. We start with the SW coding problem, which is not only the simplest but was also the first distributed compression problem to be considered. Towards the end of the chapter, an extension to multiple sources is given.
In 1973, Slepian and Wolf considered the separate compression and joint lossless decompression of two discrete sources [1]. This is the simplest DSC problem, with only two encoders and one decoder, as shown in Figure 2.1. As this chapter focuses on lossless compression, the sources need to be reconstructed at the decoder exactly, without any distortion, with probability approaching one.
Figure 2.1 Classical SW setup with two separately encoded sources and a joint decoder.
Let X and Y be two random variables denoting two correlated discrete-time memoryless sources drawn i.i.d. from the joint probability distribution p(x, y). Let p(x) and p(y) be the marginal probability distributions of X and Y, respectively. The encoders of X and Y independently convert the length-n sequences X^n and Y^n into the messages m_X and m_Y, respectively, whereas a joint decoder receives both m_X and m_Y and converts them back into estimates of the sequences X^n and Y^n. Denote M_X and M_Y as the alphabets of m_X and m_Y, respectively. Then the code rates for the sources X and Y, R_X and R_Y, are given by R_X = (1/n) log2 |M_X| and R_Y = (1/n) log2 |M_Y|, respectively. For example, if we compress n samples at code rate R_X bits per sample, then we would have nR_X bits to send.
From Shannon's Source Coding Theorem (see Theorem A.2 in Appendix A), it follows that X and Y can each be compressed at any rate higher than its entropy, H(X) and H(Y), respectively. Thus, if we have separate decoding, the total compression rate will be H(X) + H(Y), regardless of whether joint or separate encoding takes place. If the two sources are compressed and decompressed together, the total rate will be H(X, Y), which also directly follows from Shannon's Source Coding Theorem by taking (X, Y) as a single source to be compressed.
The question Slepian and Wolf wanted to answer was: what is the minimum achievable total rate if X and Y are to be compressed separately and reconstructed jointly? Obviously, due to the potential gain of joint decoding, the minimum needed rate has to be less than or equal to H(X) + H(Y). On the other hand, having both joint encoding and joint decoding will certainly be at least as good as separate encoding with joint decoding, meaning that the required rate has to be at least H(X, Y). If the sources are uncorrelated then, from Appendix A, H(X, Y) = H(X) + H(Y), the two bounds coincide, and the minimum rate is simply H(X) + H(Y), which is expected since joint coding cannot bring any gain in this case. However, in the case of correlated sources we have H(X, Y) < H(X) + H(Y), and thus it is not clear what the minimum needed rate will be.
Slepian and Wolf showed the surprising result that, under mild conditions, the minimum required total rate is still H(X, Y); that is, separate encoding is as efficient as joint encoding provided that joint decoding takes place. This suggests that joint encoding brings no benefit over separate encoding, and thus opens up the possibility of distributed processing without any loss.
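For reference, the formal statement proved in Section 2.1 says that lossless recovery is possible exactly when the individual rates and the sum rate satisfy the following inequalities, which define the SW rate region sketched in Figure 2.3:

\[ R_X \geq H(X \mid Y), \qquad R_Y \geq H(Y \mid X), \qquad R_X + R_Y \geq H(X, Y). \]

In particular, the corner point (R_X, R_Y) = (H(X), H(Y | X)) corresponds to the asymmetric setup used in the sensor example below, where X is sent at its full entropy and Y is compressed given that the decoder will know X.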
Before formally stating the SW Theorem and its proof, we go back to our temperature sensor example to illustrate the idea of Slepian and Wolf.
Let us consider, for simplicity, only two sensors with readings X and Y, and suppose that each reading takes integer values uniformly between 0 and 31. Hence, the alphabets of X and Y are both {0, 1, ..., 31}, and each reading requires at least 5 bits to represent. Moreover, let X and Y be correlated in such a way that, given X, Y can only take values uniformly from {X − 1, X, X + 1, X + 2}. In other words, Y cannot be lower than 1 degree below X and cannot be higher than 2 degrees above X.
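Written out as a probability model (treating all reading pairs as equally likely, as the counting argument below does), this correlation is:

\[ p(x) = \frac{1}{32}, \qquad p(y \mid x) = \begin{cases} \frac{1}{4}, & y \in \{x-1,\, x,\, x+1,\, x+2\}, \\ 0, & \text{otherwise}, \end{cases} \]

so that H(X) = 5 bits and H(Y | X) = 2 bits.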
Let one realization of X and Y be X = 11 (01011 in binary) and Y = 12 (01100 in binary). Suppose that all five bits of X (01011) are sent to the decoder. Thus, the decoder can recover X perfectly as 11.
Scenario 1: Suppose that the transmitters of X and Y can communicate, that is, the encoder of Y knows the value of X. What is the minimum number of bits it needs to use so that the decoder can recover Y without errors? The answer to this question is simply 2, since the encoder can compute the difference between Y and X, which can only take one of four possible values, −1, 0, 1, or 2, and hence it can be encoded with 2 bits. More specifically, suppose we use a code that assigns a distinct 2-bit codeword to each of these four differences, with the difference 1 mapped to 01; then the sensor Y will send 01. The decoder knows that 01 corresponds to a difference of 1, and recovers Y correctly as 11 + 1 = 12. The total number of bits sent by X and Y in this case is 2 + 5 = 7 bits, which is H(X, Y). Indeed, there are a total of 32 × 4 = 128 possible (X, Y) pairs, which are all equally likely. Note that in this scenario the encoder exploited the knowledge of X!
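The same count, written as entropies:

\[ H(X, Y) = \log_2(32 \times 4) = \log_2 128 = 7 \text{ bits} = \underbrace{5 \text{ bits}}_{H(X)} + \underbrace{2 \text{ bits}}_{H(Y \mid X)}, \]

which is just the chain rule H(X, Y) = H(X) + H(Y | X) evaluated for this example.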
Scenario 2: Suppose now that the two encoders cannot communicate. We pose the same question again: what is the minimum number of bits the encoder of Y needs to use so that the decoder can recover Y without errors?
The SW Theorem tells us that still only 2 bits are needed. A trick to achieve this is to partition all the possible outcomes of Y into subsets or bins. Instead of passing the value of Y directly, only the index of the bin that Y lies in is passed to the decoder. If, for every value of X, each bin contains only one Y satisfying the correlation constraint, then this unique Y can be identified at the decoder given X and the bin index. This means that perfect decoding is possible. In our example, if X = 7, Y can only be 6, 7, 8, or 9. Therefore, by the above argument, these four consecutive values cannot fall into the same bin, to avoid an ambiguity at the decoder. Extending this argument to other values of X and Y forces the difference between any two values of Y inside the same bin to be no smaller than 4. This gives us a hint that the elements inside a bin should be as far apart as possible. Moreover, if SW coding really incurs no loss, we expect that only 2 bits are needed to compress Y, and thus four bins should be sufficient. Therefore, a good guess for the four bins is {0, 4, 8, ..., 28}, {1, 5, 9, ..., 29}, {2, 6, 10, ..., 30}, and {3, 7, 11, ..., 31}. This is basically the optimum way of constructing four bins with the elements inside each bin as far apart as possible. The four bins are shown in Figure 2.2.
Figure 2.2 Four bins of a network of temperature sensors.
Following from our previous example, since Y = 12, bin index 00 will be transmitted by the encoder of Y. Knowing the bin index 00, the decoder knows that Y is divisible by 4. Since X = 11 (and thus the potential values of Y are 10, 11, 12, and 13), the decoder can conclude that Y = 12.
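The following is a minimal sketch of this binning encoder and decoder, specialized to the example just described (readings 0 to 31, Y restricted to {X − 1, ..., X + 2}, and the four modulo-4 bins above); the function names are ours, not part of the SW formalism.

#include <iostream>

// Encoder for Y: send only the 2-bit bin index (Y mod 4) instead of all 5 bits.
int encodeY(int y) { return y % 4; }

// Joint decoder: X is known exactly (all 5 bits received); recover Y from its bin
// index by searching the four values allowed by the correlation model.
int decodeY(int x, int binIndex) {
    for (int y = x - 1; y <= x + 2; ++y) {
        if (((y % 4) + 4) % 4 == binIndex)   // modulo kept non-negative for y = -1
            return y;
    }
    return -1;  // unreachable: exactly one of four consecutive values falls in each bin
}

int main() {
    int x = 11, y = 12;            // the realization used in the text (01011, 01100)
    int bin = encodeY(y);          // bin index 0, i.e., "00"
    std::cout << "bin index = " << bin
              << ", decoded Y = " << decodeY(x, bin) << std::endl;  // prints 0 and 12
    return 0;
}

Running it with the realization from the text (X = 11, Y = 12) sends bin index 0 (binary 00) and decodes Y = 12, matching the hand calculation above.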
The procedure of distributing all possible outcomes of Y into bins is called binning. Note that the total code rate is again 5 + 2 = 7 bits. Thus, Y was compressed at a rate of 2 bits, which is H(Y|X) (indeed, given X, there are in total four possible values that Y is equally likely to take). It is worth noting that each bin contains eight values, which is in fact 2^{I(X;Y)}, where I(X;Y) is the information that X gives us about Y. That is, knowing