This book presents the tools and algorithms required to compress and decompress signals such as speech and music. These algorithms are widely used in mobile phones, DVD players, HDTV sets, and so on. A first, rather theoretical, part presents the standard tools used in compression systems: scalar and vector quantization, predictive quantization, transform quantization, and entropy coding. In particular, we show the consistency between these different tools. The second part explains how these tools are used in the latest speech and audio coders. The third part gives Matlab programs simulating these coders.
Page count: 217
Year of publication: 2013
Introduction
PART 1. TOOLS FOR SIGNAL COMPRESSION
Chapter 1. Scalar Quantization
1.1. Introduction
1.2. Optimum scalar quantization
1.3. Predictive scalar quantization
Chapter 2. Vector Quantization
2.1. Introduction
2.2. Rationale
2.3. Optimum codebook generation
2.4. Optimum quantizer performance
2.5. Using the quantizer
2.6. Gain-shape vector quantization
Chapter 3. Sub-band Transform Coding
3.1. Introduction
3.2. Equivalence of filter banks and transforms
3.3. Bit allocation
3.4. Optimum transform
3.5. Performance
Chapter 4. Entropy Coding
4.1. Introduction
4.2. Noiseless coding of discrete, memoryless sources
4.3. Noiseless coding of a discrete source with memory
4.4. Scalar quantizer with entropy constraint
4.5. Capacity of a discrete memoryless channel
4.6. Coding a discrete source with a fidelity criterion
PART 2. AUDIO SIGNAL APPLICATIONS
Chapter 5. Introduction to Audio Signals
5.1. Speech signal characteristics
5.2. Characteristics of music signals
5.3. Standards and recommendations
Chapter 6. Speech Coding
6.1. PCM and ADPCM coders
6.2. The 2.4 bit/s LPC-10 coder
6.3. The CELP coder
Chapter 7. Audio Coding
7.1. Principles of “perceptual coders”
7.2. MPEG-1 layer 1 coder
7.3. MPEG-2 AAC coder
7.4. Dolby AC-3 coder
7.5. Psychoacoustic model: calculating a masking threshold
Chapter 8. Audio Coding: Additional Information
8.1. Low bit rate/acceptable quality coders
8.2. High bit rate lossless or almost lossless coders
Chapter 9. Stereo Coding: A Synthetic Presentation
9.1. Basic hypothesis and notation
9.2. Determining the inter-channel indices
9.3. Downmixing procedure
9.4. At the receiver
9.5. Draft International Standard
PART 3. MATLAB® PROGRAMS
Chapter 10. A Speech Coder
10.1. Introduction
10.2. Script for the calling function
10.3. Script for called functions
Chapter 11. A Music Coder
11.1. Introduction
11.2. Script for the calling function
11.3. Script for called functions
Bibliography
Index
First published 2011 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Adapted and updated from Outils pour la compression des signaux : applications aux signaux audio published 2009 in France by Hermes Science/Lavoisier © Institut Télécom et LAVOISIER 2009
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
www.iste.co.uk
www.wiley.com
© ISTE Ltd 2011
The rights of Nicolas Moreau to be identified as the author of this work have been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.
Moreau, Nicolas, 1945-
[Outils pour la compression des signaux. English]
Tools for signal compression / Nicolas Moreau.
p. cm.
“Adapted and updated from Outils pour la compression des signaux : applications aux signaux audio.”
Includes bibliographical references and index.
ISBN 978-1-84821-255-8
1. Sound--Recording and reproducing--Digital techniques. 2. Data compression (Telecommunication) 3. Speech processing systems. I. Title.
TK7881.4.M6413 2011
621.389’3--dc22
2011003206
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-255-8
In everyday life, we often come into contact with compressed signals: when using mobile telephones, MP3 players, digital cameras, or DVD players. The signals in each of these applications (telephone-band speech, high fidelity audio, still or video images) are not only sampled and quantized, to put them into a form suitable for storage on mass media or transmission across networks, but also compressed. The first operation is very basic and is presented in all courses and introductory books on signal processing. The second is more specialized and is the subject of this book: the standard tools for signal compression are presented, followed by examples of how these tools are applied in compressing speech and musical audio signals. The first part of this book addresses a problem that is theoretical in nature: minimizing the mean squared error. The second part is more concrete: it refines the earlier approach, seeking to minimize the bit rate while respecting psychoacoustic constraints. We will see that signal compression consists of seeking to eliminate not only the redundant parts of the original signal but also the parts that are inaudible.
The compression techniques presented in this book are not new. They belong to a theoretical framework, information theory and source coding, which aims to formalize the first (and last) element of a digital communication chain: the coding of an analog signal (continuous in time and amplitude) into a digital signal (discrete in time and amplitude). These techniques stem from the work of C. Shannon, published at the beginning of the 1950s. However, apart from the development of speech coders in the 1970s to support an entirely digitally switched telephone network, they only came into widespread use toward the end of the 1980s under the influence of working groups such as the “Group Special Mobile (GSM)”, the “Joint Photographic Experts Group (JPEG)”, and the “Moving Picture Experts Group (MPEG)”.
The results of these techniques are quite impressive and have enabled the applications referred to above. Consider the example of a music signal. A music signal can be reconstructed with quasi-perfect (CD) quality if it is sampled at 44.1 kHz and quantized at a resolution of 16 bits. When transferred across a network, the required bit rate for a mono channel is therefore 705.6 kb/s. The most successful audio coder, MPEG-4 AAC, ensures “transparency” at a bit rate of the order of 64 kb/s, giving a compression ratio greater than 10, and the more recent MPEG-4 HE-AAC v2 coder, standardized in 2004, provides very acceptable quality (for video on mobile phones) at 24 kb/s for two stereo channels. The compression ratio is then better than 50!
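The bit rates and compression ratios quoted above follow from simple arithmetic; a quick sanity check (the figures come from this introduction, not from any standard document):

```python
# PCM bit rate for CD-quality audio: sample rate x resolution per channel.
sample_rate_hz = 44_100
bits_per_sample = 16

pcm_mono_kbps = sample_rate_hz * bits_per_sample / 1000  # 705.6 kb/s
pcm_stereo_kbps = 2 * pcm_mono_kbps                      # 1411.2 kb/s

# Compression ratios for the two coded bit rates cited in the text.
ratio_aac = pcm_mono_kbps / 64        # MPEG-4 AAC, ~64 kb/s mono
ratio_he_aac = pcm_stereo_kbps / 24   # HE-AAC v2, 24 kb/s for stereo

print(round(pcm_mono_kbps, 1))   # 705.6
print(round(ratio_aac, 2))       # 11.02 -> "greater than 10"
print(round(ratio_he_aac, 1))    # 58.8  -> "better than 50"
```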
In Part 1 of this book, the standard tools (scalar quantization, predictive quantization, vector quantization, transform and sub-band coding, and entropy coding) are presented. To compare the performance of these tools, we use an academic example: the quantization of a realization x(n) of a one-dimensional random process X(n). Although this is a theoretical approach, it not only allows an objective assessment of performance but also shows the coherence between all the available tools. In Part 2, we concentrate on the compression of audio signals (telephone-band speech, wideband speech, and high fidelity audio signals).
Throughout this book, we discuss the basic ideas of signal processing using the following language and notation. We consider a one-dimensional, stationary, zero-mean random process X(n), with power \(\sigma_X^2\) and power spectral density \(S_X(f)\). We also assume that it is Gaussian, primarily because the Gaussian distribution is preserved under any linear transformation, in particular filtering, which greatly simplifies the notation, and also because a Gaussian signal is the most difficult signal to encode: it yields the greatest quantization error at any given bit rate. A column vector of dimension N is denoted \(X(m)\) and constructed from \(X(mN) \cdots X(mN+N-1)\). These N random variables are completely defined statistically by their probability density function:
\[
p_X(x) = \frac{1}{(2\pi)^{N/2}\,[\det R_X]^{1/2}} \exp\!\left(-\frac{1}{2}\, x^t R_X^{-1} x\right)
\]
where \(R_X = E[X(m)\,X^t(m)]\) is the autocovariance matrix, a Toeplitz matrix of dimensions N × N. Moreover, we assume that X(n) is an auto-regressive process of order P, obtained by filtering white noise W(n) of variance \(\sigma_W^2\) through a filter of order P with transfer function 1/A(z), where A(z) has the form:
\[
A(z) = 1 + a_1 z^{-1} + \cdots + a_P z^{-P}.
\]
The purpose of considering the quantization of an auto-regressive waveform as our example is that it allows all the statistical characteristics of the source waveform to be expressed simply as functions of the parameters of the filter, for example, the power spectral density:
\[
S_X(f) = \frac{\sigma_W^2}{|A(f)|^2}
\]
where the notation A(f) is a slight abuse and should more properly be written A(exp(j2πf)). It also allows us to give analytical expressions for the quantization error power of the different quantization methods when the quadratic error is chosen as the measure of distortion, making a comparison of their performance possible. From a practical point of view, this example is not artificial: it is a reasonable model for a number of signals, in particular for speech signals (which are only locally stationary) provided that the order P is high enough (e.g. 8 or 10).
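As a small sketch of this model, the following evaluates the power spectral density \(S_X(f) = \sigma_W^2 / |A(\exp(j2\pi f))|^2\) of an AR process directly from the filter coefficients; the AR(1) coefficient −0.9 is illustrative only, not taken from the book:

```python
import cmath

def ar_psd(a, var_w, f):
    """Power spectral density var_w / |A(exp(j*2*pi*f))|^2 of an AR
    process, where a = [a_1, ..., a_P] gives
    A(z) = 1 + a_1 z^-1 + ... + a_P z^-P
    and f is a normalized frequency in [0, 0.5]."""
    z_inv = cmath.exp(-2j * cmath.pi * f)
    A = 1.0 + sum(ak * z_inv ** (k + 1) for k, ak in enumerate(a))
    return var_w / abs(A) ** 2

# P = 0 (no filtering): X(n) is white noise, so the spectrum is flat.
print(ar_psd([], 1.0, 0.25))   # 1.0

# Illustrative AR(1) low-pass case: power concentrates at low frequencies.
print(ar_psd([-0.9], 1.0, 0.0) > ar_psd([-0.9], 1.0, 0.5))   # True
```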
Let us consider a discrete-time signal x(n) taking values in the range [−A, +A]. Defining a scalar quantization with a resolution of b bits per sample requires three operations:
– partitioning the range [−A, +A] into L = 2^b intervals;
– numbering the partitioned intervals {i_1, …, i_L};
– selecting the reproduction value for each interval; the set of these reproduction values forms a dictionary (codebook).
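A minimal sketch of these three operations for a uniform quantizer (uniform partition and midpoint reproduction values are the simplest choice, not the optimum quantizer studied later in the chapter):

```python
def uniform_quantizer(A, b):
    """Build a b-bit uniform scalar quantizer on [-A, +A]:
    L = 2**b intervals, indexed 0..L-1, with a midpoint codebook."""
    L = 2 ** b
    step = 2 * A / L
    # Reproduction values (the codebook): one midpoint per interval.
    codebook = [-A + (i + 0.5) * step for i in range(L)]

    def encode(x):
        # Map x to the index of its interval, clamped to the range.
        i = int((x + A) / step)
        return min(max(i, 0), L - 1)

    def decode(i):
        return codebook[i]

    return encode, decode, codebook

enc, dec, cb = uniform_quantizer(A=1.0, b=3)   # 8 intervals, step 0.25
print(dec(enc(0.6)))   # 0.625, the midpoint of [0.5, 0.75)
```

With b = 3 bits the quantization error is bounded by half the step size, here 0.125, for any input inside [−A, +A].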
Read on in the full edition!
