Page count: 647
Year of publication: 2012
Table of Contents
Cover
IEEE Press
Title page
Copyright page
Dedication
Preface
Contributors
Part One: Efficient Digital Filters
Chapter 1 Lost Knowledge Refound: Sharpened FIR Filters
1.1 IMPROVING A DIGITAL FILTER
1.2 FIR FILTER SHARPENING
1.3 IMPLEMENTATION ISSUES
1.4 CONCLUSIONS
EDITOR COMMENTS
Chapter 2 Quantized FIR Filter Design Using Compensating Zeros
2.1 QUANTIZED FILTER DESIGN FIGURES OF MERIT
2.2 FILTER STRUCTURES
2.3 EXAMPLE 1: A WINDOWED FIR FILTER
2.4 SIMPLE QUANTIZATION
2.5 COMPENSATING ZEROS QUANTIZATION
2.6 EXAMPLE 2: A BIORTHOGONAL FIR FILTER
2.7 DESIGN RULES-OF-THUMB
2.8 CONCLUSIONS
Chapter 3 Designing Nonstandard Filters with Differential Evolution
3.1 RECASTING FILTER DESIGN AS A MINIMIZATION PROBLEM
3.2 MINIMIZATION WITH DIFFERENTIAL EVOLUTION
3.3 A DIFFERENTIAL EVOLUTION DESIGN EXAMPLE
3.4 CONCLUSIONS
Chapter 4 Designing IIR Filters with a Given 3 dB Point
4.1 LOWPASS CHEBYSHEV FILTERS
4.2 DETERMINING THE 3 DB POINT OF AN ANALOG CHEBYSHEV TYPE I FILTER
4.3 DESIGNING THE LOWPASS FILTER
4.4 CHEBYSHEV TYPE II FILTERS
4.5 HIGHPASS, BANDPASS, AND BANDSTOP CHEBYSHEV FILTERS
4.6 ELLIPTIC FILTERS
4.7 CONCLUSIONS
Chapter 5 Filtering Tricks for FSK Demodulation
5.1 A SIMPLE WIRELESS RECEIVER
5.2 USING A COMB FILTER
5.3 BENEFICIAL COMB FILTER BEHAVIOR
5.4 CONCLUSIONS
EDITOR COMMENTS
Chapter 6 Reducing CIC Filter Complexity
6.1 REDUCING INTERPOLATION FILTER COMPLEXITY
6.2 AN EFFICIENT LINEAR INTERPOLATOR
6.3 NONRECURSIVE CIC DECIMATION FILTERS
6.4 CONCLUSIONS
Chapter 7 Precise Filter Design
7.1 PROBLEM BACKGROUND
7.2 TEXTBOOK APPROXIMATIONS
7.3 AN APPROXIMATION THAT YOU WON’T FIND IN ANY TEXTBOOK
7.4 LEAST SQUARES SOLUTION
7.5 IMPLEMENTATION NOTES
7.6 CONCLUSIONS
EDITOR COMMENTS
Chapter 8 Turbocharging Interpolated FIR Filters
8.1 TRADITIONAL IFIR FILTERS
8.2 TWO-STAGE MASKING IFIR FILTERS
8.3 RECURSIVE IFIR FILTERS
8.4 MASKING ATTENUATION IMPROVEMENT
8.5 RECURSIVE IFIR FILTER DESIGN EXAMPLE
8.6 RECURSIVE IFIR FILTER IMPLEMENTATION ISSUES
8.7 RECURSIVE IFIR FILTER DESIGN
8.8 RECURSIVE IFIR FILTERS FOR DECIMATION
8.9 AN ALTERNATE MASKING SUBFILTER
Chapter 9 A Most Efficient Digital Filter: The Two-Path Recursive All-Pass Filter
9.1 ALL-PASS NETWORK BACKGROUND
9.2 IMPLEMENTING AND COMBINING ALL-PASS NETWORKS
9.3 TWO-PATH FILTERS
9.4 LINEAR PHASE TWO-PATH HALFBAND FILTERS
9.5 PASSBAND RESPONSE IN TWO-PATH HALFBAND FILTERS
9.6 TRANSFORMING THE PROTOTYPE HALFBAND TO ARBITRARY BANDWIDTH
9.7 LOWPASS TO LOWPASS TRANSFORMATION
9.8 LOWPASS-TO-BANDPASS TRANSFORMATION
9.9 CONCLUSIONS
EDITOR COMMENTS
Chapter 10 DC Blocker Algorithms
10.1 FIXED-POINT DC BLOCKER
10.2 FIXED-POINT IMPLEMENTATION
10.3 LINEAR-PHASE DC BLOCKER
10.4 CONCLUSIONS
Chapter 11 Precise Variable-Q Filter Design
11.1 A SIMPLE VARIABLE-Q FILTER
11.2 THE PROBLEM
11.3 THE SOLUTION
11.4 CONCLUSIONS
EDITOR COMMENTS
Chapter 12 Improved Narrowband Lowpass IIR Filters in Fixed-Point Systems
12.1 MOTIVATION
12.2 THE PROBLEM
12.3 AN IMPROVED NARROWBAND LOWPASS IIR FILTER
12.4 INTERPOLATED-IIR FILTER EXAMPLE
12.5 CONCLUSION
Chapter 13 Improving FIR Filter Coefficient Precision
13.1 TRADITIONAL FIR FILTERING
13.2 SERIAL IMPLEMENTATION
13.3 SERIAL METHOD EXAMPLE
13.4 PARALLEL IMPLEMENTATION
13.5 PARALLEL METHOD EXAMPLE
13.6 COMPUTING COEFFICIENT BITWIDTHS
13.7 SERIAL METHOD COEFFICIENT QUANTIZATION
13.8 PARALLEL METHOD COEFFICIENT QUANTIZATION
13.9 CONCLUSIONS
Part Two: Signal and Spectrum Analysis Tricks
Chapter 14 Fast, Accurate Frequency Estimators
14.1 SPECTRAL PEAK LOCATION ALGORITHMS
14.2 ALGORITHM PERFORMANCE
14.3 CONCLUSIONS
EDITOR COMMENTS
Chapter 15 Fast Algorithms for Computing Similarity Measures in Signals
15.1 SIMILARITY MEASURE COMPONENTS
15.2 SIMILARITY MEASURES
15.3 FAST CALCULATION OF SIMILARITY MEASURE COMPONENTS
15.4 RESULTS
15.5 CONCLUSIONS
Chapter 16 Efficient Multi-tone Detection
16.1 MULTI-TONE DETECTION
16.2 COMPLEX DOWNCONVERSION
16.3 LOWPASS FILTERING
16.4 MAGNITUDE APPROXIMATION
16.5 VALIDITY CHECKING
16.6 IMPLEMENTATION ISSUES
16.7 OTHER CONSIDERATIONS
EDITOR COMMENTS
Chapter 17 Turning Overlap-Save into a Multiband, Mixing, Downsampling Filter Bank
17.1 SOMETHING OLD AND SOMETHING NEW
17.2 REVIEW OF FAST CONVOLUTION
17.3 CHOOSING FFT SIZE: COMPLEXITY IS RELATED TO FILTER LENGTH AND OVERLAP FACTOR
17.4 FILTERING MULTIPLE CHANNELS: REUSE THE FORWARD FFT
17.5 FREQUENCY DOMAIN DOWNSAMPLING: ALIASING IS YOUR FRIEND
17.6 MIXING AND OS FILTERING: ROTATE THE FREQUENCY DOMAIN FOR COARSE MIXING
17.7 PUTTING IT ALL TOGETHER
17.8 FOOD FOR THOUGHT
17.9 CONCLUSIONS
EDITOR COMMENTS
Chapter 18 Sliding Spectrum Analysis
18.1 GOERTZEL ALGORITHM
18.2 SLIDING DISCRETE FOURIER TRANSFORM (SDFT)
18.3 SDFT STABILITY
18.4 TIME-DOMAIN WINDOWING IN THE FREQUENCY DOMAIN
18.5 SLIDING GOERTZEL DFT
18.6 CONCLUSIONS
18.7 APPENDIX
EDITOR COMMENT
Chapter 19 Recovering Periodically Spaced Missing Samples
19.1 MISSING SAMPLES RECOVERY PROCESS
19.2 RECOVERY PROCESS
19.3 COMPUTING FILTER COEFFICIENTS
19.4 DISCUSSION
19.5 CONCLUSIONS
EDITOR COMMENTS
Chapter 20 Novel Adaptive IIR Filter for Frequency Estimation and Tracking
20.1 HARMONIC NOTCH FILTER STRUCTURE
20.2 ALGORITHM DEVELOPMENT
20.3 ALGORITHM STEPS
20.4 COMPUTER SIMULATIONS
20.5 TYPICAL ALGORITHM PERFORMANCE
20.6 CONCLUSIONS
EDITOR COMMENTS
Chapter 21 Accurate, Guaranteed-Stable, Sliding DFT
21.1 SLIDING DFT
21.2 MODULATED SDFT
21.3 MODULATING SEQUENCE
21.4 OTHER STABLE SDFT ALGORITHMS
21.5 NUMERICAL SIMULATIONS
21.6 COMPUTATIONAL WORKLOAD
21.7 CONCLUSIONS
Chapter 22 Reducing FFT Scalloping Loss Errors without Multiplication
22.1 FFT SCALLOPING LOSS REVISITED
22.2 FREQUENCY-DOMAIN CONVOLUTION
22.3 IMPROVED CONVOLUTION COEFFICIENTS
22.4 FURTHER COMPUTATIONAL IMPROVEMENTS
22.5 IMPLEMENTATION CONSIDERATIONS
22.6 CONCLUSION
EDITOR COMMENTS
Chapter 23 Slope Filtering: An FIR Approach to Linear Regression
23.1 MOTIVATION FOR SLOPE FILTERING
23.2 LINEAR REGRESSION
23.3 APPLICATION: RECEIVER CARRIER RECOVERY
23.4 APPLICATION: SIGNAL RATE OF CHANGE ESTIMATION
23.5 APPLICATION: SIGNAL TRANSITION DETECTION
23.6 APPLICATION: SIGNAL TRANSITION-POLARITY DETECTION
23.7 CONCLUSIONS
EDITOR COMMENTS
Part Three: Fast Function Approximation Algorithms
Chapter 24 Another Contender in the Arctangent Race
24.1 COMPUTING ARCTANGENTS
24.2 CONCLUSIONS
Chapter 25 High-Speed Square Root Algorithms
25.1 NEWTON-RAPHSON INVERSE (NRI) METHOD
25.2 NONLINEAR IIR FILTER (NIIRF) METHOD
25.3 BINARY-SHIFT MAGNITUDE ESTIMATION
25.4 EQUIRIPPLE-ERROR MAGNITUDE ESTIMATION
25.5 CONCLUSIONS
EDITOR COMMENTS
Chapter 26 Function Approximation Using Polynomials
26.1 USING LAGRANGE INTERPOLATION
26.2 FINDING THE OPTIMAL APPROXIMATION POLYNOMIAL
26.3 RANGE REDUCTION
26.4 SUBINTERVAL DIVISION
26.5 PRACTICAL CONSIDERATIONS
26.6 ERROR STUDIES
26.7 FUNCTION APPROXIMATION EXAMPLE
26.8 CONCLUSIONS
EDITOR COMMENTS
Chapter 27 Efficient Approximations for the Arctangent Function
27.1 ARCTANGENT APPROXIMATIONS
27.2 PROPOSED METHODOLOGY AND APPROXIMATIONS
27.3 DISCUSSION
27.4 FOUR-QUADRANT APPROXIMATIONS
27.5 CONCLUSIONS
EDITOR COMMENTS
Chapter 28 A Differentiator with a Difference
28.1 DISCRETE-TIME DIFFERENTIATION
28.2 AN IMPROVED DIFFERENTIATOR
Chapter 29 A Fast Binary Logarithm Algorithm
29.1 BACKGROUND
29.2 THE LOGARITHM ALGORITHM
Chapter 30 Multiplier-Free Divide, Square Root, and Log Algorithms
30.1 COMPUTING THE RATIO OF TWO REAL NUMBERS
30.2 SOLVING A LINEAR SYSTEM RX = B
30.3 COMPUTING THE RATIO OF TWO COMPLEX NUMBERS
30.4 COMPUTING SQUARE ROOTS
30.5 COMPUTING LOGARITHMS
30.6 THE LOG ALGORITHM
30.7 COMPUTING THE KI FACTORS
30.8 CONCLUSION
Chapter 31 A Simple Algorithm for Fitting a Gaussian Function
31.1 GAUSSIAN CURVE FITTING
31.2 CARUANA’S ALGORITHM
31.3 EFFECTS OF NOISE
31.4 WEIGHTED LEAST SQUARES ESTIMATION
31.5 ITERATIVE PROCEDURE
31.6 CONCLUSIONS
Chapter 32 Fixed-Point Square Roots Using L-Bit Truncation
32.1 COMPUTING SQUARE ROOTS
32.2 DIRECT NEWTON-RAPHSON (DNR) METHOD
32.3 NLIIRF METHOD
32.4 PROPOSED DNRT(N) METHOD
32.5 A DNRT(2) METHOD
32.6 ENHANCED PRECISION LUTS
32.7 PROCESSING RESULTS COMPARISON
32.8 CONCLUSION
APPENDIX
Part Four: Signal Generation Techniques
Chapter 33 Recursive Discrete-Time Sinusoidal Oscillators
33.1 MATRIX DERIVATION OF OSCILLATOR PROPERTIES
33.2 INTERPRETING THE ROTATION MATRIX
33.3 CATALOG OF OSCILLATORS
33.4 BIQUAD
33.5 DIGITAL WAVEGUIDE
33.6 EQUI-AMPLITUDE STAGGERED UPDATE
33.7 QUADRATURE STAGGERED UPDATE
33.8 COUPLED STANDARD QUADRATURE
33.9 DYNAMIC AMPLITUDE CONTROL
33.10 DYNAMIC FREQUENCY CONTROL
33.11 FSK MODULATOR DESIGN EXAMPLE
33.12 CONCLUSIONS
EDITOR COMMENTS
Chapter 34 Direct Digital Synthesis: A Tool for Periodic Wave Generation
34.1 DIRECT DIGITAL SYNTHESIS: AN OVERVIEW
34.2 THEORY OF OPERATION AND IMPLEMENTATION
34.3 QUANTIZATION EFFECTS
34.4 IMPROVING SFDR BY SINEWAVE COMPRESSION
34.5 IMPROVING SFDR THROUGH SPUR REDUCTION TECHNIQUES
34.6 CONCLUSIONS
Chapter 35 Implementing a ΣΔ DAC in Fixed-Point Arithmetic
35.1 THE ΣΔ DAC PROCESS
35.2 ΣΔ DAC PERFORMANCE
35.3 IMPLEMENTATION
35.4 CONCLUSIONS
Chapter 36 Efficient 8-PSK/16-PSK Generation Using Distributed Arithmetic
36.1 BACKGROUND
36.2 GENERATION OF 8-PSK AND 16-PSK MODULATIONS
36.3 OPTIMIZED GENERATION OF 8-PSK MODULATIONS
36.4 OPTIMIZED GENERATION OF 16-PSK MODULATIONS
36.5 PSEUDOSYMBOL GENERATION
36.6 COMPLEXITY EVALUATION
36.7 CONCLUSIONS
Chapter 37 Ultra-Low-Phase Noise DSP Oscillator
37.1 OSCILLATOR OUTPUT SEQUENCE GENERATION
37.2 CORDIC ALGORITHM-BASED PROCESSING
37.3 LOW-PHASE NOISE CORDIC OSCILLATOR DESIGN
37.4 OUTPUT STABILIZATION
37.5 EXAMPLE
37.6 IMPLEMENTATION
37.7 CONCLUSIONS
Chapter 38 An Efficient Analytic Signal Generator
38.1 ANALYTIC SIGNAL GENERATION APPROACH
38.2 DEFINING THE FREQUENCY RESPONSE
38.3 IMPULSE RESPONSE DERIVATIONS
38.4 ROTATING FILTER IMPULSE RESPONSES
38.5 COMPUTING FIR FILTER COEFFICIENTS
38.6 CONCLUSION
Part Five: Assorted High-Performance DSP Techniques
Chapter 39 Frequency Response Compensation with DSP
39.1 FILTER TABLE
39.2 RUN-TIME FILTER DESIGN
39.3 CALIBRATION TABLES
39.4 FIR VERSUS IIR FILTERS
39.5 LINEAR PHASE FIR FILTERS
39.6 SAMPLING AND SIGNAL FREQUENCY
39.7 FILTER DESIGN
39.8 MATLAB SIMULATION
39.9 IMPLEMENTATION IN C
39.10 EXTENSIONS
39.11 CALIBRATION TABLES
Chapter 40 Generating Rectangular Coordinates in Polar Coordinate Order
40.1 CENTERED GRID ALGORITHM
40.2 NONCENTERED GRID
40.3 TRIANGULAR GRID
40.4 CONCLUSIONS
Chapter 41 The Swiss Army Knife of Digital Networks
41.1 GENERAL FUNCTIONS
41.2 ANALYSIS AND SYNTHESIS FUNCTIONS
41.3 FILTER FUNCTIONS
41.4 ADDITIONAL FILTER FUNCTIONS
EDITOR COMMENTS
Chapter 42 JPEG2000–Choices and Trade-offs for Encoders
42.1 JPEG2000 STANDARD
42.2 TILE SIZE
42.3 COLOR SPACE
42.4 NUMBER OF WAVELET TRANSFORM LEVELS
42.5 CODE-BLOCK SIZE
42.6 COEFFICIENT QUANTIZATION STEP SIZE
42.7 REGION OF INTEREST CODING METHOD
Chapter 43 Using Shift Register Sequences
43.1 COMPLETE SET OF POSSIBLE STATES
43.2 USE FOR TESTING LOGIC CIRCUITS
43.3 CIRCULAR BUFFER
43.4 SPECTRAL PROPERTIES
43.5 RANDOM NUMBER GENERATORS
43.6 CONCLUSIONS
Chapter 44 Efficient Resampling Implementations
44.1 RESAMPLING USING POLYPHASE FILTERS
44.2 RESAMPLING BY AN INTEGER FACTOR
44.3 RESAMPLING BY AN ARBITRARY FACTOR
44.4 INTERPOLATION BY AN ARBITRARY FACTOR
44.5 COMPUTATION OF THE INTERPOLATION COEFFICIENTS
44.6 DECIMATION BY AN ARBITRARY FACTOR
44.7 COMPUTATION OF THE DECIMATION COEFFICIENTS
44.8 ALTERNATE METHOD FOR COMPUTING DECIMATION COEFFICIENTS
44.9 FPGA IMPLEMENTATION ISSUES
44.10 CONCLUSIONS
EDITOR COMMENT
Chapter 45 Sampling Rate Conversion in the Frequency Domain
45.1 SRC BASICS
45.2 REQUIREMENTS IN THE FREQUENCY DOMAIN
45.3 SAMPLE RATE DECREASE (D > I)
45.4 SAMPLE RATE INCREASE (D < I)
45.5 OVERLAP APPROACH FOR LONG SEQUENCES
45.6 MEAN SQUARED ERRORS
45.7 COMPUTATIONAL COMPLEXITY
45.8 CONCLUSIONS
Chapter 46 Enhanced-Convergence Normalized LMS Algorithm
46.1 BACKGROUND
46.2 EXAMPLES
Index
IEEE Press
445 Hoes Lane
Piscataway, NJ 08854
IEEE Press Editorial Board
Lajos Hanzo, Editor in Chief
R. Abhari
M. El-Hawary
O. P. Malik
J. Anderson
B-M. Haemmerli
S. Nahavandi
G. W. Arnold
M. Lanzerotti
T. Samad
F. Canavero
D. Jacobson
G. Zobrist
Kenneth Moore, Director of IEEE Book and Information Services (BIS)
Copyright © 2012 by the Institute of Electrical and Electronics Engineers.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. All rights reserved.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Streamlining digital signal processing : a tricks of the trade guidebook / edited by Richard G. Lyons. – 2nd ed.
p. cm.
ISBN 978-1-118-27838-3 (pbk.)
1. Signal processing–Digital techniques. I. Lyons, Richard G., 1948–
TK5102.9.S793 2012
621.382'2–dc23
2011047633
This book is dedicated to all the signal processing engineers who struggle to learn their craft and willingly share that knowledge with their engineering brethren—people of whom the English poet Chaucer would say, “Gladly would he learn and gladly teach.”
Preface
An updated and expanded version of its first edition, this book presents recent advances in digital signal processing (DSP) to simplify, or increase the computational speed of, common signal processing operations. The topics here describe clever DSP tricks of the trade not covered in conventional DSP textbooks. This material is practical, real-world DSP tips and tricks as opposed to the traditional highly specialized, math-intensive research subjects directed at industry researchers and university professors. Here we go beyond the standard DSP fundamentals textbook and present new, but tried-and-true, clever implementations of digital filter design, spectrum analysis, signal generation, high-speed function approximation, and various other DSP functions.
Our goal in this book is to create a resource that is relevant to the needs of the working DSP engineer by helping bridge the theory-to-practice gap between introductory DSP textbooks and the esoteric, difficult-to-understand academic journals. We hope the material in this book makes the practicing DSP engineer say, “Wow that’s pretty neat—I have to remember this, maybe I can use it sometime.” While this book will be useful to experienced DSP engineers, due to its gentle tutorial style it will also be of considerable value to the DSP beginner.
The mathematics used here is simple algebra and the arithmetic of complex numbers, making this material accessible to a wide engineering and scientific audience. In addition, each chapter contains a reference list for those readers wishing to learn more about a given DSP topic. The chapter topics in this book are written in a stand-alone manner, so the subject matter can be read in any desired order.
The contributors to this book make up a dream team of experienced DSP engineer-authors. They are not only knowledgeable in signal processing theory, they are “make it work” engineers who build working DSP systems. (They actually know which end of the soldering iron is hot.) Unlike many authors whose writing seems to say, “I understand this topic and I defy you to understand it,” our contributors go all-out to convey as much DSP understanding as possible. As such the chapters of this book are postcards from our skilled contributors on their endless quest for signal processing’s holy grail: accurate processing results at the price of a bare minimum of computations.
Software simulation code, in the C-language, MATLAB®, and MathCAD®, is available for some of the material in this book. The Internet website URL address for such software is http://booksupport.wiley.com.
We welcome you to this DSP tricks of the trade guidebook. I, the IEEE Press, and John Wiley & Sons hope you find it valuable.
RICHARD G. LYONS
E-mail: [email protected]
“If you really wish to learn then you must mount the machine and become acquainted with its tricks by actual trial.”
—Wilbur Wright, co-inventor of the first successful airplane, 1867–1912
Contributors
Mark Allie, University of Wisconsin–Madison, Madison, Wisconsin
François Auger, IREENA, University of Nantes, France
Andor Bariska, Institute of Data Analysis and Process Design, Zurich University of Applied Sciences, Winterthur, Switzerland
Douglas W. Barker, ITT Corporation, White Plains, New York
Amy Bell, Institute for Defense Analyses, Alexandria, Virginia
Greg Berchin, Consultant in Signal Processing, Colchester, Vermont
Guoan Bi, Nanyang Technological University, Singapore
Mark Borgerding, 3dB Labs, Inc., Cincinnati, Ohio
Joan Carletta, University of Akron, Akron, Ohio
Lionel Cordesses, Technocentre, Renault, Guyancourt, France
Matthew Donadio, Night Kitchen Interactive, Philadelphia, Pennsylvania
Krzysztof Duda, Department of Measurement and Instrumentation, AGH University of Science and Technology, Krakow, Poland
Shlomo Engelberg, Jerusalem College of Technology, Jerusalem, Israel
Huei-Wen Ferng, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.
Bruno Feuvrie, IREENA, University of Nantes, France
Woon-Seng Gan, Nanyang Technological University, Singapore
Maurice Givens, Gas Technology Institute, Des Plaines, Illinois
Hongwei Guo, Shanghai University, Shanghai, R.O.C.
Fred Harris, San Diego State University, San Diego, California
Laszlo Hars, Seagate Research, Pittsburgh, Pennsylvania
Robert Inkol, Defence Research and Development, Ottawa, Canada
Eric Jacobsen, Anchor Hill Communications, Scottsdale, Arizona
Jean Jiang, College of Engineering and Technology, Purdue University North Central, Westville, Indiana
Alain Joyal, Defence Research and Development, Ottawa, Canada
Peter Kootsookos, Emuse Technologies Ltd., Dublin, Ireland
Kishore Kotteri, Microsoft Corp., Redmond, Washington
Feng Li, IREENA, University of Nantes, France
Ricardo A. Losada, The MathWorks, Inc., Natick, Massachusetts
Zhen Luo, Guangdong University of Technology, Guangzhou, R.O.C.
Richard Lyons, Besser Associates, Mountain View, California
James McNames, Portland State University, Portland, Oregon
Sanjit K. Mitra, University of Southern California, Santa Barbara, California
Vincent Pellissier, The MathWorks, Inc., Natick, Massachusetts
Charles Rader, Retired, formerly with MIT Lincoln Laboratory, Lexington, Massachusetts
Sreeraman Rajan, Defence Research and Development, Ottawa, Canada
Josep Sala, Technical University of Catalonia, Barcelona, Spain
Abhishek Seth, Synopsys, Karnataka, India
Zhi Shen, Huazhong University of Science and Technology, Hubei, R.O.C.
David Shiung, MediaTek Inc., Hsin-chu, Taiwan, R.O.C.
Rainer Storn, Rohde & Schwarz GmbH & Co. KG, Munich, Germany
Li Tan, College of Engineering and Technology, Purdue University North Central, Westville, Indiana
Clay S. Turner, Pace-O-Matic, Inc., Atlanta, Georgia
Krishnaraj Varma, Hughes Network Systems, Germantown, Maryland
Vladimir Vassilevsky, Abvolt Ltd., Perry, Oklahoma
Sichun Wang, Defence Research and Development, Ottawa, Canada
Randy Yates, Digital Signal Labs Inc., Fuquay-Varina, North Carolina
Jyri Ylöstalo, Nokia Siemens Networks, Helsinki, Finland
Part One: Efficient Digital Filters
Chapter 1
Lost Knowledge Refound: Sharpened FIR Filters
Matthew Donadio
Night Kitchen Interactive
What would you do in the following situation? Let’s say you are diagnosing a DSP system problem in the field. You have your trusty laptop with your development system and an emulator. You figure out that there was a problem with the system specifications and a symmetric FIR filter in the software won’t do the job; it needs reduced passband ripple or, maybe, more stopband attenuation. You then realize you don’t have any filter design software on the laptop, and the customer is getting angry. The answer is easy: you can take the existing filter and sharpen it. Simply stated, filter sharpening is a technique for creating a new filter from an old one [1]–[3]. While the technique is almost 30 years old, it is not generally known by DSP engineers nor is it mentioned in most DSP textbooks.
Before we look at filter sharpening, let’s consider the first solution that comes to mind: filtering the data twice with the existing filter. If the original filter’s transfer function is H(z), then the new transfer function (of the H(z) filter cascaded with itself) is H(z)². For example, let’s assume the original lowpass N-tap FIR filter, designed using the Parks-McClellan algorithm [4], has the following characteristics:
Figure 1–1(a) shows the performance of the H(z) and cascaded H(z)² filters. Everything looks okay. The new filter has the same band edges, and the stopband attenuation is increased. But what about the passband? Let’s zoom in and take a look at Figure 1–1(b). The squared filter, H(z)², has larger deviations in the passband than the original filter. In general, the squaring process will:
Figure 1–1 H(z) and H(z)² performance: (a) full frequency response; (b) passband response.
It is fairly easy to explain the observed behavior if we view the relationship between H(z) and H(z)² in a slightly unconventional way. We can think of filter squaring as a function F[H(z)] operating on the H(z) transfer function. We can then plot the output amplitude of this function, H(z)², versus the amplitude of the input H(z) to visualize the amplitude change function.
The plot for F[H(z)] = H(z) is simple; the output is the input, so the result is the straight line shown in Figure 1–2. The function F[H(z)] = H(z)² is a quadratic curve. When the H(z) input amplitude is near zero, the H(z)² output amplitude is closer to zero, which means the stopband attenuation is increased with H(z)². When the H(z) input amplitude is near one, the H(z)² output amplitude is approximately twice as far from one, which means the passband ripple is increased.
Figure 1–2 Various F[H(z)] functions operating on H(z).
The squaring process improved the stopband, but degraded the passband. The improvement was a result of the amplitude change function being horizontal at zero. So to improve H(z) in both the passband and stopband, we want the F[H(z)] amplitude function to be horizontal at both H(z) = 0 and H(z) = 1 (in other words, have a first derivative of zero at these points). This results in the output amplitude changing more slowly than the input amplitude as we move away from zero and one, which lowers the ripple in these areas. The simplest function that meets these requirements is a cubic of the form
F(x) = c0 + c1x + c2x² + c3x³        (1–1)
Its derivative (with respect to x) is
F′(x) = c1 + 2c2x + 3c3x²        (1–2)
Specifying F(x) and F′(x) for the two values of x = 0 and x = 1 allows us to solve (1–1) and (1–2) for the cn coefficients as
F(0) = c0 = 0        (1–3)
F′(0) = c1 = 0        (1–4)
F(1) = c2 + c3 = 1        (1–5)
F′(1) = 2c2 + 3c3 = 0        (1–6)
Solving (1–5) and (1–6) simultaneously yields c2 = 3 and c3 = –2, giving us the function
F(x) = 3x² − 2x³        (1–7)
Stating this function as the sharpened filter Hs(z) in terms of H(z), we have
Hs(z) = 3H²(z) − 2H³(z)        (1–8)
The function Hs(z) is the dotted curve in Figure 1–2.
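These amplitude change functions are easy to check numerically. The short Python snippet below (an illustration added here, not part of the book's companion software) evaluates F[H] = H² and F[H] = 3H² − 2H³ at representative stopband and passband amplitudes:

```python
def f_square(x):
    """Amplitude change function of the cascaded filter: F[H] = H^2."""
    return x * x

def f_sharp(x):
    """Amplitude change function of the sharpened filter: F[H] = 3H^2 - 2H^3."""
    return 3.0 * x**2 - 2.0 * x**3

for x in (0.01, 0.5, 0.99):   # stopband, half-amplitude, and passband levels
    print(f"H = {x:<5}  H^2 = {f_square(x):.6f}  sharpened = {f_sharp(x):.6f}")
```

At H = 0.01 both functions push the output toward zero, but at H = 0.99 squaring roughly doubles the deviation from one (0.9801), while sharpening shrinks it to about 0.0003. Note also that an input of 0.5 maps to exactly 0.5, the half-amplitude invariance discussed below.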
Hs(z) is called the “sharpened” version of H(z). If we have a function whose z-transform is H(z), then we can outline the filter sharpening procedure, with the aid of Figure 1–3, as the following:
Figure 1–3 Filter sharpening process.
Using the sharpening process results in the improved Hs(z) filter performance shown in Figure 1–4, where we see the increased stopband attenuation and reduced passband ripple beyond that afforded by the original H(z) filter.
Figure 1–4 H(z) and Hs(z) performance: (a) full frequency response; (b) passband response.
It’s interesting to notice that Hs(z) has the same half-amplitude frequency (−6 dB point) as H(z). This condition is not peculiar to the specific filter sharpening example used here—it’s true for all Hs(z)s implemented as in Figure 1–3. This characteristic, useful if we’re sharpening a halfband FIR filter, makes sense if we substitute 0.5 for H(z) in (1–8), yielding Hs(z) = 0.5.
The filter sharpening procedure is very easy to perform and is applicable to a broad class of FIR filters, including lowpass, bandpass, and highpass FIR filters having symmetrical coefficients and even order (an odd number of taps). Even multi-passband FIR filters, under the restriction that all passband gains are equal, can be sharpened.
From an implementation standpoint, to correctly implement the sharpening process in Figure 1–3 we must delay the 3x(n) sequence by the group delay, (N − 1)/2 samples, inherent in H(z). In other words, we must time-align 3x(n) and w(n). This is analogous to the need to delay the real path in a practical Hilbert transformer. Because of this time-alignment constraint, filter sharpening is not applicable to filters having nonconstant group delay, such as minimum-phase FIR filters or infinite impulse response (IIR) filters. In addition, filter sharpening is inappropriate for Hilbert transformers, differentiating FIR filters, and filters with shaped bands such as sinc-compensated filters and raised-cosine filters, because cascading such filters corrupts their fundamental properties.
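In the coefficient domain the whole procedure collapses to convolutions plus that group-delay alignment. The following Python/NumPy sketch (added for illustration; the windowed-sinc prototype and all parameter values are stand-ins, not the book's example filter) builds a sharpened impulse response from (1–8), inserting a z^−D delay, D = (N − 1)/2, in the 3H²(z) branch so the two branches are time-aligned:

```python
import numpy as np

N = 31                      # taps in the prototype filter (odd, so D is an integer)
D = (N - 1) // 2            # group delay of H(z) in samples
fc = 0.2                    # cutoff as a fraction of the sample rate
n = np.arange(N) - D

# Stand-in prototype H(z): Hamming-windowed sinc lowpass, forced to unity DC gain
h = 2 * fc * np.sinc(2 * fc * n) * np.hamming(N)
h /= h.sum()

# Sharpen: hs corresponds to 3*z^-D*H(z)^2 - 2*H(z)^3 (z^-D aligns the branches)
h2 = np.convolve(h, h)      # impulse response of H(z)^2, length 2N-1
h3 = np.convolve(h2, h)     # impulse response of H(z)^3, length 3N-2
hs = -2.0 * h3
hs[D:D + len(h2)] += 3.0 * h2   # add the delayed, scaled H(z)^2 branch
```

The resulting hs has 3N − 2 symmetric taps, so phase linearity is preserved; its stopband peaks sit well below the prototype's, and its passband deviation from one is smaller, in keeping with Figure 1–4.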
If the original H(z) FIR filter has a nonunity passband gain, the derivation of (1–8) can be modified to account for a passband gain G, leading to a “sharpening” polynomial of
Hs,gain(z) = (3/G)H²(z) − (2/G²)H³(z)        (1–9)
Notice that when G = 1, Hs,gain(z) in (1–9) is equal to our Hs(z) in (1–8).
We’ve presented a simple method for transforming an FIR filter into one with better passband and stopband characteristics, while maintaining phase linearity. While filter sharpening may not be used often, it does have its place in an engineer’s toolbox. An optimal (Parks-McClellan-designed) filter will have a shorter impulse response than a sharpened filter with the same passband and stopband ripple, and thus be more computationally efficient. However, filter sharpening can be used whenever a given filter response cannot be modified, such as software code that makes use of an unchangeable filter subroutine. The scenario we described was hypothetical, but all practicing engineers have been in situations in the field where a problem needs to be solved without the full arsenal of normal design tools. Filter sharpening could be used when improved filtering is needed but insufficient ROM space is available to store more filter coefficients, or as a way to reduce ROM requirements. In addition, in some hardware filter applications using application-specific integrated circuits (ASICs), it may be easier to add additional chips to a filter design than it is to design a new ASIC.
1.5 REFERENCES
[1] J. Kaiser and R. Hamming, “Sharpening the Response of a Symmetric Nonrecursive Filter by Multiple Use of the Same Filter,” IEEE Trans. Acoustics, Speech, Signal Proc., vol. ASSP–25, no. 5, 1977, pp. 415–422.
[2] R. Hamming, Digital Filters, Prentice Hall, Englewood Cliffs, NJ, 1977, pp. 112–117.
[3] R. Hamming, Digital Filters, 3rd ed., Dover, Mineola, NY, 1998, pp. 140–145.
[4] T. Parks and J. McClellan, “A Program for the Design of Linear Phase Finite Impulse Response Digital Filters,” IEEE Trans. Audio Electroacoust., vol. AU–20, August 1972, pp. 195–199.
EDITOR COMMENTS
When H(z) is a unity-gain filter we can eliminate the multipliers shown in Figure 1–3. The multiply-by-two operation can be implemented with an arithmetic left-shift by one binary bit. The multiply-by-three operation can be implemented by adding a binary signal sample to a shifted-left-by-one-bit version of itself.
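The two shift-and-add tricks can be sketched as follows (a Python illustration of the editor's point; in hardware these reduce to wiring and a single adder):

```python
def times2(x: int) -> int:
    # multiply-by-2: an arithmetic left shift by one bit
    return x << 1

def times3(x: int) -> int:
    # multiply-by-3: add the sample to a shifted-left-by-one-bit copy of itself
    return (x << 1) + x
```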
To further explain the significance of (1–9), the derivation of (1–8) was based on the assumption that the original H(z) filter to be sharpened had a passband gain of one. If the original filter has a nonunity passband gain of G, then (1–8) will not provide proper sharpening; in that case (1–9) must be used as shown in Figure 1–5. In that figure we’ve included a Delay element, whose length in samples is equal to the group delay of H(z), needed for real-time signal synchronization.
Figure 1–5 Nonunity gain filter sharpening.
It is important to realize that the 3/G and 2/G² scaling factors in Figure 1–5 provide optimum filter sharpening. However, those scaling factors can be modified to some extent if doing so simplifies the filter implementation. For example, if 2/G² = 1.8, for ease of implementation, the practitioner should try using a scaling factor of 2 in place of 1.8 because multiplication by 2 can be implemented by a simple binary left-shift by one bit. Using a scaling factor of 2 will not be optimum but it may well be acceptable, depending on the characteristics of the filter to be sharpened. Software modeling will resolve this issue.
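A quick numerical check (Python, with arbitrarily chosen gain values; this snippet is an addition for illustration) confirms the properties that make (1–9) the right polynomial: an input amplitude at the passband level G maps back to G with zero slope, the origin stays pinned at zero, and setting G = 1 recovers the unity-gain case of (1–8):

```python
def f_sharp_gain(x, G):
    # gain-compensated sharpening polynomial: (3/G)x^2 - (2/G^2)x^3
    return (3.0 / G) * x**2 - (2.0 / G**2) * x**3

def slope(x, G, eps=1e-6):
    # central-difference derivative, just for checking flatness
    return (f_sharp_gain(x + eps, G) - f_sharp_gain(x - eps, G)) / (2.0 * eps)
```

For any G, f_sharp_gain(G, G) = 3G − 2G = G, and the exact derivative 6x/G − 6x²/G² vanishes at x = G, so passband levels near G are flattened just as levels near one were in the unity-gain case.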
As a historical aside, filter sharpening is a process refined and expanded by the accomplished R. Hamming (of Hamming window fame) based on an idea originally proposed by the great American mathematician John Tukey, co-inventor of the radix-2 fast Fourier transform (FFT).
Chapter 2
Quantized FIR Filter Design Using Compensating Zeros
Amy Bell
Institute for Defense Analyses
Joan Carletta
University of Akron
Kishore Kotteri
Microsoft Corp.
This chapter presents a design method for translating a finite impulse response (FIR) floating-point filter design into an FIR fixed-point multiplierless filter design. This method is simple, fast, and provides filters with high performance. Conventional wisdom dictates that finite word-length (i.e., quantization) effects can be minimized by dividing a filter into smaller, cascaded sections. The design method presented here takes this idea a step further by showing how to quantize the cascaded sections so that the finite word-length effects in one section are guaranteed to compensate for the finite word-length effects in the other section. This simple method, called compensating zeros, ensures that: (1) the quantized filter’s frequency response closely matches the unquantized filter’s frequency response (in both magnitude and phase); and (2) the required hardware remains small and fast.
Digital filter design typically begins with a technique to find double-precision, floating-point filter coefficients that meet some given performance specifications—like the magnitude response and phase response of the filter. Two well-known techniques for designing floating-point, FIR filters are the windowing method and the equiripple Parks-McClellan method [1], [2]. If the filter design is for a real-time application, then the filter must be translated to fixed-point, a more restrictive form of mathematics that can be performed much more quickly in hardware. For embedded systems applications, a multiplierless implementation of a filter is advantageous; it replaces multiplications with faster, cheaper shifts and additions. Translation to a fixed-point, multiplierless implementation involves quantizing the original filter coefficients (i.e., approximating them using fixed-point mathematics). The primary difficulty with real-time implementations is that this translation alters the original design; consequently, the desired filter’s frequency response characteristics are often not preserved.
Multiplierless filter design can be posed as an optimization problem to minimize the degradation in performance; simulated annealing, genetic algorithms, and integer programming are among the many optimization techniques that have been employed [3]. However, in general, optimization techniques are complex, can require long run times, and provide no performance guarantees. The compensating zeros technique is a straightforward, intuitive method that renders optimization unnecessary; instead, the technique involves the solution of a linear system of equations. It is developed and illustrated with two examples involving real-coefficient FIR filters; the examples depict results for the frequency response as well as hardware speed and size.
Several important figures of merit are used to evaluate the performance of a filter implemented in hardware. The quantized filter design evaluation process begins with the following two metrics: (1) the magnitude mean squared error (MSE) between the frequency responses of the quantized and unquantized filters; and (2) the total number of nonzero terms, T, in the canonic signed digit (CSD) representations of the quantized coefficients.
Once the filter has been implemented in hardware, it can be evaluated more directly. Important metrics from a hardware perspective include: hardware size; throughput (filter outputs per second); and latency (time from filter input to corresponding filter output). The relative importance of these metrics depends on the application.
The goal of quantized filter design is to achieve a small magnitude MSE while keeping the hardware costs low. In general, the higher the value of T, the closer the quantized filter coefficients are to the unquantized coefficients and the smaller the magnitude MSE. Conversely, a smaller T implies a worse magnitude MSE. Hence there is a trade-off between performance and hardware cost; T can be thought of as the parameter that controls this trade-off.
Filter designs can be implemented in hardware using various structures. The three most common structures are direct, cascade, and lattice. In general, pole-zero, infinite impulse response (IIR) filters are more robust to quantization effects when the cascade and lattice structures are employed; performance degrades quickly when the direct structure is used [1], [2].
For all-zero, FIR filters, the direct structure usually performs well (provided the zeros are not tightly clustered but are fairly evenly distributed) [1], [2]. Moreover, since most FIR filters have linear phase (the filter coefficients are symmetric), the lattice structure cannot be used because at least one reflection coefficient equals ±1. Although the direct structure is a good choice for many FIR filter implementations, the cascade structure offers at least one advantage. When an FIR filter is quantized using a direct structure, the quantization of one coefficient affects all of the filter’s zeros. In contrast, if an FIR filter is quantized with a cascade structure, the quantization of coefficients in one of the cascaded sections affects only those zeros in its section—the zeros in the other cascaded sections are isolated and unaffected. Depending on the application, it may be important to more closely approximate the unquantized locations of some zeros than others.
The compensating zeros method uses a cascade structure. However, it goes beyond a “simple quantization” technique that uniformly divvies up the given T non-zero terms in CSD format across the coefficients in the cascaded sections. The next section first illustrates a simple quantization approach for an FIR filter design using a cascade structure; then the compensating zeros method is developed and used to redesign the same FIR filter. The result is an improvement in the magnitude MSE for the same T.
Consider a lowpass, symmetric, length-19 FIR filter designed using a rectangular window. The filter has a normalized (such that a frequency of one corresponds to the sampling rate) passband edge frequency of 0.25 and exhibits linear phase. The floating-point filter coefficients are listed in Table 2–1, and Figure 2–1 shows the unquantized magnitude response of this filter.
Table 2–1 Unquantized Windowed FIR Filter Coefficients, h(n)
n       h(n)
0, 18   0.03536776513153
1, 17   −1.94908591626e–017
2, 16   −0.04547284088340
3, 15   1.94908591626e–017
4, 14   0.06366197723676
5, 13   −1.9490859163e–017
6, 12   −0.10610329539460
7, 11   1.94908591626e–017
8, 10   0.31830988618379
9       0.50000000000000
Figure 2–1 Frequency magnitude response of h(n) for the windowed FIR filter.
Figure 2–2 illustrates the pole-zero plot for h(n). To implement this filter in the cascade form, h(n) is split into two cascaded sections whose coefficients are c1(n) and c2(n). This is accomplished by distributing the zeros of h(n) between c1(n) and c2(n).
Figure 2–2 Pole-zero plot for h(n). The zeros are divided into two cascaded sections by placing the thin zeros in the first section, c1(n), and the bold zeros in the second section, c2(n).
To separate the zeros of h(n) into the two cascaded sections, the z-plane is scanned from ω = 0 to ω = π. As they are encountered, the zeros are placed alternately in the two sections. The first zero encountered is at z = 0.66e^(j0.324). This zero, its conjugate, and the two reciprocals are put in one section. The next zero, at z = 0.69e^(j0.978), its conjugate, and the reciprocal pair are placed in the other section. This proceeds until all of the zeros of the unquantized filter are divided among the two cascade sections.
The steps in this zero-allocation process are as follows: compute the roots of h(n); partition those roots into two sets of roots as described above; and determine the two sets of coefficients, c1(n) and c2(n), for the two polynomials associated with the two sets of roots. The section with fewer zeros becomes the first section in the cascade, c1(n), and the section with more zeros becomes the second section, c2(n) (this approach provides more degrees of freedom in our design method—see design rule-of-thumb number 6 in Section 2.7).
For the example, the zero allocation is illustrated in Figure 2–2 where the 8 thin zeros go to c1(n), which has length 9, and the 10 bold zeros go to c2(n), which has length 11. This method of splitting up the zeros has the advantage of keeping the zeros relatively spread out within each section, thereby minimizing the quantization effects within each section. Because complex conjugate pairs and reciprocal pairs of zeros are kept together in the same cascade section, the two resulting sections have symmetric, real-valued coefficients. The resulting floating-point cascade c1(n) and c2(n) coefficients are shown in Table 2–2.
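The zero-allocation steps above (compute roots, alternate the conjugate/reciprocal groups between sections, rebuild the section polynomials) can be sketched in NumPy. `split_cascade` is a hypothetical helper that assumes distinct zeros with none at the origin, and a toy symmetric filter stands in for the length-19 example:

```python
import numpy as np

def split_cascade(h, tol=1e-8):
    """Alternately allocate the zeros of a linear-phase FIR filter h to two
    cascade sections, keeping conjugate/reciprocal partners together.
    Sketch only: assumes distinct zeros and none at the origin."""
    z = np.roots(h)
    used = np.zeros(len(z), dtype=bool)
    groups = []
    for i in np.argsort(np.abs(np.angle(z))):   # scan from angle 0 toward pi
        if used[i]:
            continue
        grp = [i]
        used[i] = True
        for j in range(len(z)):                 # pull in conjugates/reciprocals
            if not used[j] and (abs(z[j] - np.conj(z[i])) < tol
                                or abs(z[j] - 1 / z[i]) < tol
                                or abs(z[j] - 1 / np.conj(z[i])) < tol):
                grp.append(j)
                used[j] = True
        groups.append(grp)
    idx1, idx2 = [], []
    for n, grp in enumerate(groups):            # place the groups alternately
        (idx1 if n % 2 == 0 else idx2).extend(grp)
    c1 = np.real(np.poly(z[idx1]))              # monic section polynomials
    c2 = np.real(np.poly(z[idx2]))
    k = h[0]                                    # gain factor (h's leading tap)
    return c1, c2, k

# Toy linear-phase filter built from two known symmetric sections
h = 2 * np.convolve([1, 0.5, 1.0], [1, -1.2, 1.0])
c1, c2, k = split_cascade(h)
```

Because complex conjugate and reciprocal partners stay in the same group, each recovered section has real, symmetric coefficients, and k·c1(n)*c2(n) reproduces h(n).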
Table 2–2 Unquantized Cascaded Coefficients for h(n)
Figure 2–3 depicts the block diagram corresponding to h(n) and the equivalent cascaded form. Coefficients c1(n) and c2(n) are normalized so that the first and last coefficients in each section are 1; this ensures that at least two of the coefficients in each cascade section are efficiently represented in CSD format. Consequently, it is necessary to include a gain factor, k in Table 2–2, following the cascade sections. The magnitude response of the cascaded filter is identical to Figure 2–1.
Figure 2–3 Direct form of h(n) and the equivalent cascade form using c1(n), c2(n), and k.
Now consider a fixed-point, quantized, multiplierless design of this cascade structure so that it can be implemented in fast hardware. Assume that a fixed total number of CSD terms, T, is available for representing the two unquantized cascaded sections and the unquantized gain factor. Two different techniques are considered for quantizing the filter: a simple quantization method that treats each filter section independently, and our proposed compensating zeros method, in which the quantization errors in one section are compensated for in the next section.
For the simple quantization method, in the process of distributing a fixed number of CSD terms T to a single cascade section with n coefficients, all reasonable distributions are examined. These “reasonable distributions” consider all of the “mostly uniform” T allocation schemes to n coefficients: all coefficients receive at least one CSD term and the remaining CSD terms are allocated to those coefficients that are most different (in terms of percent different) from their unquantized values. Extremely nonuniform allocation schemes (e.g., one coefficient receives all of the T and the remaining coefficients are set to zero) are not considered.
Of all the distribution schemes examined, the distribution that gives the best result (i.e., the smallest magnitude MSE) is chosen. (Note: This does not require an optimization technique; for reasonably small values of T, it is simple to organize a search that looks in the area around the floating-point coefficients, which is the only area where high-quality solutions lie). This process ensures that there is no better simple quantization scheme for the given cascaded filter.
In applying the simple quantization method to the windowed FIR filter example, the unquantized cascade coefficients, c1(n) and c2(n), are independently quantized to the simple quantized cascade coefficients, c1′(n) and c2′(n). In this example, a total of T = 25 CSD terms was chosen; this choice results in small hardware while still providing a reasonable approximation to the desired filter. Based on the relative lengths of the sections, 9 CSD terms are used for c1′(n), 14 terms are used for c2′(n), and 2 terms are used for the quantized gain factor k′. The resulting simple quantized coefficients are listed in Table 2–3 (in the CSD format, an underline indicates that the power of 2 is to be subtracted instead of added).
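To make the CSD idea concrete, here is a rough sketch of expressing one coefficient with at most T signed power-of-two terms. This greedy recoding is an illustrative assumption (a true CSD recoder additionally forbids adjacent nonzero digits), not the allocation search described above:

```python
import math

def spt_terms(x, T):
    """Greedily express x with at most T signed power-of-two terms.
    Illustrative sketch: repeatedly subtract the closest +/- 2**p. The
    'negative digit' idea (the underline in Table 2-3) corresponds to a
    term with sign -1."""
    terms, r = [], float(x)
    for _ in range(T):
        if r == 0.0:
            break
        p = round(math.log2(abs(r)))   # exponent of the nearest power of two
        s = 1 if r > 0 else -1
        terms.append((s, p))
        r -= s * 2.0 ** p
    return terms

# 0.90625 = 2^0 - 2^-3 + 2^-5: three nonzero terms, one of them subtracted
print(spt_terms(0.90625, 4))   # [(1, 0), (-1, -3), (1, -5)]
```

Each (sign, power) pair maps to one shift-and-add (or shift-and-subtract) in hardware, which is why the total term count T tracks the hardware cost.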
Table 2–3 Simple-Quantized Cascaded Coefficients for h(n), (T = 25)
The frequency response of the simple quantized filter, c1′(n) and c2′(n), is compared with that of the unquantized filter, h(n), in Figure 2–4. Although a linear phase response is retained after simple quantization (i.e., the simple quantized coefficients are symmetric), the magnitude response is significantly different from the original unquantized case.
Figure 2–4 Frequency responses of the unquantized windowed filter h(n) (dotted), simple quantization (dashed), and compensating zeros quantization (solid).
The proposed compensating zeros quantization method takes the quantization error of the first cascaded section into account when quantizing the second section. The key to this method is the desire that the frequency response of the quantized cascade structure match the frequency response of the original, unquantized direct structure.
Quantization using the compensating zeros method begins with the quantization of the first cascade section c1(n) to c1′(n) and the gain factor k to k′ (using the simple quantization method described in the previous section). Next, instead of quantizing c2(n) to c2′(n), a section ccomp(n) is computed such that c1′(n) cascaded with ccomp(n) is as close as possible to the original filter h(n). Coefficients ccomp(n) are called the compensating section, since their aim is to compensate for the performance degradation resulting from the quantization of c1(n); the computation of ccomp(n) is developed below.
If C1(z), C2(z), C1′(z), C2′(z), and Ccomp(z) are the transfer functions of c1(n), c2(n), c1′(n), c2′(n), and ccomp(n), respectively, then the transfer function of the unquantized cascaded filter H(z) can be written as

H(z) = k·C1(z)·C2(z)      (2–1)

where k is the gain factor. The transfer function of the semiquantized filter using the compensating zeros method is given by

Hsemi(z) = k′·C1′(z)·Ccomp(z)      (2–2)

Hsemi(z) is called the semiquantized filter because ccomp(n) has floating-point coefficients. The goal is for the semiquantized and unquantized transfer functions to be equal, that is, Hsemi(z) = H(z), or

k′·C1′(z)·Ccomp(z) = k·C1(z)·C2(z)      (2–3)

In (2–3), C1(z), C2(z), and k on the right-hand side are known from the unquantized cascade filter. After c1(n) and k are quantized, C1′(z) and k′ on the left-hand side are known. Thus, (2–3) can be solved for ccomp(n).
Since the 11-tap ccomp(n) is symmetric (with the first and last coefficients normalized to 1), it can be expressed in terms of only five unknowns. In general, for a length-N symmetric filter with normalized leading and ending coefficients, there are M unique coefficients, where M = ⌈(N−2)/2⌉. (The ⌈x⌉ notation means the smallest integer greater than or equal to x; if x is an integer, ⌈x⌉ = x.)
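The ceiling formula can be checked directly; this one-liner is shown only to make it concrete:

```python
import math

def unique_coeffs(N):
    """M = ceil((N - 2) / 2): unique interior coefficients of a symmetric
    length-N filter whose first and last taps are normalized to 1."""
    return math.ceil((N - 2) / 2)

print(unique_coeffs(11))   # 5 unknowns, as for the 11-tap ccomp(n)
```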
Equation (2–3) can be evaluated at M = 5 values of z to solve for the five unknowns in ccomp(n). Since the frequency response is the primary concern, these values of z are chosen on the unit circle. For the example, (2–3) is solved at the frequencies f = 0, 0.125, 0.2, 0.3, and 0.45 (i.e., z = 1, e^(j0.25π), e^(j0.4π), e^(j0.6π), e^(j0.9π)).
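Solving (2–3) at M unit-circle points reduces to a small linear system in the M unique coefficients of the symmetric ccomp(n). The sketch below stacks real and imaginary parts and uses least squares; `solve_ccomp` and its toy 3-tap sections are illustrative assumptions, not the chapter's actual filters:

```python
import numpy as np

def solve_ccomp(c1, c2, k, c1q, kq, freqs):
    """Solve (2-3) for the symmetric compensating section ccomp(n).
    ccomp has the same length N as c2, with first/last taps fixed at 1,
    leaving M = ceil((N-2)/2) unknowns matched at the given normalized
    frequencies on the unit circle. Sketch only."""
    N = len(c2)
    M = -(-(N - 2) // 2)                        # ceil((N-2)/2) unknowns
    z = np.exp(2j * np.pi * np.asarray(freqs))  # unit-circle evaluation points
    zi = z[:, None] ** -np.arange(N)            # z^-n for n = 0..N-1
    # target: Ccomp(z) must equal k C1(z) C2(z) / (k' C1'(z))
    b = (k * np.polyval(c1[::-1], 1 / z) * np.polyval(c2[::-1], 1 / z)
         / (kq * np.polyval(c1q[::-1], 1 / z)))
    b = b - (zi[:, 0] + zi[:, N - 1])           # first/last taps are fixed at 1
    # one basis column per unique interior coefficient (symmetric pairs share one)
    cols = [zi[:, m] + zi[:, N - 1 - m] if m != N - 1 - m else zi[:, m]
            for m in range(1, M + 1)]
    A = np.stack(cols, axis=1)
    a, *_ = np.linalg.lstsq(np.vstack([A.real, A.imag]),
                            np.concatenate([b.real, b.imag]), rcond=None)
    ccomp = np.ones(N)
    for m, am in enumerate(a, start=1):
        ccomp[m] = ccomp[N - 1 - m] = am
    return ccomp

# Toy example: c1's middle tap 0.9 quantized to 0.875; match at f = 0 (z = 1)
c1, c2, k = np.array([1, 0.9, 1.0]), np.array([1, 1.5, 1.0]), 0.5
c1q, kq = np.array([1, 0.875, 1.0]), 0.5
ccomp = solve_ccomp(c1, c2, k, c1q, kq, [0.0])
```

At the matched frequency the semiquantized cascade k′·C1′(z)·Ccomp(z) equals the unquantized k·C1(z)·C2(z) exactly, which is the defining property of the compensating section.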
Table 2–4 lists the computed floating-point coefficients of ccomp(n). Now that ccomp(n) has been obtained, it is quantized (using simple quantization and the remaining T) to arrive at ccomp′(n). The final, quantized filter coefficients using this compensating zeros method are also given in Table 2–4. Thus the compensating zeros quantized filter coefficients are the c1′(n) and k′ from Table 2–3 in cascade with the ccomp′(n) in Table 2–4.
Table 2–4 Unquantized ccomp(n) and Compensating Zeros-Quantized ccomp′(n) for h(n) (T = 25)
The frequency response of the compensating zeros quantized filter is compared with the unquantized filter and the simple quantized filter in Figure 2–4; the overall frequency response of the compensating zeros quantized implementation is closer to the unquantized filter than the simple quantized implementation. The small value of T = 25 employed in this example hampers the ability of even the compensating zeros quantized magnitude response to closely approximate the unquantized magnitude response. The magnitude MSE for compensating zeros quantization (7.347e–3) is an order of magnitude better than the magnitude MSE for simple quantization (4.459e–2).
For a fixed value of T, there are two ways to quantize the cascaded coefficients: simple and compensating zeros. If T is large enough, then simple quantization can achieve the desired frequency response. However, when T is restricted, compensating zeros quantization provides an alternative that outperforms simple quantization. In the example, it turns out that for T = 26, the simple quantization method can achieve the same magnitude MSE as the T = 25 compensating zeros quantization method. This improvement is achieved when the one extra T is assigned to the first cascaded section; no improvement is realized when it is assigned to the second cascaded section. There is an art to how to assign the extra T: These terms should be allocated to the coefficients—in either cascaded section—that are most different (in terms of percent different) from their unquantized values.
The compensating zeros quantization procedure is outlined as follows: (1) split the zeros of the unquantized filter h(n) into two cascaded sections, c1(n) and c2(n), with a gain factor k; (2) quantize c1(n) and k to c1′(n) and k′ using simple quantization; (3) solve (2–3) at M points on the unit circle for the floating-point compensating section ccomp(n); and (4) quantize ccomp(n), using simple quantization and the remaining CSD terms, to obtain ccomp′(n).
Although the quantized values obtained using the compensating zeros method (in Table 2–4) are not very different from the values obtained using the simple method (in Table 2–3), it is important to note that simple quantization could not possibly find these improved coefficients. In the simple quantization case, the goal is to closely approximate the floating-point c1(n) and c2(n) coefficients in Table 2–2. However, in the compensating zeros quantization case, the goal is to closely approximate a different set of floating-point coefficients—c1(n) and ccomp(n). Figure 2–5 compares the zero location plots of the compensating zeros quantized filter (bold circles) and the unquantized filter (thin circles).
Figure 2–5 Zero plot comparing the unquantized zeros of h(n) (thin circles) and the compensating zeros quantized zeros of h′(n) (bold circles).
Figure 2–5 also shows the zeros z1 and z2 in the first quadrant; recall that z1 is in c1(n) and z2 is in c2(n). The quantization of c1(n) to moves z1 away from the unit circle. Consequently, the compensating zeros quantization method moves z2 toward the unit circle (Table 2–5 shows the distances of z1 and z2 from the origin before and after quantization). This compensating movement of the quantized zeros is the reason the quantized magnitude response more closely resembles the unquantized magnitude response; it also motivates the name of the quantization method.
Table 2–5 Distance of the Zeros in Figure 2–5 from the Origin
                 |z2|         |z1|
Unquantized      0.66061185   0.69197298
Comp. quantized  0.67774476   0.644315248
The hardware performance of the simple quantized and compensating zeros quantized filters was evaluated by implementing the filters on an Altera field-programmable gate array (FPGA). For each filter, given the desired CSD coefficients, filter synthesis software, written in C, was used to generate a synthesizable description of the filter in VHDL, a hardware description language. The software automatically chooses appropriate bit widths for internal signals so that errors due to truncation and overflow are completely avoided. In this way, the implemented fixed-point filter provides exactly the same outputs as the same filter would in infinite precision, given the quantized, finite-precision filter coefficients. Fixed-point, two’s complement adders are used, and the numbers of integer and fraction bits for each hardware adder are chosen so that no round-off or overflow can occur.
The VHDL description generated is a structural one, and a high-performance, multiplierless implementation is achieved. For each section in a filter, a chain of registers is used to shift the data in, and the data is shifted in accordance with the filter coefficients before being summed. For example, if one of the filter coefficients were 18 = 2^4 + 2^1, the corresponding data word would go to two shifters and be shifted four places and one place, respectively, before being summed. A pipelined tree of carry save adders (CSAs) is used for the summation. The CSA tree produces two outputs that must be summed, or vector-merged, to produce the final filter output. For the results presented here, we use a ripple carry adder for the vector merge, taking care to exploit the special-purpose routing (carry chains) provided on Altera FPGAs to make ripple carry addition fast.
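The shift-and-add example in the text maps directly to code; multiplying by 18 needs only two shifters and one adder:

```python
def mul18(x):
    """Multiplierless multiply-by-18: 18 = 2^4 + 2^1, so the data word is
    shifted four places and one place, and the two results are summed."""
    return (x << 4) + (x << 1)

print(mul18(7))   # 126, i.e., 7 * 18
```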
Table 2–6 summarizes the hardware performance of the filters. Data formats for all signals are shown as (n, l), where n is the total number of bits, including the sign bit, and 2^l is the weight of the least significant bit. Both filters take in inputs of data format (8, 0) (i.e., eight-bit two’s complement integers). They vary in terms of their output data formats, depending on the precision of the coefficients used. Throughput is the most important performance metric; it measures how many inputs can be processed per second. Latency also bears on performance, but is less critical; it measures how many clock cycles a particular set of data takes to pass through the system, from input to corresponding output.
Table 2–6 Hardware Metrics for the Windowed FIR Example
                                      Simple quantized   Compensating zeros quantized
Hardware complexity (logic elements)  1042               996
Throughput (Mresults/second)          82.41              83.93
Latency (clock cycles)                15                 15
Input data format                     (8, 0)             (8, 0)
Output data format                    (23, −13)          (21, −13)
The results of Table 2–6 show that the compensating zeros quantized filter is slightly smaller and faster than the simple quantized filter. This is because the coefficients for the compensating zeros quantized case turn out to be slightly less wide (in terms of bits) than those for the simple quantized case, so that the adders also turn out to be slightly less wide.
Now consider another example that illustrates the cascade structure’s advantage of isolating some of the zeros in an FIR filter. The biorthogonal FIR wavelet filters are best known for their inclusion in the most recent version of the international image compression standard, JPEG2000 [4]; in this example, the 20-tap, biorthogonal, FIR, lowpass analysis wavelet filter is examined [5]. This filter has a passband edge frequency of 0.31 and exhibits linear phase. It has 9 zeros clustered at z = −1 and 10 zeros on the right-hand side of the z-plane.
For biorthogonal filters, the stopband magnitude response is characterized by the cluster of zeros at z = −1; it is advantageous to employ the cascade structure to place all of the zeros at z = −1 into an isolated cascade section. This cascade section will keep the zeros exactly at z = −1, have only integer coefficients, and require no quantization. Furthermore, this one large cascade section can be split up into smaller T-friendly sections having 4, 2, or 1 zeros to reduce the total T required. The remaining zeros on the right-hand side of the z-plane determine the passband characteristics and are separated into two cascade sections, c1(n) and c2(n), as before. Here the nine zeros at z = −1 are divided into two sections of four zeros, c3(n) and c4(n), and one section of one zero, c5(n).
Using the simple quantization method, the unquantized cascade sections c1(n) and c2(n) are quantized, but c3(n), c4(n), and c5(n) are captured exactly; a total of T = 38 CSD terms is chosen. The compensating zeros quantization method follows the procedure outlined above for c1(n) and c2(n); the remaining three cascaded sections are again captured exactly, for the same total of T = 38 CSD terms.
Figure 2–6 illustrates that the magnitude response of the compensating zeros quantized implementation is closer to the unquantized filter than the simple quantized implementation. The magnitude MSE for compensating zeros quantization (9.231e–3) is an order of magnitude better than the magnitude MSE for simple quantization (8.449e–2). For the biorthogonal FIR filter, it turns out that four extra CSD terms (T = 42) are required for the simple quantization method to achieve the same magnitude MSE as the T = 38 compensating zeros quantization method.
Figure 2–6 Frequency response of the unquantized biorthogonal filter (dotted) compared with the two quantized filters: simple quantization (dashed) and compensating zeros quantization (solid).
The following guidelines recommend how to derive the maximum benefit from the compensating zeros quantization technique.
For hardware filter implementations that use the cascade multiplierless structure, the compensating zeros quantization method outperforms simple quantization given a small fixed-value of T. The compensating zeros technique quantizes the cascaded sections so that the finite word-length effects in one section are guaranteed to compensate for the finite word-length effects in the other section. The algorithm involves no optimization—just the solution of a linear system of equations. Moreover, compensating zeros quantization ensures that: (1) the quantized filter’s frequency response closely matches the unquantized filter’s frequency response (magnitude and phase), and (2) the required hardware remains small and fast. This technique can be applied to any unquantized FIR filter and it can exploit the cascade structure’s ability to isolate some of the zeros in the filter.
2.9 REFERENCES
[1] J. Proakis and D. Manolakis, Digital Signal Processing Principles, Algorithms, and Applications, Prentice Hall, 3rd ed., 1995, pp. 500–652.
[2] A. Oppenheim, R. Schafer, and J. Buck, Discrete-Time Signal Processing, Prentice Hall, 2nd ed., 1999, pp. 366–510.
[3] C. Lim, R. Yang, D. Li, and J. Song, “Signed Power-of-Two Term Allocation Scheme for the Design of Digital Filters,” IEEE Transactions on Circuits and Systems – II: Analog and Digital Signal Proc., vol. 46, no. 5, May 1999, pp. 577–584.
[4] “T.800: Information Technology – JPEG2000 Image Coding System,” ISO/IEC 15444-1:2002. [Online: http://www.itu.int.]
[5] I. Daubechies, Ten Lectures on Wavelets, SIAM, 1992, p. 277.