Digital Design of Signal Processing Systems discusses a spectrum of architectures and methods for effective implementation of algorithms in hardware (HW). Encompassing all facets of the subject, this book includes conversion of algorithms from floating-point to fixed-point format, parallel architectures for basic computational blocks, Verilog Hardware Description Language (HDL), SystemVerilog and coding guidelines for synthesis. The book also covers system-level design of Multi Processor System on Chip (MPSoC), and considers different design methodologies, including Network on Chip (NoC) and Kahn Process Network (KPN) based connectivity among processing elements. A special emphasis is placed on implementing streaming applications like a digital communication system in HW. Several novel architectures for implementing commonly used algorithms in signal processing are also revealed. With a comprehensive coverage of topics, the book provides an appropriate mix of examples to illustrate the design methodology.
Key Features:
* A practical guide to designing efficient digital systems, covering the complete spectrum of digital design from a digital signal processing perspective
* Provides a full account of HW building blocks and their architectures, while also elaborating effective use of embedded computational resources such as multipliers, adders and memories in FPGAs
* Covers a system level architecture using NoC and KPN for streaming applications, giving examples of structuring MATLAB code and its easy mapping in HW for these applications
* Explains state machine based and Micro-Program architectures with comprehensive case studies for mapping complex applications
The techniques and examples discussed in this book are used in the award winning products from the Center for Advanced Research in Engineering (CARE). Software Defined Radio, 10 Gigabit VoIP monitoring system and Digital Surveillance equipment each won APICTA (Asia Pacific Information and Communication Alliance) awards in 2010 for their unique and effective designs.
Page count: 812
Year of publication: 2011
Contents
Cover
Title Page
Copyright
Preface
Acknowledgments
Chapter 1: Overview
1.1 Introduction
1.2 Fueling the Innovation: Moore's Law
1.3 Digital Systems
1.4 Examples of Digital Systems
1.5 Components of the Digital Design Process
1.6 Competing Objectives in Digital Design
1.7 Synchronous Digital Hardware Systems
1.8 Design Strategies
Chapter 2: Using a Hardware Description Language
2.1 Overview
2.2 About Verilog
2.3 System Design Flow
2.4 Logic Synthesis
2.5 Using the Verilog HDL
2.6 Four Levels of Abstraction
2.7 Verification in Hardware Design
2.8 Example of a Verification Setup
2.9 SystemVerilog
Chapter 3: System Design Flow and Fixed-point Arithmetic
3.1 Overview
3.2 System Design Flow
3.3 Representation of Numbers
3.4 Floating-point Format
3.5 Qn.m Format for Fixed-point Arithmetic
3.6 Floating-point to Fixed-point Conversion
3.7 Block Floating-point Format
3.8 Forms of Digital Filter
References
Chapter 4: Mapping on Fully Dedicated Architecture
4.1 Introduction
4.2 Discrete Real-time Systems
4.3 Synchronous Digital Hardware Systems
4.4 Kahn Process Networks
4.5 Methods of Representing DSP Systems
4.6 Performance Measures
4.7 Fully Dedicated Architecture
4.8 DFG to HW Synthesis
Chapter 5: Design Options for Basic Building Blocks
5.1 Introduction
5.2 Embedded Processors and Arithmetic Units in FPGAs
5.3 Instantiation of Embedded Blocks
5.4 Basic Building Blocks: Introduction
5.5 Adders
5.6 Barrel Shifter
5.7 Carry Save Adders and Compressors
5.8 Parallel Multipliers
5.9 Two's Complement Signed Multiplier
5.10 Compression Trees for Multi-operand Addition
5.11 Algorithm Transformations for CSA
Chapter 6: Multiplier-less Multiplication by Constants
6.1 Introduction
6.2 Canonic Signed Digit Representation
6.3 Minimum Signed Digit Representation
6.4 Multiplication by a Constant in a Signal Processing Algorithm
6.5 Optimized DFG Transformation
6.6 Fully Dedicated Architecture for Direct-form FIR Filter
6.7 Complexity Reduction
6.8 Distributed Arithmetic
6.9 FFT Architecture using FIR Filter Structure
References
Chapter 7: Pipelining, Retiming, Look-ahead Transformation and Polyphase Decomposition
7.1 Introduction
7.2 Pipelining and Retiming
7.3 Digital Design of Feedback Systems
7.4 C-slow Retiming
7.5 Look-ahead Transformation for IIR filters
7.6 Look-ahead Transformation for Generalized IIR Filters
7.7 Polyphase Structure for Decimation and Interpolation Applications
7.8 IIR Filter for Decimation and Interpolation
References
Chapter 8: Unfolding and Folding of Architectures
8.1 Introduction
8.2 Unfolding
8.3 Sampling Rate Considerations
8.4 Unfolding Techniques
8.5 Folding Techniques
8.6 Mathematical Transformation for Folding
8.7 Algorithmic Transformation
References
Chapter 9: Designs based on Finite State Machines
9.1 Introduction
9.2 Examples of Time-shared Architecture Design
9.3 Sequencing and Control
9.4 Algorithmic State Machine Representation
9.5 FSM Optimization for Low Power and Area
9.6 Designing for Testability
9.7 Methods for Reducing Power Dissipation
References
Chapter 10: Micro-programmable State Machines
10.1 Introduction
10.2 Micro-programmed Controller
10.3 Counter-based State Machines
10.4 Subroutine Support
10.5 Nested Subroutine Support
10.6 Nested Loop Support
10.7 Examples
References
Chapter 11: Micro-programmed Adaptive Filtering Applications
11.1 Introduction
11.2 Adaptive Filter Configurations
11.3 Adaptive Algorithms
11.4 Channel Equalizer using NLMS
11.5 Echo Canceller
11.6 Adaptive Algorithms with Micro-programmed State Machines
References
Chapter 12: CORDIC-based DDFS Architectures
12.1 Introduction
12.2 Direct Digital Frequency Synthesizer
12.3 Design of a Basic DDFS
12.4 The CORDIC Algorithm
12.5 Hardware Mapping of Modified CORDIC Algorithm
References
Chapter 13: Digital Design of Communication Systems
13.1 Introduction
13.2 Top-Level Design Options
13.3 Typical Digital Communication System
References
Index
This edition first published 2011
© 2011 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does not warrant the accuracy of the text or exercises in this book. This book's use or discussion of MATLAB® software or related products does not constitute endorsement or sponsorship by The MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
Library of Congress Cataloguing-in-Publication Data
Khan, Shoab Ahmed.
Digital design of signal processing systems : a practical approach / Shoab Ahmed Khan.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-74183-2 (cloth)
1. Signal processing–Digital techniques. I. Title.
TK5102.9.K484 2010
621.382'2–dc22
2010026285
A catalogue record for this book is available from the British Library.
Print ISBN: 9780470741832 [HB]
ePDF ISBN: 9780470974698
oBook ISBN: 9780470974681
ePub ISBN: 9780470975251
Preface
Practising digital signal processing and digital system design for many years, and introducing and then developing the contents of courses at undergraduate and graduate levels, tempted me to write a book that would cover the entire spectrum of digital design from the signal processing perspective. The objective was to develop the contents such that a student, after taking the course, would be productive in an industrial setting in different roles. He or she could be a good algorithm developer, a digital designer and a verification engineer. An associated website (www.drshoabkhan.com) hosts RTL Verilog code of the examples in the book. Readers can also download PDF files of Microsoft PowerPoint presentations of lectures covering the material in the book. The lab exercises are provided for teachers' support.
The contents of this book show how to code algorithms in high-level languages in a way that facilitates their subsequent mapping on hardware-specific platforms. The book covers issues in implementing algorithms using fixed-point format. The ultimate conversion of algorithms developed in double-precision floating-point format to fixed-point is a critical design stage in system implementation. The conversion not only requires simple translation but in many cases also requires the designer to explore other structural options for mitigating quantization effects of fixed-point implementation. A number of commercially available system design and simulation tools provide support for fixed-point conversion and simulation. The MATLAB® fixed-point toolbox and utilities are important, and so is the support extended for fixed-point arithmetic in other high-level description languages such as SystemC. The issues of overflow, saturation, truncation and rounding are critical. The normalization and block floating-point option to optimize implementation should also be learnt. Chapter 3 covers all these issues and demonstrates the methodology with examples.
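To make the four issues named above concrete, the following is a minimal Python sketch of quantizing a real value to a signed fixed-point code. It is an illustration only, not the book's code: the function names `to_fixed` and `to_real` are hypothetical, and it assumes a Qn.m convention in which n counts the integer bits including the sign bit and m counts the fractional bits.

```python
import math


def to_fixed(value, n, m, rounding="round", overflow="saturate"):
    """Quantize a real value to a signed two's complement Qn.m integer code.

    Assumed convention: n integer bits (including the sign bit) and
    m fractional bits, so the word is n + m bits wide.
    """
    scaled = value * (1 << m)
    if rounding == "round":
        code = math.floor(scaled + 0.5)  # round to nearest, ties upward
    else:
        code = math.floor(scaled)  # truncation: drop the low-order bits
    total_bits = n + m
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    if overflow == "saturate":
        code = max(lowest, min(highest, code))  # clamp to the word range
    else:
        # Wrap-around, which is what plain two's complement hardware does
        code = (code - lowest) % (1 << total_bits) + lowest
    return code


def to_real(code, m):
    """Convert a Qn.m integer code back to the real value it represents."""
    return code / (1 << m)


if __name__ == "__main__":
    for x in (0.5, -0.3, 1.0):
        saturated = to_fixed(x, 1, 15)
        wrapped = to_fixed(x, 1, 15, overflow="wrap")
        print(x, saturated, to_real(saturated, 15), wrapped)
```

In Q1.15 the value 1.0 is not representable: with saturation it clamps to the largest code, 32767, whereas with wrap-around it flips to the most negative code, -32768. That sign flip is why the overflow mode matters so much in signal processing datapaths.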
The next step in system design is to perform HW–SW partitioning. Usually this decision is made by an experienced designer. Chapters 1 and give broad rules that help the designer to make this decision. The portion that is set aside for mapping in hardware is then explored for several architectural design options. Different ways of representing algorithms and their coding in MATLAB® are covered in Chapter 4. The chapter also covers mapping of the graphical representation of an algorithm on fully dedicated hardware.
Following the discussion on fully dedicated architectures, Chapter 5 lists designs of basic computational blocks. The chapter also highlights the architecture of embedded computational units in FPGAs and their effective use in the design. This discussion logically extends to algorithms that require multiplications with constants.
Chapter 6 gives an account of architectural optimization for designs that involve multiplications with constants. Depending on the throughput requirement and the target technology that constrains the clock rate, the architectural design decisions are made. Mapping an application written in a high-level language to hardware requires insight into the algorithm. Usually signal processing applications use nested loops. Unfolding and folding techniques are presented in Chapter 8. These techniques are discussed for code written in high-level languages and for algorithms that are described graphically as a dataflow graph (DFG). Chapter 4 covers the representation of algorithms as dataflow graphs. Different classes of DFGs are discussed. Many top-level design options are also discussed in Chapter 4 and Chapter 13. These options include a peer-to-peer KPN-connected network, shared bus-based design, and network-on-chip (NoC) based architectures. The top-level design is critical in overall performance, easy programmability and verification.
In Chapter 13 a complex application is considered as a network of connected processing elements (PEs). The PEs implement the functionality in an algorithm whereas the interconnection framework provides inter-PE communication. Issues of different scheduling techniques are discussed. These techniques affect the requirements of buffers between two connected nodes.
While discussing the hardware mapping of functionality in a PE, several design options are considered. These options include fully dedicated architecture (Chapter 4 and Chapter 6), parallel and unfolded architectures (Chapter 8), folded and time-shared architectures (Chapter 8 and Chapter 9) and programmable instruction set architectures (Chapter 10). Each architectural design option is discussed in detail with examples. Tradeoffs are also specified for the designer to gauge preferences of one over the other. Special consideration is given to the target platform. Examples of FPGAs with embedded blocks of multipliers with a fixed set of registers are discussed in Chapter 5.
Mapping of an algorithm in hardware must take into account the target technology. Novel methodologies for designing optimal architectures that meet stringent design constraints while keeping in perspective the target technology are elaborated. For a time-shared design, systolic and simple folded architectures are covered. Intricacies in folding a design usually require a dedicated controller to schedule operands on a shared HW resource. The controller is implemented as a finite state machine (FSM). FSM representations and designs are covered in Chapter 9. The chapter gives design examples. The testing of complex FSMs requires a lot of thought. Different coverage metrics are listed. Techniques are described that ensure maximum path coverage for effective testing of FSMs. For many complex applications, the designer has an option to define an instruction set that can effectively implement the application. A micro-programmed state machine design is covered in Chapter 10. Design examples are given to demonstrate the effectiveness of this design option. The designs are coded in RTL Verilog. The designer must know the coding rules and RTL guidelines from a synthesis perspective. Verilog HDL is covered, with mention of the guidelines for effective coding, in Chapter 2. This chapter also gives a brief description of SystemVerilog, which primarily facilitates testing and verification of the design. It also helps in modeling and simulating a system at higher levels of abstraction, especially at the transaction level. Features of SystemVerilog that help in writing an effective stimulus are also given. For many examples, the RTL Verilog code is also listed with synthesis results. The book also provides an example of a communication receiver.
Two case studies of designs are discussed in detail. Chapter 11 presents an instruction set for implementing an adaptive algorithm for computationally intensive applications. Several architectural options are explored that trade off area with performance. Chapter 12 explores design options for a CORDIC-based DDFS algorithm. The chapter provides a MATLAB® implementation of the basic CORDIC algorithm, and then explores fully parallel and folded architectures for implementing the CORDIC algorithm.
The book presents novel architectures for signal processing applications. Chapter 7 presents novel IIR filter-based decimation and interpolation designs. IIR filters are traditionally not used in these applications because for computing the current output sample they require previous output samples, so all samples need to be computed. This requires running the design at a faster clock for decimation and interpolation applications. The transformations are defined that only require an IIR filter to compute samples at a slower rate in decimation and interpolation applications.
In Chapter 12, the design of a DDFS based on the CORDIC algorithm is given. The chapter also presents a complete working of a novel design that requires only one stage of a CORDIC element and computes sine and cosine values. Then in Chapter 13 a novel design of time-shared and systolic AES architecture is presented. These architectures transform the AES algorithm to fit in an 8-bit datapath. Several innovative techniques are used to reduce the hardware complexity and memory requirements while enhancing the throughput performance of the design. Similarly, novel architectures for massively parallel data compression applications are also covered.
The book can be adopted for a number of courses at senior undergraduate and graduate levels. It can be used for a senior undergraduate course on Advanced Digital Design and VLSI Signal Processing. Similarly the contents can be selected in a way to form a graduate level course in these two subjects.
Acknowledgments
I started my graduate studies at the Georgia Institute of Technology, Atlanta, USA, in January 1991. My area of specialization was digital signal processing. At the institute most of the core courses in signal processing were taught by teachers who had authored textbooks on the subjects. Dr Ronald W. Schafer taught “digital signal processing” using Discrete Time Signal Processing by Oppenheim, Schafer and Buck. Dr Monson H. Hayes taught “advanced signal processing” using his book, Statistical Digital Signal Processing and Modeling. Dr Vijay K. Madisetti taught from his book, VLSI Digital Signal Processing, and Dr Russell M. Mersereau taught from Multi Dimensional Signal Processing by Dudgeon and Mersereau. During the semester when I took a course with Dr Hayes, he was in the process of finalizing his book and would give different chapters of the draft to students as text material. I would always find my advisor, Dr Madisetti, burning the midnight oil while working on his new book.
The seed of desire to write a book in my area of interest was sowed in my heart in those days. After my graduation I had several opportunities to work on real projects in some of the finest engineering organizations in the USA. I worked for Ingersoll Rand, Scientific Atlanta, Picture Tel and Cisco Systems. I returned to Pakistan in January 1997 and started teaching in the Department of Computer Engineering at the College of Electrical and Mechanical Engineering (E&ME), National University of Sciences and Technology (NUST), while I was still working for Cisco Systems.
In September 1999, along with two friends in the USA, Raheel Ahmed Khan and Sherjil Ahmed, I founded a startup company called Communications Enabling Technologies (or CET, later named Avaz Networks Inc. USA). Raheel Khan is a genuine designer of digital systems. Back in 1999 and 2000, we designed a few systems, and technical discussions with him further increased my liking and affection for the subject, along with strengthening my comprehension of it. I was serving as CTO of the company and also heading the R&D team in Pakistan.
In 2001 we secured US$17 million in venture funding and embarked on developing what was at that time the highest density media processor system-on-chip (MPSoC) solution for VoIP carrier-class equipment. The single chip could process 2014 channels of VoIP, performing DTMF generation and detection, line echo cancellation (LEC), and voice compression and decompression on all these channels. I designed the top-level architecture of the chip and all instruction set processors, and headed a team of 160 engineers and scientists to implement the system. Designing, implementing and testing such a complex VLSI system helped me to understand the intricacies of the field of digital design. We were able to complete the design cycle in the short period of 10 months.
At the time we were busy designing the chip, I, together with my friends at Avaz, also founded the Center for Advanced Studies in Engineering (CASE), a postgraduate engineering program in computer engineering. I introduced the subject of “advanced digital design” at CASE and NUST. I would always teach this subject, and to compose the course contents I would collect material from research papers, reference books and the projects we were undertaking at Avaz. However, I could not find a book that did justice to this emerging field.
So it was in 2002 that I found myself compelled to write a textbook on this subject. In CASE all lectures are recorded on videos for its distance-learning program. I asked Sandeela Sameem, my student and an intern at Avaz, to make a Microsoft PowerPoint presentation from the design I drew and text I wrote on the whiteboard. These presentations served as the initial material for me to start writing the book. The course was offered once every year and in that semester I would write a little on a few topics, and would give the material to my students as reference.
In 2004, together with my core team from CET, I founded a research organization called the Center for Advanced Research in Engineering (CARE), and since its inception I have been managing the organization as CEO. At CARE, I have had several opportunities to participate in the design of machine vision, network analysis and digital communication systems. The techniques and examples discussed in this book are used in the award winning products from CARE. Software Defined Radio, 10 Gigabit VoIP monitoring system and Digital Surveillance equipment received APICTA (Asia Pacific Information and Communication Alliance) awards in September 2010 for their unique and effective designs.
My commitments as a professor at NUST and CASE, and as CEO of CARE did not allow me enough time to complete my dream of writing a book on the subject but my determination did not falter. Finally in March of 2008 I forwarded my proposal to John Wiley and Sons. In July 2008, I formally started work on the book.
I have been fortunate to have motivated students to help me in formatting the text and drawing the figures. Initially it was my PhD student, Fozia Noor Khan, who took pains to put the material in a good format and convert my hand-drawn figures into Visio images. Later this task was taken over by Hussnain Ali. He also read the text and highlighted areas that might need my attention. Finally it was my assistant, Shaista Zainab, who took over helping me in putting the manuscript in order. Also, many of my students helped in exploring areas. Among them were Zaheer Ahmed, Mohammad Mohsin Rahmatullah, Sheikh M. Farhan, Ummar Farooq and Rizwana Mehboob, who completed their PhDs under my co-supervision or supervision. There are almost 70 students who worked on their MS theses under my direct supervision working on areas related to digital system design. I am also deeply indebted to Dr Faisal Durbai for spending time reading a few chapters and suggesting improvements. My research associates, Usman Akram and Sajid, also extended their help in giving a careful reading of the text.
I should like to acknowledge a number of young and enthusiastic engineers who opted to work for us in Avaz and CARE, contributing to the development of several first-time-right ASICs and FPGA-based complex systems. Their hard work and dedication helped to enlighten my approach to the subject. A few names I should like to mention are Imran Qasir (IQ), Rahan Hameed, Nuaman, Mobeen, Hassan, Aeman Bukhari, Mahreen, Sadia, Hamza, Fahad Ali Mujahaid, Usman, Asim Munawar, Aysha Khalid, Alina Mufti, Hammood, Wajahat, Arsalan, Durdana, Shahbaz, Masood, Adnan, Khalid, Rabia Anwar and Javaria.
Thanks go to my colleagues for keeping me motivated to complete the manuscript: Dr Habibullah Jamal, Dr Shamim Baig, Dr Abdul Khaliq, Hammad Khan, Dr Muhammad Younis Javed, Asrar Ashraf, Dr Akhtar Nawaz Malik, Gen Muhammad Shahid, Dr Farrukh Kamran, Dr Saeed Ur Rahman, Dr Ismail Shah and Dr Sohail Naqvi.
Last but not least, I thank my parents, brother, sisters and my wife Nuzhat, who has given me support all through the years with love and compassion. My sons Hamza, Zaid and Talha grew from toddlers to teenagers seeing me taking on and then working on this book. My daughter Amina has consistently asked when I would finish the book!
Chapter 1
Overview
No exponential is forever … but we can delay “forever”
Gordon Moore
1.1 Introduction
This chapter begins from the assertion that the advent of VLSI (very large scale integration) has enabled solutions to intractable engineering problems. Gordon Moore predicted in 1965 the rate of development of VLSI technology, and the industry has indeed been developing newer technologies riding on his predicted curve. This rapid advancement has led to new dimensions in the core subject of VLSI. The capability to place billions of transistors in a small silicon area has tested the creativity of engineers and scientists around the world. The subject of digital design for signal processing systems embraces these new challenges. VLSI has revolutionized the commercial market, with products regularly appearing with increasing computational power, improved battery life and reduced physical size.
This chapter discusses several applications. The focus of the book is on applications primarily in areas of signal processing, multimedia, digital communication, computer networks and data security. Some of the applications are shown in Figure 1.1.
Figure 1.1 VLSI technology plays a critical role in realizing real-time signal processing systems
Multimedia applications have had a dramatic impact on our lives. Multimedia access on handheld devices such as mobile phones and digital cameras is a direct consequence of this technology.
Another area of application is high-data-rate communication systems. These systems have enormous real-time computational requirements. A modern mobile phone, for example, executes several complex algorithms, including speech compression and decompression, forward error-correction encoding and decoding, highly complex modulation and demodulation schemes, up-conversion and down-conversion of modulated and received signals, and so on. If these are implemented in software, the amount of real-time computation may require the power of a supercomputer. Advancement in VLSI technology has made it possible to conveniently accomplish the required computations in a hand-held device. We are also witnessing the dawn of new trends like wearable computing, owing much to this technology.
Broadband wireless access technology, processing many megabits of information per second, is another impressive display of the technology, enabling mobility to almost all the services currently running on desktop computers. The technology is also at work in spacecraft and satellites in space imaging applications.
The technology is finding uses in biomedical equipment, examples being digital production of radiographic and ultrasound images, and implantable devices such as the cardioverter defibrillator that acquires and digitizes heartbeats, detects any rhythmic abnormalities and symptoms of sudden cardiac arrest and applies an electric shock to help a failing heart.
This chapter selects a mobile communication system as an example to explain the design partitioning issues. It highlights that digital design is effective for mapping structured algorithms in silicon. The chapter also considers the design of a backplane of a high-end router to reveal the versatility of the techniques covered in this book to solve problems in related areas where performance is of prime importance.
The design process has to explore competing design objectives: speed, area, power, timing and so on. There are several mathematical transformations to help with this. Keeping in perspective the defined requirement specifications, transformations are applied that trade off less relevant design objectives against the other more important objectives. That said, for complex design problems these mathematical transformations are of less help, so an effective approach requires learning several ‘tricks of the trade’. This book aims to introduce the transformations as well as giving tips for effective design.
The chapter highlights the impact of the initial ideas on the entire design process. It explains that the effect of design decisions diminishes as the design proceeds from concept to implementation. It establishes the rationale for the system architect to steer the design process in the right direction by selecting the best option in the multidimensional design space. The chapter explores the spectrum of design options and technologies available to the designer. The design options range from the most flexible general-purpose computing machine, such as the Pentium, to commercially available off-the-shelf digital signal processors (DSPs), to more application-specific instruction-set processors, to hard-wired application-specific designs yielding best performance without any consideration of flexibility in the solution. The chapter describes the target technologies on which the solution can be mapped, like general-purpose processors (GPPs), DSPs, application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs). It is established that, for complex applications, an optimal solution usually consists of a mix of these target technologies.
This chapter presents some design examples. The rationale for design decisions for a satellite burst modem receiver is described. There is a brief overview of the design of the backplane of a router. There is an explanation of the design of a network-on-chip (NoC) carrier-class VoIP media gateway. These examples follow a description of the trend from digital-only design to mixed-signal system-on-chips (SoCs). The chapter considers synchronous digital circuits where digital clocks are employed to make all components operate synchronously in implementing the design.
1.2 Fueling the Innovation: Moore's Law
Advancements in VLSI over a few decades have played a critical role in realizing the amazing electronic gadgets we live with today. Gordon Moore, co-founder of Intel, predicted the rapid rate of these advancements early on. In 1965 he noted that the number of transistors on a chip was doubling roughly every year, a rate that was later restated as a doubling every 18 to 24 months. Figure 1.2a shows the predicted curve known as Moore's Law from his original paper [1]. This ‘law’ has fueled innovation for five decades. Figure 1.2b shows Intel's response to his prediction.
Figure 1.2 (a) The original prediction of Moore's Law. (b) Intel's response to Moore's prediction
Moore acknowledges that the trend cannot last forever, and he gave a presentation at an international conference, entitled “No exponential is forever, but we can delay ‘forever’” [2]. Intel has plans to continue riding on the Moore's Law curve for another ten years and has announced a 2.9 billion-transistor chip for the second quarter of 2011. The chip will fit into an area the size of a fingernail and use 22-nanometer technology [3]. For comparison, the Intel 4004 microprocessor introduced in 1971 was based on a 10 000-nanometer process.
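As a rough check on these figures, the short Python sketch below works out the average doubling period implied by the two endpoints. The 2.9 billion transistor count and both process sizes come from the text above; the starting count of about 2300 transistors for the Intel 4004 is a commonly cited figure and is an assumption here, not something this section states. The function name `doubling_period_years` is hypothetical.

```python
import math


def doubling_period_years(count_start, count_end, years):
    """Average time per doubling for exponential growth between two counts."""
    doublings = math.log2(count_end / count_start)
    return years / doublings


# Assumption: about 2300 transistors on the Intel 4004 (1971).
# From the text: a 2.9 billion-transistor chip announced for 2011.
period_years = doubling_period_years(2300, 2.9e9, 2011 - 1971)

# Linear feature-size shrink between the two processes quoted in the text.
linear_shrink = 10_000 / 22

if __name__ == "__main__":
    print(f"average doubling period: {period_years * 12:.1f} months")
    print(f"linear shrink factor: {linear_shrink:.0f}x")
```

Under that assumption the transistor count doubles about twenty times in forty years, which is close to one doubling every two years and therefore at the slow end of the 18 to 24 month range.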
Integration at this scale promises enormous scope for designers and developers, and the development of design tools has matched the pace. These tools provide a level of abstraction so that the designer can focus more on higher level design concepts rather than low-level details.
1.3 Digital Systems
1.3.1 Principles
To examine the scope of the subject of digital design, let us consider an embedded signal processing system of medium complexity. Usually such a system consists of heterogeneous physical devices such as a GPP or micro-controller, multiple DSPs, a few ASICs, and FPGAs. An application implemented on such a system usually consists of numerous tasks of varying computational complexity. These tasks are mapped on to the physical devices. The decision to implement a particular task on a particular device is based on the computational complexity of the task, its code density, and the communication it requires with other tasks.
The computationally intensive (‘number-crunching’) tasks of the application can be further divided into categories. The tasks for which commercial off-the-shelf ASICs are available are best mapped on these devices. ASICs are designed to perform specific functions of a particular application and, unlike GPPs, are not programmable. Based on the target technology, ASICs are of different types, examples of which are full-custom, standard-cell-based, gate-array-based, channeled gate array, channel-less gate array, and structured gate array. As these devices are application-specific they are optimized using integrated-circuit manufacturing process technology. These devices offer low cost and low power consumption. There are many benefits to using ASICs, but because of their fixed implementation a design cannot easily be upgraded.
It is important to point out that several applications implement computationally intensive but non-standard algorithms. The designer, for these applications, may find that mapping the entire application on FPGAs is the only option for effective implementation. For applications that consist of standard as well as non-standard algorithms, the computationally intensive tasks are further divided into two groups: structured and non-structured. The tasks in the structured group usually consist of code that has loops or nested loops with a few instructions being repeated a number of times, whereas the tasks in the non-structured group implement more code-intensive components. The structured tasks are effectively mapped on FPGAs, while the non-structured parts of the algorithm are implemented on a DSP or multiple DSPs.
A field-programmable gate array (FPGA) comprises a matrix of configurable logic blocks (CLBs) embedded in a network of programmable interconnects. The FPGA synthesis tools provide a method of programming the configurable logic and the interconnects. FPGAs are bought off the shelf: Xilinx [4], Altera [5], Atmel [6], Lattice Semiconductor [7], Actel [8] and QuickLogic [9] are some of the prominent vendors. Xilinx holds more than 50% of the programmable logic device (PLD) segment of the semiconductor industry.
FPGAs offer design reuse, and better performance than a software solution mapped on a DSP or GPP. They are, however, more expensive, and they deliver lower performance and consume more power than an equivalent ASIC solution, if one exists. The DSP, on the other hand, is a microprocessor whose architecture is specially designed to support number-crunching signal processing applications. Usually a DSP can perform many multiplication and addition operations and supports special addressing modes that help in effective implementation of fast Fourier transform (FFT) and convolution algorithms.
The GPPs or microcontrollers are general-purpose computing machines. Types are ‘complex instruction set computer’ (CISC) and ‘reduced instruction set computer’ (RISC).
The tasks specific to user interfaces, control processes and other code-intensive protocols are usually mapped on GPPs or microcontrollers. For handling multiple concurrent tasks, events and interrupts, the microcontroller runs a real-time operating system. The GPP is also good at performing general tasks like configuring various devices in the system and interfacing with external devices. The microcontroller or GPP performs the job of a system controller. For systems of medium complexity, it is connected to a shared bus. The processor configures the ASICs and FPGAs, and also bootstraps the DSPs in the system. A high-speed bus like the AMBA Advanced High-performance Bus (AHB) is used in these systems [10]. The shared-bus protocol allows only one master to transfer data at a time. For designs that require parallel transfer of data, a multi-layer shared bus like Multi-Layer AHB (ML-AHB) is used [11]. The microcontroller also interfaces with the external displays and control panels.
The digital design of a digital communication system interfaces with the RF front end. For voice-based applications the system also contains a CODEC (more in Chapter 12) with associated analog interfaces. The FPGAs in the system also provide glue logic and interfaces with other devices in the system. There may also be dual-port RAM to provide shared memory to multiple DSPs in the system. A representative system is shown in Figure 1.3.
Figure 1.3 An embedded signal processing system with DSPs, FPGAs, ASICs and GPP
1.3.2 Multi-core Systems
Many applications are best mapped on general-purpose processors. As high-end computing applications demand more and more computational power in programmable devices, the vendors of GPPs are incorporating multiple cores of GPPs in a single SoC configuration. Almost all the vendors of GPPs, such as Intel, IBM, Sun Microsystems and AMD, are now placing multiple cores on a single chip for improved performance and high reliability. Examples are Intel's Yorkfield 8-core chip in 45-nm technology, Intel's 80-core teraflop processor, Sun's Rock 8-core CPU, Sun's UltraSPARC T1 8-core CPU, and IBM's 8-core POWER7. These multi-core solutions also offer the necessary abstraction, whereby the programmer need not be concerned with the underlying complex architecture, and software development tools have been produced that partition and map applications on these multiple cores. This trend is continuously adding complexity to digital design and software tool development. From the digital design perspective, the processors in a multiprocessor-based system are required to communicate with each other, and the inter-processor connections need to be scalable and expandable. The network-on-chip (NoC) design paradigm addresses issues of scalability of on-chip connectivity and inter-processor communication.
1.3.3 NoC-based MPSoC
Besides GPP-based multi-core SoCs for mapping general computing applications, there also exist other application-specific SoC solutions. An SoC integrates all components of a system on a single chip. That includes the microprocessor, application-specific accelerators, all interfaces to memory and peripheral devices, and so on.
Most high-end signal processing applications offer an inherent parallelism. To exploit this parallelism, these systems are mapped on multiple heterogeneous processors. Traditionally these processors are connected with shared memories on shared buses. As complex designs are integrating an increasing number of processors on a single multiprocessor SoC (MPSoC) [12], designs based on a shared bus are not effective owing to complex arbitration, clock skews and latency issues. These designs require scalable and effective communication infrastructure. An NoC offers a good solution to these problems [13]. The NoC provides higher bandwidth, low latency, modularity, scalability, and a high level of abstraction to the system. The complex bus protocols route wires to connect various components, whereas an NoC uses packet-based protocols to provide connectivity among components. The NoC enables parallel transactions of data.
The basic architecture of an NoC is shown in Figure 1.4. Each processing element (PE) is connected to an on-chip router via a network interface (NI) controller.
Figure 1.4 An NoC-based heterogeneous multi-core SoC design
Many vendors are now using NoC to integrate multiple PEs on a single chip. A good example is the use of NoC technology in the PlayStation 3 (PS3) system by Sony Entertainment. A detailed design of an NoC-based system is given in Chapter 13.
1.4 Examples of Digital Systems
1.4.1 Digital Receiver for a Voice Communication System
A typical digital communication system for voice, such as a GSM mobile phone, executes a combination of algorithms of various types. A system-level block representation of these algorithms is shown in Figure 1.5. These algorithms fall into the following categories.
1. Code-intensive algorithms. These do not have repeated code. This category consists of code for phone book management, keyboard interface, GSM (Global System for Mobile) protocol stack, and the code for configuring different devices in the system.
Figure 1.5 Algorithms in a GSM transmitter and receiver and their mapping on to conventional target technologies consisting of ASIC, FPGA and DSP
2. Structured and computationally intensive algorithms. These mostly take the form of loops in software and are excellent candidates for hardware mapping. These algorithms consist of digital up- and down-conversion, demodulation and synchronization loops, and forward error correction (FEC).
3. Code-intensive and computationally intensive algorithms. These lack any regular structure. Although their implementations do have loops, they are code-intensive. Speech compression is an example.
A GSM phone is an interesting example of a modern electronic device, as most such devices implement applications that comprise these three types of algorithm. Other examples are DVD players, digital cameras and medical diagnostic systems.
The mapping decisions on target technologies are taken at a system level. The code-intensive part is mapped on a microcontroller. The structured, computationally intensive parts of the application are mapped on ASICs if they consist of standard algorithms, and are otherwise implemented on FPGAs. The parts that are both computationally intensive and code-intensive are mapped on DSPs.
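The mapping rule described above can be sketched as a small decision function. This is only an illustration: the `Task` fields, the function name `select_target` and the flags assigned to each example task are assumptions made for the sketch, not part of the original design flow.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    computationally_intensive: bool
    structured: bool  # dominated by regular loops
    standard: bool    # assumed: an off-the-shelf ASIC exists for it


def select_target(task: Task) -> str:
    """Return the target technology suggested by the system-level rule."""
    if not task.computationally_intensive:
        return "microcontroller"  # code-intensive control code
    if task.structured:
        # Standard algorithms go to ASICs, non-standard ones to FPGAs.
        return "ASIC" if task.standard else "FPGA"
    return "DSP"  # computationally intensive and code-intensive


# Hypothetical classification of a few tasks named in the text.
tasks = [
    Task("GSM protocol stack", False, False, False),
    Task("digital down-conversion", True, True, True),
    Task("synchronization loops", True, True, False),
    Task("speech compression", True, False, False),
]
mapping = {t.name: select_target(t) for t in tasks}
```

In practice the decision also weighs cost, power and production volume, which this sketch leaves out.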
It is also important to note that only signals that can be acquired using an analog-to-digital (A/D) converter are implemented in digital hardware (HW) or software (SW), whereas a signal that does not meet the Nyquist sampling criterion can be processed only using analog circuitry. This sampling criterion requires the sampling rate of an A/D converter to be at least double the maximum frequency or bandwidth of the signal. A consumer electronic device like a mobile phone can only afford to have an A/D converter in the range 20 to 140 million samples per second (Msps). This constraint requires analog circuitry to process the RF signal at 900 MHz and bring it down to the 10–70 MHz range. After conversion of this to a digital signal by an A/D converter, it can be easily processed. A conventional mapping of different building blocks of a voice communication system is shown in Figure 1.5.
It is pertinent to mention that, if the production volume of the designed system is quite high, a mixed-signal SoC is the option of choice. In a mixed-signal SoC, the analog and digital components are all mapped on a single silicon device. Then, instead of placing physical components, the designer acquires soft cores or hard cores of the constituent components and integrates them on a single chip.
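The arithmetic behind this constraint can be checked with a few lines of Python. The sketch applies the simplified criterion stated above (sampling rate at least twice the maximum frequency); the function names and the constant `ADC_MAX_MSPS` are illustrative assumptions.

```python
def min_sampling_rate_msps(max_freq_mhz: float) -> float:
    """Minimum sampling rate under the simplified criterion in the text."""
    return 2.0 * max_freq_mhz


def can_digitize(max_freq_mhz: float, adc_max_msps: float) -> bool:
    """True if an A/D converter of the given rate can sample the signal."""
    return min_sampling_rate_msps(max_freq_mhz) <= adc_max_msps


ADC_MAX_MSPS = 140.0  # upper end of the range quoted for consumer devices

rf_ok = can_digitize(900.0, ADC_MAX_MSPS)  # 900 MHz RF would need 1800 Msps
if_ok = can_digitize(70.0, ADC_MAX_MSPS)   # 70 MHz needs 140 Msps
```

Here `rf_ok` is false and `if_ok` is true, which is why the RF signal is first brought down to the 10–70 MHz range by analog circuitry.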
An SoC solution for the voice communication system is shown in Figure 1.6. The RF microcontroller, DSP and ASIC logic with on-chip RAM and requisite interfaces are all integrated on the same chip. A system controller controls all the interfaces and provides glue logic for all the components to communicate with each other on the device.
Figure 1.6 A system-on-chip solution
1.4.2 The Backplane of a Router
A router consists mainly of two parts or planes, a control/management plane and a data plane. A code-intensive control or management plane implements the routing algorithms. These algorithms are executed only periodically, so they are not time-critical.
In contrast, the data plane of the router implements forwarding. A routing algorithm updates the routing table, after which a forwarding logic uses this table to transfer data from input ports to output ports. This forwarding logic is very critical as it executes all the time, and is implemented as the data plane. This plane checks the packet header of the inbound packets and, from a lookup table, finds its destination port. This operation is performed on all the data packets received by the router and is very well structured and computationally intensive. For routers supporting gigabit or multi-gigabit rates, this part is usually implemented in hardware [14], whereas the routing algorithms are mapped in software as they are code-intensive.
These planes and their effective mappings are shown in Figure 1.7.
Figure 1.7 HW-SW partitioning of control and data plane in a router
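The split between the two planes can be illustrated in software. In the sketch below, `update` plays the role of the control plane writing the routing table and `lookup` is the per-packet forwarding step. The class name, the method names and the use of longest-prefix matching are assumptions for illustration; the text only states that the destination port is found from a lookup table, and a multi-gigabit router would implement the lookup in hardware.

```python
import ipaddress
from typing import List, Optional, Tuple


class ForwardingTable:
    """Toy routing table shared by the control plane and the data plane."""

    def __init__(self) -> None:
        self._routes: List[Tuple[ipaddress.IPv4Network, int]] = []

    def update(self, prefix: str, port: int) -> None:
        """Control plane: install or replace the route for a prefix."""
        net = ipaddress.ip_network(prefix)
        self._routes = [(n, p) for n, p in self._routes if n != net]
        self._routes.append((net, port))

    def lookup(self, dst: str) -> Optional[int]:
        """Data plane: return the output port for a destination address."""
        addr = ipaddress.ip_address(dst)
        best = None
        for net, port in self._routes:
            # Keep the matching route with the longest prefix (assumed policy).
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, port)
        return None if best is None else best[1]


table = ForwardingTable()
table.update("10.0.0.0/8", 1)
table.update("10.1.0.0/16", 2)
port = table.lookup("10.1.2.3")  # matches both prefixes; /16 is longer
```

The linear scan in `lookup` runs for every packet, which is the structured, repetitive workload that motivates a hardware data plane.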
1.5 Components of the Digital Design Process
A thorough understanding of the main components of the digital design process is important. The subsequent chapters of this book elaborate on these components, so they are discussed only briefly here.
1.5.1 Design
The ‘design’ is the most critical component of the digital design process. The top-level design highlights the partitioning of the system into its various components. Each component is further defined at the register transfer level (RTL). This is a level of abstraction where the digital designer specifies all the registers and elaborates how data will flow through these registers. The combinational logic between two sets of registers is usually described using high-level mathematical operations, and is drawn as a cloud.
1.5.2 Implementation
When the design has been described at RTL level, its implementation is usually a straightforward translation in a hardware description language (HDL) program. The program is then synthesized for mapping on an FPGA or ASIC implementation.
1.5.3 Verification
As the number of gates on a single silicon device increases, so do the challenges of verification. Verification is also critical in VLSI design as there is hardly any tolerance for bugs in the hardware. With application-specific integrated circuits, a bug may require a re-spin of fabrication, which is expensive, so it is important for an ASIC to be ‘right first time’. Even bugs found in FPGA-based designs result in extended design cycles.
1.6 Competing Objectives in Digital Design
To achieve an effective design, a designer needs to explore the design space for tradeoffs of competing design objectives. The following are some of the most critical design objectives the designer needs to consider:
area
critical path delays
testability
power dissipation.
The art of digital design is to find the optimal tradeoff among these. These objectives are competing because, for example, if the designer tries to minimize area then the design may result in longer critical paths and may also affect the testability of the design. Similarly, if the design is synthesized for better timing, meaning shorter critical paths, the design may result in a larger area. Better timing also means more power dissipation, which depends directly on the clock frequency. It is these competing objectives that make learning the techniques covered in this book very pertinent for designers.
1.7 Synchronous Digital Hardware Systems
The subject of digital design has many aspects. For example, the circuit may be synchronous or asynchronous, and it may be analog or digital. A digital synchronous circuit is always an option of choice for the designer. In synchronous digital hardware, all changes in the system are controlled by one or multiple clocks. In digital systems, all inputs/outputs and internal values can take only discrete values.
Figure 1.8 depicts an all-digital synchronous circuit in which all changes in the system are controlled by a global clock clk. A synchronous circuit has a number of registers, and values in these registers are updated at the occurrence of positive or negative edges of the clock signal. The figure shows positive-edge triggered registers. The output signals from the registers R0 and R1 are fed to the combinational logic, which consists of gates. Each gate adds some delay to the signal passing through it. The accumulated delay on each path must be smaller than the time period of the clock, because the signal at the input of register R2 must be stable before the arrival of the next active edge of the clock. As there are a number of paths in any digital design, the longest path, that is, the path that takes the maximum time for the signal to settle at the output, is called the critical path, as noted in Figure 1.8. The delay of the critical path should be smaller than the permissible delay determined by the clock cycle.
Figure 1.8 Example of a digital synchronous hardware system
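The timing condition described above can be made concrete with a short sketch. The path names and gate delays are made-up values, and the model is deliberately simplified: it sums gate delays only and ignores register setup time, clock-to-output delay and clock skew.

```python
from typing import Dict, List, Tuple


def critical_path(paths: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the name and total delay (ns) of the slowest path."""
    name = max(paths, key=lambda p: sum(paths[p]))
    return name, sum(paths[name])


def meets_timing(paths: Dict[str, List[float]], clock_period_ns: float) -> bool:
    """True if the critical-path delay is smaller than the clock period."""
    _, delay = critical_path(paths)
    return delay < clock_period_ns


# Hypothetical gate delays (ns) on two register-to-register paths.
example_paths = {
    "R0->R2": [0.5, 1.0, 0.75],
    "R1->R2": [0.5, 0.25],
}

name, delay_ns = critical_path(example_paths)
max_clock_mhz = 1000.0 / delay_ns  # upper bound on clock rate in this model
```

With these values the critical path is `R0->R2` at 2.25 ns, so a 4 ns clock period is met and a 2 ns period is not.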
1.8 Design Strategies
At the system level, the designer has a spectrum of design options as shown in Figure 1.9. It is very critical for the system designer to make good design choices at the conceptual level because they will have a deep impact on the rest of the design cycle.
At the system design stage the designer needs only to draw a few boxes and take major design decisions like algorithm partitioning and target technology selection.
Figure 1.9 Target technologies plotted against flexibility and power consumption
If flexibility in programming is required, and the computational complexity of the application is low, and cost is not a serious consideration, then a general-purpose processor such as Intel's Pentium is a good option. In contrast, while implementing computationally intensive non-structured algorithms, flexibility in terms of programming is usually a serious consideration, and then a DSP should be the technology of choice.
In many applications the algorithms are computationally intensive but are also structured. This is usually the case in image and video processing applications, or a high-data-rate digital communication receiver. In these types of application the algorithms can be mapped on FPGAs or ASICs. While implementing algorithms on FPGAs there are usually two choices. One option is to design an application-specific instruction-set processor (ASIP). This type of processor is programmable but has little flexibility and can support only the class of applications covered by its application-specific instruction set. In the extreme case where performance is the only consideration and flexibility is not required, the designer should choose a second option, whereby the design is dedicated to that particular application and logic is hardwired without giving any consideration to flexibility. This book discusses these two design options in detail.
The performance versus flexibility tradeoff is shown in Figure 1.10. It is interesting to note that, in many high-end systems, usually all the design options are exercised. The code-intensive part of the application is mapped on GPPs, non-structured signal processing algorithms are mapped on DSPs, and structured algorithms are mapped on FPGAs, whereas for standard algorithms ASICs are used. This point is further elaborated in the design examples later.
Figure 1.10 Efficiency versus flexibility tradeoff while selecting a design option
These apparently simple decisions are very critical once the system proceeds along the design cycle. The decisions are especially significant for the blocks that are partitioned for hardware mapping. The algorithms are analyzed and architectures are designed. The designer selects either an ASIP or a dedicated hard-wired architecture. The designer takes the high-level design and starts implementing the hardware. The digital design is implemented at RTL level and then it is synthesized and tools translate the code to gate level. The synthesized design is physically placed and routed. As the design goes along the design cycle, the details become very complex to comprehend and change. The designer at every stage has to make decisions, but as the design moves further along the cycle these decisions have less impact on the overall function and performance of the design. The relationship between the impact of the design decision and its associated complexity is shown in Figure 1.11.
Figure 1.11 Design decision impact and complexity relationship diagram
1.8.1 Example of Design Partitioning
Let us consider an example that elaborates on the rationale of mapping a communication system on a hybrid platform. The system implements a BPSK/QPSK (phase-shift keying) satellite burst modem supporting rates of up to 512 kbps.
The design process starts with the development of an algorithm in MATLAB®. The code is then profiled. The algorithm consists of various components. The computation and storage requirements of each component along with inter-component communication are analyzed. The following is a list of operations that the digital receiver performs.
Analog to digital conversion (ADC) of an IF signal at 70 MHz at the receiver (Rx) using band-pass sampling.
Digital to analog conversion (DAC) of an IF signal at 24.5 MHz at the transmitter (Tx).
Digital down-conversion of the band-pass digitized IF signal to baseband at the Rx. The baseband signal consists of four samples per symbol on each I and Q channel. For 512 kbps this makes 2048 ksps (kilo samples per second) on both the channels.
Digital up-conversion of the baseband signal from 2048 ksps at both I and Q to 80 Msps at the Tx.
Digital demodulator processing 1024 K complex samples per second. The algorithm at the Rx consists of: start of burst detection, header removal, frequency and timing loops and slicer.
In a burst modem, the receiver starts in the burst detection state. In this state the system executes the start of burst detection algorithm. A buffer of data is input to the function that computes some measure of presence of the burst. If the measure is greater than a threshold, ‘start of burst’ (SoB) is declared. In this state the system also detects the unique word (UW) in the transmitted burst and identifies the start of data. If the UW in the received burst is not detected, the algorithm transits back into the burst detection mode. When both the burst and the UW are detected, the algorithm transits to the estimation state. In this state the algorithm estimates amplitude, timing, frequency and phase errors using the known header placed in the transmitted burst. The algorithm then transits to the demodulation state. In this state the system executes all the timing, phase and frequency error-correction loops. The corrected signal is passed to the slicer. The slicer makes the soft and hard decisions. For forward error correction (FEC), the system implements a Viterbi algorithm to correct the bit errors in the slicer soft decision output, and generates the final bits [15]. The frame and end of frame are identified.
In a burst, the transmitter can transmit several frames. To identify the end of the burst, the transmitter appends a particular sequence at the end of the last frame. If this sequence is detected, the receiver transits back to the SoB state. The state diagram of the sequence of operation in a satellite burst modem receiver is shown in Figure 1.12.
Figure 1.12 Sequence of operations in a satellite burst modem receiver
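The receiver states described above can be sketched as a small state machine. This is a simplified reading of the text: the state names, the function `next_state` and its keyword arguments are assumptions, burst and unique-word detection are folded into a single state, and the signal processing performed in each state is omitted.

```python
from enum import Enum, auto


class RxState(Enum):
    BURST_DETECTION = auto()
    ESTIMATION = auto()
    DEMODULATION = auto()


def next_state(state: RxState, *, burst_measure: float = 0.0,
               threshold: float = 1.0, uw_detected: bool = False,
               end_of_burst: bool = False) -> RxState:
    """Return the next receiver state for the given observations."""
    if state is RxState.BURST_DETECTION:
        sob = burst_measure > threshold  # start of burst declared
        # Both the burst and the unique word must be detected to proceed.
        if sob and uw_detected:
            return RxState.ESTIMATION
        return RxState.BURST_DETECTION
    if state is RxState.ESTIMATION:
        # Errors are estimated from the known header, then demodulation starts.
        return RxState.DEMODULATION
    if state is RxState.DEMODULATION:
        # The end-of-burst sequence returns the receiver to burst detection.
        return RxState.BURST_DETECTION if end_of_burst else RxState.DEMODULATION
    raise ValueError(f"unknown state: {state}")


s = RxState.BURST_DETECTION
s = next_state(s, burst_measure=2.0, uw_detected=False)  # UW missing: stay
s = next_state(s, burst_measure=2.0, uw_detected=True)   # go to estimation
s = next_state(s)                                        # go to demodulation
s = next_state(s, end_of_burst=True)                     # back to detection
```

A hardware implementation would typically encode these states in a register and evaluate the transitions on each clock edge.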
The algorithm is partitioned to be mapped on different components based on the nature of computations required in implementing the sub-components in the algorithm. The following mapping effectively implements the system.
A DSP is used for mapping computationally intensive and non-regular algorithms in the receiver. These algorithms primarily consist of the demodulator and carrier and timing recovery loops.
ASICs are used for ADC, DAC, digital down-conversion (DDC) and digital up-conversion (DUC). A direct digital frequency synthesis (DDFS) chip is used for cosine generation that mixes with the baseband signal.
An FPGA implements the glue logic and maps the Viterbi algorithm for FEC. The algorithm is very regular and is effectively mapped in hardware.
A microcontroller is used to interface with the control panel and to configure different components in the system.
A block diagram of the system highlighting the target technologies and their interconnection is shown in Figure 1.13.
Figure 1.13 System-level design of a satellite burst receiver
1.8.2 NoC-based SoC for Carrier-class VoIP Media Gateway
VoIP systems connect the legacy voice network with the packet network such that voice, data and associated signaling information are transported on the IP network. In the call setup stage, the signaling protocol (e.g. session initiation protocol, SIP) negotiates parameters for the media session. After the call is successfully initiated, the media session is established. This session takes the uncompressed digitized voice from the PSTN (public switched telephone network) interface and compresses and packages it before it is transported on a packet network. Similarly it takes the incoming packeted data from the IP network and decompresses it before it is sent on the PSTN network. A carrier-class VoIP media gateway processes hundreds of these channels.
The design of an SoC for a carrier-class VoIP media gateway is given in Figure 1.14. A matrix of application-specific processing elements is embedded in an NoC configuration on an SoC. In a carrier-class application the SoC processes many channels of VoIP [16]. Each channel of VoIP requires the system to implement a series of algorithms. Once a VoIP call is in progress, the SoC needs to first process ‘line echo cancellation’ (LEC) and ‘dual-tone multi-frequency’ (DTMF) detection on each channel, and then it decompresses the packeted voice and compresses the time-division multiplex (TDM) voice. The SoC has two interfaces, one with the PSTN network and the other with the IP network. The interface with the PSTN may be an H.110 TDM interface. Similarly the interfaces on the IP side may be a combination of POS, UTOPIA or Ethernet. Besides these interfaces, the SoC may also have interfaces for external memory and PCI Express (PCIe). All these components on a chip are connected to an NoC for inter-component communication.
Figure 1.14 NoC-based SoC for carrier-class VoIP applications. Multiple layers of application-specific PEs are attached with an NoC for inter-processor communication
The design assumes that the media gateway controller and packet processor are attached to the media gateway SoC for complete functionality of a VoIP system. The packets received on the IP interface are saved in external memory. The data received on the H.110 interface is buffered in an on-chip memory before being transferred to the external memory. The host processor, over a PCIe interface, notifies an on-chip RISC microcontroller to process an initiated call on a specified TDM slot.
The microcontroller keeps a record of all the live calls, with associated information like the specification on agreed encoder and decoder between caller and callee. The microcontroller then schedules these calls on the array of multiprocessors by periodically assigning all the tasks associated with processing a channel that includes LEC, in-voice DTMF detection, encoding of TDM voice, and decoding of packeted voice. The PEs program external DMA for fetching TDM voice data for compression and packeted voice for decompression. The processor also needs to bring the context from external memory before it starts processing a particular channel. The context has the states of different variables and arrays saved while processing the last frame of data on a particular channel.
The echo is produced at the 4-wire to 2-wire hybrid interface at the central office (CO). Owing to impedance mismatch in the hybrid, the echo of far-end speech is mixed in the near-end voice. This echo needs to be cancelled before the near-end speech is compressed and packetized for transmission on an IP network. An LEC processing element is designed to implement line echo cancellation. The LEC processing also detects double talk and updates the coefficients of the adaptive filter only when line echo is present in the signal and the near end is silent. There is an extended discussion of LEC and its implementation in Chapter 11.
Each processing element in the SoC is scheduled to perform a series of tasks for each channel. These tasks for a particular channel are periodically assigned to a set of PEs. Each PE keeps checking the task list while it is performing the currently assigned task. On finding a new task in the task list, the PE programs a channel of the DMA to bring data and context for this task into on-chip memory of the processor. Similarly, if the processor finds that it is tasked to perform an algorithm for which it also needs to bring the program into its program memory (PM), the PE also requests the DMA to fetch the code for the next task into the PM of the PE. This code fetching is kept to a minimum by carefully scheduling the tasks on the PEs that already have the program of the assigned task in their PM.
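The idea of limiting code fetches by scheduling tasks on PEs that already hold the required program can be sketched as follows. The `PE` class, the `assign` function and the tie-breaking rule (shortest task list) are hypothetical; the text does not specify the scheduling policy beyond preferring PEs with the program already in PM. The sketch also tracks only the most recently assigned program per PE and counts a code fetch at assignment time, where a real PE would issue a DMA request.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PE:
    name: str
    loaded_program: Optional[str] = None  # program currently in PM (simplified)
    task_list: List[str] = field(default_factory=list)
    code_fetches: int = 0                 # DMA code fetches requested so far


def assign(program: str, pes: List[PE]) -> PE:
    """Assign a task, preferring PEs whose PM already holds its program."""
    with_code = [pe for pe in pes if pe.loaded_program == program]
    candidates = with_code or pes
    # Assumed tie-break: the candidate with the shortest task list.
    chosen = min(candidates, key=lambda pe: len(pe.task_list))
    if chosen.loaded_program != program:
        chosen.code_fetches += 1          # program must be fetched into PM
        chosen.loaded_program = program
    chosen.task_list.append(program)
    return chosen


pes = [PE("pe0"), PE("pe1")]
for prog in ["LEC", "DTMF", "LEC", "DTMF"]:
    assign(prog, pes)
total_fetches = sum(pe.code_fetches for pe in pes)
```

In this run each program is fetched once, so four tasks cost two code fetches; a round-robin assignment that ignored PM contents could need a fetch for every task.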
1.8.3 Design Flow Migration
As explained earlier, usually the communication system requires component-level integration of different devices to implement digital baseband, RF transmitter and receiver, RF oscillator and power management functionality. The advancement in VLSI technology is now enabling the designer to integrate all these technologies on the same chip.
Although the scope of this book is limited to studying digital systems, it is very pertinent to point out that, owing to cost, performance and power dissipation considerations, the entire system including the analog part is now being integrated on a single chip. This design flow migration is shown in Figure 1.15. The ASICs and microcontroller are incorporated as intellectual property (IP) cores and reconfigurable logic (RL) of the FPGAs is also placed on the same chip. Along with digital components, RF and analog components are also integrated on the same chip. For example, a mixed-signal integrated circuit for a mobile communication system usually supports ADC and DAC for on-chip analog-to-digital and digital-to-analog conversion of baseband signals, phase-locked loops (PLLs) for generating clocks for various blocks, and codec components supporting PCM and other standard formats [17]. There are even integrated circuits that incorporate RF and power management blocks on the same chip using deep sub-micron CMOS technology [18].
Figure 1.15 Mixed-signal SoC integrating all components of a multi-chip board on a single chip
References
1. G. E. Moore, “Cramming more components onto integrated circuits,” Electronics, vol. 38, no. 8, April 1965.
2. G. E. Moore, in Plenary Address at ISSCC, 2003.
3. S. Borkar, “Design perspectives on 22-nm CMOS and beyond,” in Proceedings of Design Automation Conference, 2009, ACM/IEEE, pp. 93–94.
4. www.xilinx.com
5. www.altera.com
6. www.atmel.com
7. www.latticesemi.com
8. www.actel.com
9. www.quicklogic.com
10. www.arm.com/products/system-ip/amba/amba-open-specifications.php
11. A. Landry, M. Nekili and Y. Savaria, “A novel 2-GHz multi-layer AMBA high-speed bus interconnect matrix for SoC platforms,” in Proceedings of IEEE International Symposium on Circuits and Systems, 2005, vol. 4, pp. 3343–3346.
12. W. Wolf, A. A. Jerraya and G. Martin, “Multiprocessor system-on-chip (MPSoC) technology,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2008, vol. 27, pp. 1701–1713.
13. S. V. Tota, M. R. Casu, M. R. Roch, L. Macchiarulo and M. Zamboni, “A case study for NoC-based homogeneous MPSoC architectures,” IEEE Transactions on Very Large Scale Integration Systems, 2009, vol. 17, pp. 384–388.
14. R. C. Chang and B.-H. Lim, “Efficient IP routing Table VLSI design for multi-gigabit routers,” IEEE Transactions on Circuits and Systems I, 2004, vol. 51, pp. 700–708.
15. S. A. Khan, M. M. Saqib and S. Ahmed, “Parallel Viterbi algorithm for a VLIW DSP,” in Proceedings of ICASSP, 2000, vol. 6, pp. 3390–3393.
16. M. M. Rahmatullah, S. A. Khan and H. Jamal, “Carrier-class high-density VoIP media gateway using hardware/software distributed architecture,” IEEE Transactions on Consumer Electronics, 2007, vol. 53, pp. 1513–1520.
17. B. Baggini, “Baseband and audio mixed-signal front-end IC for GSM/EDGE applications,” IEEE Journal of Solid-State Circuits, 2006, vol. 41, pp. 1364–1379.
18. M. Hammes, C. Kranz and D. Seippel, “Deep submicron CMOS technology enables system-on-chip for wireless communications ICs,” IEEE Communications Magazine, 2008, vol. 46, pp. 151–161.