This book provides comprehensive coverage of 3D vision systems, from vision models and state-of-the-art algorithms to their hardware architectures for implementation on DSPs, FPGA and ASIC chips, and GPUs. It aims to fill the gaps between computer vision algorithms and real-time digital circuit implementations, especially with Verilog HDL design. The book is organized around vision and hardware modules: Verilog vision modules, 3D vision modules, parallel vision architectures, and Verilog designs for the stereo matching system with various parallel architectures.
* Provides Verilog vision simulators, tailored to the design and testing of general vision chips
* Bridges the differences between C/C++ and HDL to encompass both software realization and chip implementation; includes numerous examples that realize vision algorithms and general vision processing in HDL
* Unique in providing an organized and complete overview of how a real-time 3D vision system-on-chip can be designed
* Focuses on the digital VLSI aspects and implementation of digital signal processing tasks on hardware platforms such as ASICs and FPGAs for 3D vision systems, which have not been comprehensively covered in one single book
* Provides a timely view of the pervasive use of vision systems and the challenges of fusing information from different vision modules
* Accompanying website includes software and HDL code packages to enhance further learning and develop advanced systems
* A solution set and lecture slides are provided on the book's companion website
The book is aimed at graduate students and researchers in computer vision and embedded systems, as well as chip and FPGA designers. Senior undergraduate students specializing in VLSI design or computer vision will also find the book helpful in understanding advanced applications.
Page count: 738
Year of publication: 2014
Hong Jeong
Pohang University of Science and Technology, South Korea
This edition first published 2014 © 2014 John Wiley & Sons Singapore Pte. Ltd.
Registered office: John Wiley & Sons Singapore Pte. Ltd., 1 Fusionopolis Walk, #07-01 Solaris South Tower, Singapore 138628.
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as expressly permitted by law, without either the prior written permission of the Publisher, or authorization through payment of the appropriate photocopy fee to the Copyright Clearance Center. Requests for permission should be addressed to the Publisher, John Wiley & Sons Singapore Pte. Ltd., 1 Fusionopolis Walk, #07-01 Solaris South Tower, Singapore 138628, tel: 65-66438000, fax: 65-66438008, email: [email protected].
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The Publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Jeong, Hong.
Architectures for computer vision : from algorithm to chip with Verilog / Hong Jeong.
pages cm.
Includes bibliographical references and index.
ISBN 978-1-118-65918-2 (cloth)
1. Verilog (Computer hardware description language) 2. Computer vision. I. Title. II. Title: From algorithm to chip with Verilog.
TK7885.7.J46 2014
621.39–dc23
2014016398
About the Author
Preface
Part One: Verilog HDL
Chapter 1: Introduction
1.1 Computer Architectures for Vision
1.2 Algorithms for Computer Vision
1.3 Computing Devices for Vision
1.4 Design Flow for Vision Architectures
Problems
References
Chapter 2: Verilog HDL, Communication, and Control
2.1 The Verilog System
2.2 Hello, World!
2.3 Modules and Ports
2.4 UUT and TB
2.5 Data Types and Operations
2.6 Assignments
2.7 Structural-Behavioral Design Elements
2.8 Tasks and Functions
2.9 Syntax Summary
2.10 Simulation-Synthesis
2.11 Verilog System Tasks and Functions
2.12 Converting Vision Algorithms into Verilog HDL Codes
2.13 Design Method for Vision Architecture
2.14 Communication by Name Reference
2.15 Synchronous Port Communication
2.16 Asynchronous Port Communication
2.17 Packing and Unpacking
2.18 Module Control
2.19 Procedural Block Control
Problems
References
Chapter 3: Processor, Memory, and Array
3.1 Image Processing System
3.2 Taxonomy of Algorithms and Architectures
3.3 Neighborhood Processor
3.4 BP Processor
3.5 DP Processor
3.6 Forward and Backward Processors
3.7 Frame Buffer and Image Memory
3.8 Multidimensional Array
3.9 Queue
3.10 Stack
3.11 Linear Systolic Array
Problems
References
Chapter 4: Verilog Vision Simulator
4.1 Vision Simulator
4.2 Image Format Conversion
4.3 Line-based Vision Simulator Principle
4.4 LVSIM Top Module
4.5 LVSIM IO System
4.6 LVSIM RAM and Processor
4.7 Frame-based Vision Simulator Principle
4.8 FVSIM Top Module
4.9 FVSIM IO System
4.10 FVSIM RAM and Processor
4.11 OpenCV Interface
Problems
References
Part Two: Vision Principles
Chapter 5: Energy Function
5.1 Discrete Labeling Problem
5.2 MRF Model
5.3 Energy Function
5.4 Energy Function Models
5.5 Free Energy
5.6 Inference Schemes
5.7 Learning Methods
5.8 Structure of the Energy Function
5.9 Basic Energy Functions
Problems
References
Chapter 6: Stereo Vision
6.1 Camera Systems
6.2 Camera Matrices
6.3 Camera Calibration
6.4 Correspondence Geometry
6.5 Camera Geometry
6.6 Scene Geometry
6.7 Rectification
6.8 Appearance Models
6.9 Fundamental Constraints
6.10 Segment Constraints
6.11 Constraints in Discrete Space
6.12 Constraints in Frequency Space
6.13 Basic Energy Functions
Problems
References
Chapter 7: Motion and Vision Modules
7.1 3D Motion
7.2 Direct Motion Estimation
7.3 Structure from Optical Flow
7.4 Factorization Method
7.5 Constraints on the Data Term
7.6 Continuity Equation
7.7 The Prior Term
7.8 Energy Minimization
7.9 Binocular Motion
7.10 Segmentation Prior
7.11 Blur Diameter
7.12 Blur Diameter and Disparity
7.13 Surface Normal and Disparity
7.14 Surface Normal and Blur Diameter
7.15 Links between Vision Modules
Problems
References
Part Three: Vision Architectures
Chapter 8: Relaxation for Energy Minimization
8.1 Euler–Lagrange Equation of the Energy Function
8.2 Discrete Diffusion and Biharmonic Operators
8.3 SOR Equation
8.4 Relaxation Equation
8.5 Relaxation Graph
8.6 Relaxation Machine
8.7 Affine Graph
8.8 Fast Relaxation Machine
8.9 State Memory of Fast Relaxation Machine
8.10 Comparison of Relaxation Machines
Problems
References
Chapter 9: Dynamic Programming for Energy Minimization
9.1 DP for Energy Minimization
9.2 N-best Parallel DP
9.3 N-best Serial DP
9.4 Extended DP
9.5 Hidden Markov Model
9.6 Inside-Outside Algorithm
Problems
References
Chapter 10: Belief Propagation and Graph Cuts for Energy Minimization
10.1 Belief in MRF Factor System
10.2 Belief in Pairwise MRF System
10.3 BP in Discrete Space
10.4 BP in Vector Space
10.5 Flow Network for Energy Function
10.6 Swap Move Algorithm
10.7 Expansion Move Algorithm
Problems
References
Part Four: Verilog Design
Chapter 11: Relaxation for Stereo Matching
11.1 Euler–Lagrange Equation
11.2 Discretization and Iteration
11.3 Relaxation Algorithm for Stereo Matching
11.4 Relaxation Machine
11.5 Overall System
11.6 IO Circuit
11.7 Updation Circuit
11.8 Circuit for the Data Term
11.9 Circuit for the Differential
11.10 Circuit for the Neighborhood
11.11 Functions for Saturation Arithmetic
11.12 Functions for Minimum Argument
11.13 Simulation
Problems
References
Chapter 12: Dynamic Programming for Stereo Matching
12.1 Search Space
12.2 Line Processing
12.3 Computational Space
12.4 Energy Equations
12.5 DP Algorithm
12.6 Architecture
12.7 Overall Scheme
12.8 FIFO Buffer
12.9 Reading and Writing
12.10 Initialization
12.11 Forward Pass
12.12 Backward Pass
12.13 Combinational Circuits
12.14 Simulation
Problems
References
Chapter 13: Systolic Array for Stereo Matching
13.1 Search Space
13.2 Systolic Transformation
13.3 Fundamental Systolic Arrays
13.4 Search Spaces of the Fundamental Systolic Arrays
13.5 Systolic Algorithm
13.6 Common Platform of the Circuits
13.7 Forward Backward and Right Left Algorithm
13.8 FBR and FBL Overall Scheme
13.9 FBR and FBL FIFO Buffer
13.10 FBR and FBL Reading and Writing
13.11 FBR and FBL Preprocessing
13.12 FBR and FBL Initialization
13.13 FBR and FBL Forward Pass
13.14 FBR and FBL Backward Pass
13.15 FBR and FBL Simulation
13.16 Backward Backward and Right Left Algorithm
13.17 BBR and BBL Overall Scheme
13.18 BBR and BBL Initialization
13.19 BBR and BBL Forward Pass
13.20 BBR and BBL Backward Pass
13.21 BBR and BBL Simulation
Problems
References
Chapter 14: Belief Propagation for Stereo Matching
14.1 Message Representation
14.2 Window Processing
14.3 BP Machine
14.4 Overall System
14.5 IO Circuit
14.6 Sampling Circuit
14.7 Circuit for the Data Term
14.8 Circuit for the Input Belief Message Matrix
14.9 Circuit for the Output Belief Message Matrix
14.10 Circuit for the Updation of Message Matrix
14.11 Circuit for the Disparity
14.12 Saturation Arithmetic
14.13 Smoothness
14.14 Minimum Argument
14.15 Simulation
Problems
References
Index
End User License Agreement
Chapter 1
Table 1.1
Table 1.2
Chapter 3
Table 3.1
Chapter 4
Table 4.1
Chapter 7
Table 7.1
Table 7.2
Chapter 8
Table 8.1
Chapter 9
Table 9.1
Table 9.2
Table 9.3
Chapter 10
Table 10.1
Table 10.2
Table 10.3
Hong Jeong joined the Department of Electrical Engineering at POSTECH in January 1988, after graduating from the Department of EECS at MIT. He has worked at Bell Labs, Murray Hill, New Jersey and has visited the Department of Electrical Engineering at USC. He has taught integrated courses, such as multimedia algorithms, Verilog HDL design, and recognition engineering, in the Department of Electrical Engineering at POSTECH. He is interested in filling in the gaps between computer vision algorithms and VLSI architectures, using GPU and advanced HDL languages.
This book aims to fill in the gaps between computer vision and Verilog HDL design. For this purpose, we have to learn about the four disciplines: Verilog HDL, vision principles, vision architectures, and Verilog design. This area, which we call vision architecture, paves the way from vision algorithm to chip design, and is defined by the related fields, the implementing devices, and the vision hierarchy.
In terms of related fields, vision architecture is a multidisciplinary research area, particularly related to computer vision, computer architecture, and VLSI design. In computer vision, the typical goal of the research is to design serial algorithms, often implemented in high-level programming languages and rarely in dedicated chips. Unlike the well-established design flow from computer architecture to VLSI design, the flow from vision algorithm to computer architecture, and further to VLSI chips, is not well-defined. We overcome this difficulty by delineating the path between vision algorithm and VLSI design.
Vision architecture is implemented on many different devices, such as DSPs, GPUs, embedded processors, FPGAs, and ASICs. Unlike software programming, where the paradigm is more or less homogeneous, hardware design and implementation is highly heterogeneous: different devices require completely different expertise and design tools. We focus on Verilog HDL, one of the representative languages for designing FPGAs and ASICs.
The design of the vision architecture is highly dependent on the context and platform because the computational structures tend to be very different, depending on the areas of study – image processing, intermediate vision algorithms, and high-level vision algorithms – and on the specific algorithms used – graph cuts, belief propagation, relaxation, inference, learning, one-pass algorithm, etc. This book is dedicated to the intermediate vision, where reconstructing 3D information is the major goal.
This book by no means intends to deal with all the diverse topics in vision algorithms, vision architectures, and devices. Moreover, it is not meant to report the best algorithms and architectures for vision modules by way of extensive surveys. Instead, its aim is to present a homogeneous approach to design, from algorithm to architecture via Verilog HDL, one that guides the audience in extracting the computational constructs, such as parallelism, iteration, and neighborhood computation, from a given vision algorithm and interpreting them in Verilog HDL terms. It also aims to provide guidance on how to design architectures in Verilog HDL so that the audience becomes familiar enough with vision algorithms and HDL design to proceed to more advanced research. For this purpose, this book provides a Verilog vision simulator that can be used for designing and simulating vision architectures.
This book is written for senior undergraduates, graduate students, and researchers working in computer vision, computer architecture, and VLSI design. The computer vision audience will learn how to convert the vision algorithms to hardware, with the help of the simulator. The computer architecture audience will learn the computational structures of the vision algorithms and the design codes of the major algorithms. The VLSI design audience will learn about the vision algorithms and architectures and possibly improve the codes for their own needs.
This book is organized into four independent parts: Verilog HDL, vision principles, vision architectures, and Verilog design. Each chapter is written to be complete in and of itself, supported by problem sets and references. The purpose of the first part is to introduce the vision implementation methodology, the Verilog HDL for image processing, and the Verilog HDL simulator for designing the vision architecture. Chapter 1 deals with the taxonomy of the general and specialized algorithms and architectures that are considered typical in vision technology. The pros and cons of the different implementations are discussed, and the dedicated implementation by Verilog HDL design is addressed. Chapter 2 introduces the basics of Verilog HDL and coding examples for communication and control modules. These modules are general building blocks for designing vision architectures. Chapter 3 introduces Verilog circuit modules, such as processor, memory, and pipelined array, which are the building blocks of the vision architectures. The vision architectures are designed using processors, memories, and possibly pipelined arrays, connected by the communication and control modules. Chapter 4 introduces the Verilog vision simulators, specially built for designing vision architectures. The simulator consists of the unsynthesizable module, which functions as an interface for image input and output, and the synthesizable module, which is a platform for building serial and parallel architectures. This platform is tailored to the specific architectures in later chapters.
The second part, comprising Chapters 5–7, introduces the fundamentals of intermediate vision algorithms. Instead of treating diverse fields in vision research, this part focuses on energy minimization, stereo, motion, and the fusion of vision modules. Chapter 5 introduces the energy function, which is a common concept in computer vision algorithms. The energy function is explained in terms of Markov random field (MRF) estimation and the free energy concept. The energy minimization methods and the structure of a typical energy function are also explained. Chapter 6 is dedicated to stereo vision. Instead of surveying the extensive research done, this chapter focuses on the constraints and energy minimization. A typical energy function that is subsequently designed with various architectures is discussed. Chapter 7 deals with motion estimation and fusion of vision modules. Instead of an extensive survey, this chapter focuses on the motion principles and the continuity concept that unify the various constraints in motion estimation. This chapter also deals with the fusion of vision modules directly through intermediate variables, bypassing the 3D variables, which gives strong constraints for determining the intermediate vision variables. The chapter closes with a set of equations that directly link the 2D variables, namely blur diameter, surface normal, disparity, and optical flow.
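As a hedged illustration of what a "typical energy function" for stereo matching can look like, the following is the standard pairwise MRF form found in the general literature. It is not necessarily the exact function used in the book, and all symbols are assumptions introduced here: $d_p$ is the disparity label at pixel $p$, $D_p$ is the data (matching cost) term, $V$ is the smoothness (prior) term, $\mathcal{N}$ is the set of neighboring pixel pairs, and $\lambda$ weights the two terms.

```latex
E(d) = \sum_{p \in \mathcal{P}} D_p(d_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} V(d_p, d_q)
```

Minimizing $E(d)$ over all labelings $d$ trades off agreement with the image data against smoothness of the disparity map.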
The third part, which comprises Chapters 8–10, introduces the major algorithms and their architectures: relaxation, dynamic programming (DP), belief propagation (BP), and graph cuts (GC). The computational structures and possible implementations are also discussed. Chapter 8 introduces the concept underlying the relaxation algorithm and architecture. In addition to the Gauss–Seidel and Jacobi algorithms, this chapter introduces other types of architectures: specifically, extensions to the Gauss–Seidel–Jacobi architecture. In Chapter 9, the concept underlying the DP algorithm and architecture is introduced, and the computational structures of various DP algorithms are discussed. Finally, the algorithms and architectures of BP and GC are addressed in Chapter 10, and their computational structures and possible implementations are discussed.
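To make the distinction between the Jacobi and Gauss–Seidel relaxation schemes concrete, here is a minimal Python sketch on a toy 1D problem (each interior value relaxes toward the average of its two neighbors). This is an illustrative assumption, not code from the book: the function names, the toy problem, and the tolerance are all hypothetical.

```python
def jacobi_sweep(u):
    """One Jacobi sweep: every interior value is computed from the OLD array,
    so all updates are independent and could run in parallel."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def gauss_seidel_sweep(u):
    """One Gauss-Seidel sweep: values are updated in place, so each point
    already sees the new value of its left neighbor (inherently serial)."""
    new = list(u)
    for i in range(1, len(new) - 1):
        new[i] = 0.5 * (new[i - 1] + new[i + 1])
    return new


def relax(sweep, u, tol=1e-9, max_iters=100000):
    """Repeat sweeps until the largest change per sweep falls below tol.
    Returns the final array and the number of sweeps performed."""
    for k in range(1, max_iters + 1):
        nxt = sweep(u)
        change = max(abs(a - b) for a, b in zip(nxt, u))
        u = nxt
        if change < tol:
            return u, k
    return u, max_iters


# Toy problem (assumed): 10 points, fixed boundary values 0 and 1.
# The fixed point of both schemes is the linear ramp u[i] = i / 9.
u0 = [0.0] * 9 + [1.0]
jac, jac_iters = relax(jacobi_sweep, u0)
gs, gs_iters = relax(gauss_seidel_sweep, u0)
print("Jacobi sweeps:", jac_iters, "Gauss-Seidel sweeps:", gs_iters)
```

Both schemes reach the same solution; on this problem Gauss–Seidel needs fewer sweeps, while Jacobi's independent updates map more naturally onto parallel hardware. That trade-off is one reason hybrid schemes are of interest for architecture design.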
The fourth part, which comprises Chapters 11–14, is dedicated to the Verilog design of stereo matching with the major architectures: relaxation, DP, and BP. All the designs are provided with complete Verilog HDL code that has been verified by functional simulation and synthesis. Chapter 11 addresses the Verilog design of the relaxation architecture. Chapter 12 deals with the Verilog design of serial architectures for the DP. The design is aimed at executing stereo matching with the serial vision simulator. Chapter 13 introduces the systolic array in Verilog HDL. This chapter explains in detail how to design the control module and the systolic array, connected by local neighborhood connections. Finally, Chapter 14 deals with BP design for stereo matching. This chapter also explains in detail the design methods with Verilog HDL.
All the designs are accompanied by complete source code that has been proven correct via simulation and synthesis tests. A package of the code in the textbook, along with complementary code, is provided separately for readers. The code is carefully written with the general constructs of standard Verilog HDL and is free from IPs and vendor-dependent code. I hope that this book will stir the reader's ability to develop more advanced vision architectures for various vision modules, to deal with the topics that are not dealt with in this book because of space constraints, and to fill in the gaps between computer vision and VLSI design.
Much of the work was accomplished during my one-year sabbatical leave from POSTECH from September 2012 to August 2013, inclusive. This work was supported by the “Core Technology Development for Breakthrough of Robot Vision Research” and the “Development of Intelligent Traffic Sign Recognition System to cope with Euro NCAP” funded by the Ministry of Trade, Industry & Energy (MI, Korea). During the writing of this book, Altera Corporation provided necessary equipment and tools through the Altera University Program. I would like to thank Michelle Lee at Altera Korea and Bruce Choi at Uniquest, Inc. for helping me to participate in the program. I am also grateful to Peter Lee at Vadas, Inc. for supporting my laboratory financially through projects and Jung Gu Kim at VisionST, Inc. for providing required data and equipment. Some of the programs and bibliography searches were done with help from my students, In Tae Na, Byung Chan Chun, and Jeong Mok Ha. Other students, Jae Young Chun, Seong Yong Cho, and Ki Young Bae, helped with preparation, editing, and proofreading. I sincerely thank publisher James Murphy for choosing my writing subject and editor Clarissa Lim for helping me with various pieces of advice and notes. I also remember my colleagues, Prof. Rosalind Picard at MIT Media Lab and Prof. C.-C. Jay Kuo at USC. I thank Professors Jae S. Lim, Alan V. Oppenheim, Charles E. Leiserson, and Eric Grimson at MIT, Prof. Bernard C. Levy at UC Davis, Prof. Stephen E. Levinson at University of Illinois, and Prof. Jay Kyoon Lee at Syracuse University. Finally, I sincerely thank Prof. Bruce R. Musicus at MIT for his generous support and guidance.
Hong Jeong [email protected]
