This book presents a survey of computer vision techniques related to three key broad problems in the roadway transportation domain: safety, efficiency, and law enforcement. The individual chapters present significant applications within those problem domains, each in a tutorial manner that describes the motivation for and benefits of the application and surveys the state of the art.
Key features:
Acts as a single‐source reference providing readers with an overview of how computer vision can contribute to the different applications in the field of road transportation
The book will benefit the many researchers, engineers, and practitioners of computer vision, digital imaging, automotive and civil engineering working in intelligent transportation systems. Given the breadth of topics covered, the text will present the reader with new and as‐yet‐unconceived possibilities for application within their communities.
Cover
Title Page
List of Contributors
Preface
Acknowledgments
About the Companion Website
1 Introduction
1.1 Law Enforcement and Security
1.2 Efficiency
1.3 Driver Safety and Comfort
1.4 A Computer Vision Framework for Transportation Applications
References
Part I: Imaging from the Roadway Infrastructure
2 Automated License Plate Recognition
2.1 Introduction
2.2 Core ALPR Technologies
References
3 Vehicle Classification
3.1 Introduction
3.2 Overview of the Algorithms
3.3 Existing AVC Methods
3.4 LiDAR Imaging‐Based
3.5 Thermal Imaging‐Based
3.6 Shape‐ and Profile‐Based
3.7 Intrinsic Proportion Model
3.8 3D Model‐Based Classification
3.9 SIFT‐Based Classification
3.10 Summary
References
4 Detection of Passenger Compartment Violations
4.1 Introduction
4.2 Sensing within the Passenger Compartment
4.3 Roadside Imaging
References
5 Detection of Moving Violations
5.1 Introduction
5.2 Detection of Speed Violations
5.3 Stop Violations
5.4 Other Violations
References
6 Traffic Flow Analysis
6.1 What is Traffic Flow Analysis?
6.2 The Use of Video Analysis in Intelligent Transportation Systems
6.3 Measuring Traffic Flow from Roadside CCTV Video
6.4 Some Challenges
References
7 Intersection Monitoring Using Computer Vision Techniques for Capacity, Delay, and Safety Analysis
Vision‐Based Intersection Monitoring
7.1 Vision‐Based Intersection Analysis: Capacity, Delay, and Safety
7.2 System Overview
7.3 Count Analysis
7.4 Queue Length Estimation
7.5 Safety Analysis
7.6 Challenging Problems and Perspectives
7.7 Conclusion
References
8 Video‐Based Parking Management
8.1 Introduction
8.2 Overview of Parking Sensors
8.3 Introduction to Vehicle Occupancy Detection Methods
8.4 Monocular Vehicle Detection
8.5 Introduction to Vehicle Detection with 3D Methods
8.6 Stereo Vision Methods
Acknowledgment
References
9 Video Anomaly Detection
9.1 Introduction
9.2 Event Encoding
9.3 Anomaly Detection Models
9.4 Sparse Representation Methods for Robust Video Anomaly Detection
9.5 Conclusion and Future Research
References
Part II: Imaging from and within the Vehicle
10 Pedestrian Detection
10.1 Introduction
10.2 Overview of the Algorithms
10.3 Thermal Imaging
10.4 Background Subtraction Methods
10.5 Polar Coordinate Profile
10.6 Image‐Based Features
10.7 LiDAR Features
10.8 Summary
References
11 Lane Detection and Tracking Problems in Lane Departure Warning Systems
11.1 Introduction
11.2 LD: Algorithms for a Single Frame
11.3 LT Algorithms
11.4 Implementation of an LD and LT Algorithm
References
12 Vision‐Based Integrated Techniques for Collision Avoidance Systems
12.1 Introduction
12.2 Related Work
12.3 Context Definition for Integrated Approach
12.4 ELVIS: Proposed Integrated Approach
12.5 Performance Evaluation
12.6 Concluding Remarks
References
13 Driver Monitoring
13.1 Introduction
13.2 Video Acquisition
13.3 Face Detection and Alignment
13.4 Eye Detection and Analysis
13.5 Head Pose and Gaze Estimation
13.6 Facial Expression Analysis
13.7 Multimodal Sensing and Fusion
13.8 Conclusions and Future Directions
References
14 Traffic Sign Detection and Recognition
14.1 Introduction
14.2 Traffic Signs
14.3 Traffic Sign Recognition
14.4 Traffic Sign Recognition Applications
14.5 Potential Challenges
14.6 Traffic Sign Recognition System Design
14.7 Working Systems
References
15 Road Condition Monitoring
15.1 Introduction
15.2 Measurement Principles
15.3 Sensor Solutions
15.4 Classification and Sensor Fusion
15.5 Field Studies
15.6 Cooperative Road Weather Services
15.7 Discussion and Future Work
References
Index
End User License Agreement
Chapter 01
Table 1.1 Taxonomy of problem domains and applications described in the book.
Chapter 02
Table 2.1 Selection of optimal result given multiple fonts.
Table 2.2 License plate formats for passenger cars across a sample of US states.
Chapter 03
Table 3.1 Comparison of papers according to key features and classification methods for LiDAR AVC systems.
Table 3.2 Comparison of papers according to vision‐ and LiDAR‐based features and classification techniques for vision + LiDAR fusion AVC systems.
Table 3.3 Comparison of selected papers related to shape‐based vehicle classification.
Table 3.4 Vehicle classification test results.
Chapter 05
Table 5.1 Minimum yellow light intervals given approach speeds for straight road and 0 grade.
Chapter 06
Table 6.1 Occupancy of a 12‐m bus.
Table 6.2 Comparison of real‐time challenges in traffic flow analysis for different application domains.
Table 6.3 Hourly counts (and %) by vehicle classification type over a 5‐h period during the middle of the day.
Table 6.4 Lane usage (over a 5 h period) of each vehicle type.
Chapter 07
Table 7.1 Surrogate safety measurements.
Table 7.2 Comparison of sensors.
Chapter 08
Table 8.1 Technologies for the occupancy detection task using nonvisual sensing techniques.
Table 8.2 Detection scores for different LBP configurations, and a comparison with the HOG detector results.
Chapter 09
Table 9.1 Confusion matrices of proposed and state‐of‐the‐art trajectory‐based methods on CAVIAR data set—single‐object anomaly detection.
Table 9.2 Confusion matrices of proposed and state‐of‐the‐art trajectory‐based methods on the Xerox Stop Sign data set—single‐object anomaly detection.
Table 9.3 Confusion matrices of proposed and state‐of‐the‐art trajectory‐based methods on the Xerox Stop Sign data set—multiple‐object anomaly detection.
Table 9.4 Detection rates of proposed and state‐of‐the‐art trajectory‐based methods on AVSS data set—multiple‐object anomaly detection.
Table 9.5 Confusion matrices of proposed and state‐of‐the‐art trajectory‐based methods on the Xerox Intersection data set—multiple‐object anomaly detection.
Table 9.6 Confusion matrices of PDTV data set.
Table 9.7 Confusion matrices of Stop Sign occluded data set.
Table 9.8 Confusion matrices and execution times of Xerox Stop Sign data set.
Chapter 10
Table 10.1 Fusion‐based method.
Table 10.2 Sensor specifications.
Table 10.3 Intrinsic camera parameters.
Table 10.4 Confusion matrix for results from LiDAR, visual, and fusion methods.
Chapter 11
Table 11.1 Comparisons among edge detectors.
Chapter 12
Table 12.1 Significance of lanes and vehicles in related ADAS applications.
Table 12.2 Comparison between ELVIS and Alvert.
Table 12.3 Comparison of lane position deviation (LPD) metric between ELVIS and LASeR.
Table 12.4 Comparison of computational cost per frame for lane detection.
Chapter 13
Table 13.1 Facial action units used for driver drowsiness analysis by Vural et al.
Chapter 14
Table 14.1 Invariance of colour models to imaging conditions.
Chapter 15
Table 15.1 Success rates from road condition classification using 0.4–0.9 µm camera and RWIS data.
Table 15.2 Road condition differences in and between the wheel tracks at two sites.
Chapter 01
Figure 1.1 Examples of a camera view sensing multiple parking stalls.
Figure 1.2 Illustration of a gantry‐mounted front‐view and road‐side side‐view acquisition setup for monitoring a high‐occupancy lane.
Figure 1.3 Computer vision pipeline for transportation applications.
Chapter 02
Figure 2.1 Block diagram of a typical ALPR processing flow.
Figure 2.2 Original and resulting edge image.
Figure 2.3 Edge projection‐based detection for license plate localization.
Figure 2.4 Results of morphological close operation on edge image.
Figure 2.5 Results of additional morphological filtering operations.
Figure 2.6 Results of applying CCA to reduce candidate blob regions.
Figure 2.7 Extracted sub‐image regions.
Figure 2.8 Cartoon of a coarsely cropped license plate ROI image output from the localization module.
Figure 2.9 Extracting a tightly cropped ROI image around the characters.
Figure 2.10 Original and binary images for a coarsely cropped ROI around a license plate.
Figure 2.11 Remaining foreground blobs after morphological filtering and CCA.
Figure 2.12 Remaining foreground blobs after pruning based on centerline fit method of candidate character blobs.
Figure 2.13 Vertically cropped ROI image for a partially redacted license plate.
Figure 2.14 A simple gradient example.
Figure 2.15 An example of a more complex vehicle background.
Figure 2.16 Original image and detection scores for valid character detector.
Figure 2.17 Segmentation cut points as cropping boundaries between characters.
Figure 2.18 Example of vertical projection approach for character segmentation.
Figure 2.19 Difficult example for projective segmentation.
Figure 2.20 Typical OCR training workflow.
Figure 2.21 Distorted versions of real characters (original is first).
Figure 2.22 Synthetically generated license plates.
Figure 2.23 OCR performance for three sets of training data: (i) 700 real SPC (cross‐marks), (ii) 2000 synthetic SPC (dashed curve), and (iii) a mix of 2000 synthetic and 100 real SPC (circle marks).
Figure 2.24 Example of SMQT robustness to gain and offset distortions.
Figure 2.25 Extracted HOG features for license plate characters.
Figure 2.26 One‐vs‐all (left) and one‐vs‐one (right) classifier architecture.
Figure 2.27 Typical OCR evaluation workflow.
Figure 2.28 OCR with multiple trained fonts.
Figure 2.29 Partially redacted plate image showing a clearly visible state name.
Figure 2.30 Partially redacted plate image illustrating the challenges of low contrast and blur in the state name.
Figure 2.31 Partially redacted plate images with occlusion of the state name by license plate frames and car body features.
Figure 2.32 Comparison of fonts for New York and California license plate characters: “A” and “4.”
Chapter 03
Figure 3.1 Overview of vehicle classification algorithms.
Figure 3.2 Visualization of a vehicle profile generated by a system using data from radar/light curtains.
Figure 3.3 Typical steps seen in AVC systems using LiDAR sensors.
Figure 3.4 (a) With airborne LiDAR, the infrared laser light is emitted toward the ground and is reflected back to the moving airborne LiDAR sensor. (b) Terrestrial LiDAR collects very dense and highly accurate points, facilitating more precise identification of objects. These dense point clouds can be used to model the profile of a vehicle.
Figure 3.5 Simplified diagram for AVC that fuses vision and LiDAR sensors.
Figure 3.6 Different viewpoints of the thermal camera: (a) side view, (b) overpass view [11], and (c) UAV view [12].
Figure 3.7 Example frames from a thermal camera for different vehicles: (a) small car, (b) SUV, (c) pickup truck, and (d) bus.
Figure 3.8 Vehicle temperature regions in thermal imagery.
Figure 3.9 HOG features extracted from a single region containing a car.
Figure 3.10 Heat signature extracted from image of a car.
Figure 3.11 Thermal image [12] with a pixelated car showing Haar features.
Figure 3.12 Motion field calculated for two vehicles with different velocities in two consecutive frames, indicating the cut line used later to divide the merged blob.
Figure 3.13 Bounding box, minimal area bounding box (rotated), and fitted ellipse of a blob.
Figure 3.14 A blob (a) and its convex hull (b).
Figure 3.15 Simplified diagram explaining hierarchical classification.
Figure 3.16 Visualization of rear car features.
Figure 3.17 Transform‐semi‐ring‐projection used to describe one 2D blob image (a) into eight 1D vectors (b).
Figure 3.18 Border edge of vehicle shape used for scanning to create a sequence of segments.
Figure 3.19 Vehicles and detected edges.
Figure 3.20 HOG features extracted from a gray‐scale image of a vehicle.
Figure 3.21 Haar‐like features used in vehicle detection: edge features, center features, and line features.
Figure 3.22 Image with pixelated car manifesting distinct Haar features.
Figure 3.23 Calculated mean of normalized images.
Figure 3.24 (a) Examples of vehicle pictures used to calculate eigenvehicles of sedan class, (b) first eight principal components—eigenvehicles of sedan class.
Figure 3.25 Screenshots for the classification of trucks (large boxes) and cars (small boxes) of the live videos from three cameras in different angles.
Chapter 04
Figure 4.1 Illustration of a gantry‐mounted front‐view and road‐side side‐view acquisition setup.
Figure 4.2 NIR images acquired by HOV/HOT lane front‐ and side‐view cameras: (a) front‐view images and (b) side‐view images.
Figure 4.3 High‐level view of detecting passenger compartment violations from images in a classification‐based approach.
Figure 4.4 Windshield model learned using DPM‐based method presented in Ref. [40]. (a) Training landmarks (light gray points) overlaid on top of the captured image and (b) Windshield model learned in training.
Figure 4.5 Side‐window models learned using SVM‐based method presented in Ref. [40]. (a) Training landmarks (light gray points) overlaid on top of the side‐view images corresponding to five mixtures and (b) five mixture models learned in the training.
Figure 4.6 Parts (a–c) show a set of landmark points overlaid on three images for B‐pillar detection. Part (d) illustrates the spatial shape model for tree structure learned in training.
Figure 4.7 The structure of cascade classifier.
Figure 4.8 Regions of interest (ROI) defined for each case where white dashed rectangle shows ROI for front seat occupancy detection, solid rectangle for driver cell phone usage detection, and dotted rectangle for seat belt violation detection.
Figure 4.9 Examples for driver face images with partial occlusions.
Figure 4.10 Edge profiles associated with seats and headrests.
Chapter 05
Figure 5.1 Schematic illustrations of cross‐correlation and motion‐blob proximity association tracking methods.
Figure 5.2 Illustration example for model‐based camera calibration method: (a) left‐side view of the scene and (b) top view of the scene.
Figure 5.3 Illustration of an accuracy issue related to tracked vehicle image feature height and the dimensionality of image acquisition.
Figure 5.4 Speed estimation process in a stereo vision system.
Figure 5.5 Depth capable system comprising two cameras.
Figure 5.6 Foreground detection based on augmented background modeling.
Figure 5.7 Point tracking in the stereo domain.
Figure 5.8 Red light enforcement system as described in Ref. [43].
Figure 5.9 High‐level overview of the red light violation detection system proposed in Ref. [43] from evidentiary photos.
Figure 5.10 Feature matching in evidentiary red light images from Ref. [43]. Matched SURF features in Shot A and Shot B images (a), and coherent cluster of matched features after eliminating feature pairs irrelevant to red light violation (b).
Figure 5.11 Block diagram for the video‐based red light violation detection system based on Ref. [48].
Figure 5.12 Three situations in the cross junction. The VLDs cannot distinguish right turns (situation 3) from the red light violations (situations 1 and 2).
Figure 5.13 Flowchart of the tracking algorithm proposed in Ref. [49].
Chapter 06
Figure 6.1 Time–space traffic diagram.
Figure 6.2 Flow concentration function.
Figure 6.3 Typical/general framework for traffic flow analysis.
Figure 6.4 ROI/feature extraction.
Figure 6.5 System’s data flow diagram.
Figure 6.6 Vehicle segmentation result: (a) background subtraction results: modeled background (black), foreground object (white), shadow (light gray), or reflection highlights (dark gray); and (b) foreground image (morphologically dilated), detection zone (the dark lines are lane separators, while the dashed black line designates the bus stop area), and the vehicle detection lines (dark, gray, and light gray).
Figure 6.7 Kalman filter track labeling results.
Figure 6.8 (a) Google Earth image with manually labeled locations, (b) calibration reference image (middle), and (c) zoomed portion of (b). Circles and index numbers indicate the corresponding points and the asterisks indicate the reprojected points. The average reprojection error was 0.97 pixels for this view.
Figure 6.9 An input image and the shape spatial pyramid representation of HOG for a bus over three spatial scales: 9 features (level 0), 36 features (level 1), and 144 features (level 2).
Figure 6.10 3D wireframe models with true size (a) and samples of their projections on the ground plane (b).
Figure 6.11 Alternative point‐based representation of a vehicle’s location.
Figure 6.12 Semantic lane labeling, camera 2: (a) shows initial trajectories clustering results; (b) presents the refined results obtained using random sample consensus (RANSAC), the black “+” shows the fitted center line of each traffic lane; and (c) segmented and labeled lanes, boundaries, and the direction of traffic flow (arrows) derived from tracking results.
Figure 6.13 Plot of vehicle densities for each vehicle type sampled in 10‐min periods.
Figure 6.14 Plot of vehicle speed distributions for each vehicle type.
Figure 6.15 Illustration of some challenges in the analysis of traffic flow in developing countries (Pakistan).
Chapter 07
Figure 7.1 Example video frames from intersections which highlight the complexity of mixed participants and important events that occur at intersections: (a) group of pedestrians and (b) two jaywalkers.
Figure 7.2 Intersection monitoring system overview.
Figure 7.3 An example of Harris corners detected for an intersection scenario.
Figure 7.4 Tracking features of two frames using the KLT algorithm.
Figure 7.5 Four‐point correspondence between camera image plane and map‐aligned satellite image to estimate the homography (H) matrix and convert image locations to world latitude and longitude.
Figure 7.6 Scene preparation process for TM count: (a) zone definition and (b) models of typical paths.
Figure 7.7 Complete crossing event [23] defined by traversal between waiting zones (trapezoids): (a) pedestrian crossing and (b) pedestrian data recorded.
Figure 7.8 Crowd counting system [24].
Figure 7.9 Detection‐based queue length estimation uses nonmoving keypoints to estimate a queue line [28]. (a) Static corner points and (b) estimated queue length.
Figure 7.10 The queue length estimation (lines), detected stopped vehicles (bounding boxes), and feature points (points). Accumulated vehicle count regarding each lane and the number of waiting vehicles in the queue are displayed for each lane.
Figure 7.11 Tracking‐based queue length estimation: (a) estimated queue length and (b) waiting vehicles in queue.
Figure 7.12 Trajectory analysis for left‐turn prediction at intersection showing the probability of the top three best paths [34].
Figure 7.13 Crossing violation examples [36]. A spatial violation occurs when a pedestrian does not use the crosswalk. A temporal violation occurs when pedestrians cross the intersection during a “red‐hand” phase. (a) Spatial violation and (b) temporal violation.
Figure 7.14 Waiting‐time distribution and snapshot of tracking system which indicate a waiting pedestrian who talks with his phone for a long time period: (a) heatmap of waiting pedestrians and (b) snapshot of tracking system.
Figure 7.15 The process of collision inference from 3D pose tracking and prediction of impending collision [41]. (a) Pose refinement process and (b) collision judgment.
Figure 7.16 Traffic safety pyramid measurement showing hierarchy of traffic events (F, fatal; I, injury) [44].
Figure 7.17 Vehicle–vehicle conflict frequency heatmap (conflicts/m²) providing spatial danger assessment at Burrard and Pacific Intersection [46].
Figure 7.18 Two hazardous gap estimation scenarios. (a) Driver (SV) intends to make U‐turn on a flashing yellow, but waiting vehicles mask the driver view of a moving vehicle. (b) Parked cars (i.e., waiting vehicles) occlude the driver view (SV) when attempting to make a left turn at a junction.
Chapter 08
Figure 8.1 Example of a smart phone application developed for use by delivery trucks in the City of Vienna, Austria. Bright‐colored truck icons indicate free loading zones, dark icons indicate occupied zones, and mid‐gray color means “not in use.”
Figure 8.2 Examples of two parking area configurations. Ideally, a vision‐based system needs to be reconfigured only for these two parking area types: (a) a large parking space area and (b) parking spaces along the road.
Figure 8.3 The working of a combined color/interest point‐based feature detector and classifier. In an off‐line processing stage, the parking area is divided into single parking spaces, which are stored in the system as ROIs. Processing as shown in the figure takes place on each of the individual parking spaces.
Figure 8.4 The computation of HOG features on a sample location. Cells of size C × C pixels (C = 6, 8, typically) are combined into blocks; overlapping blocks span the whole sample window. The concatenated block histograms, after normalization, form the HOG detector feature vector.
Figure 8.5 The processing pipeline of the HOG vehicle detector as used by Bulan et al. Raw camera images are rectified first, on an ROI which has been precomputed using background subtraction for change detection. A sliding detector window is then run with the pretrained HOG detector.
Figure 8.6 Creation of the decimal encoding of an LBP.
Figure 8.7 The loading zones which had to be monitored for the pilot project in the City of Vienna. This is a nighttime image of the two empty parking spaces. Harsh lighting situations, reflections, and adjacent heavy traffic complicate the detection task.
Figure 8.8 Typical detection results using the combined HOG–LBP detector. The disadvantage of site‐ and camera‐specific training is offset by the excellent detection rate and very low false‐positive rate of this approach. (a) Presence of pedestrians does not produce false alarms. (b) Cars’ positive detections.
Figure 8.9 Data processing in a fully connected NN with input, intermediate, and output layer.
Figure 8.10 CNN configuration with input layer, two convolutional layers, fully connected layer, and output layer. The input layer is a normalized grayscale input image. Each of the convolutional layers consists of several feature maps; the output layer represents the classification results.
Figure 8.11 The volume modeling method from Delibaltov: (a) an example image of a parking space, (b) the parking layout from manual setup, (c) region of 2D parking spaces, and (d) regions of 3D parking space volume.
Figure 8.12 Theoretical accuracies for several practical camera configurations.
Figure 8.13 Computation of the back‐matching distance using the disparity images disp LR and disp RL. The back‐matching distance is the distance between point Pl and its back‐matched counterpart.
Figure 8.14 The data flow of the 3D vehicle detection pipeline.
Figure 8.15 Stereo matching for vehicle detection. This image sequence has been generated from a system test in the City of Graz prior to installation. (a) One camera image of the stereo rig, resulting stereo disparities are depicted in (b), and the resulting interpolated 3D point cloud is shown in (c). For vehicle detection, the volume of the vehicle (shown as cube in (c)) is approximated and its boundaries are tested against a virtual fence, which is shown as a vertical plane left of the cube in (c).
Figure 8.16 The result of 3D reconstruction by Cook [49]. (a) Empty parking space with colored markers. (b) Reconstruction example—scenario with four vehicles.
Chapter 09
Figure 9.1 Flow diagram of video anomaly detection.
Figure 9.2 Taxonomy of anomaly detection scenarios.
Figure 9.3 Object trajectory generation for anomaly detection.
Figure 9.4 Hidden Markov model characterized by initial state probability π, state transition matrix A, and state output matrix B.
Figure 9.5 Contextual anomaly detection model: (a) input video captured at busy traffic scene, (b) motion label Lt, (c) background behavior image Bmax(x), and (d) anomaly detected via behavior subtraction.
Figure 9.6 An example illustration of sparsity‐based trajectory classification.
Figure 9.7 An example illustration of anomalous event versus normal event.
Figure 9.8 (a) Structured scenario and (b) unstructured scenario.
Figure 9.9 Example frames of single‐object anomalies: (a) a man suddenly falls on floor—from the CAVIAR data set and (b) a driver backs his car in front of stop sign—from the Xerox Stop Sign data set.
Figure 9.10 Example frames of multiple‐object anomalies: (a) a vehicle almost hits a pedestrian—from the AVSS data set, (b) a car violates the stop sign rule—from the Xerox Stop Sign data set, and (c) a car fails to yield to oncoming car while turning left—from the Xerox Intersection data set.
Figure 9.11 (a and b) Example anomaly in PDTV data, (c and d) example anomaly in Xerox Stop Sign data set, and (e and f) example frames that show object occlusion.
Figure 9.12 Detection rate curves with respect to the value of λ.
Chapter 10
Figure 10.1 Overview of the pedestrian detection algorithms as described in Ref. [2].
Figure 10.2 Background subtraction: (a) original video, (b) background, (c) filtered foreground, and (d) foreground. We can see that the extracted foreground figure has shadow attached because the shadow is also a moving object.
Figure 10.3 Pixels of a video sequence processed for background/foreground detection, capturing changes occurring due to the passing man.
Figure 10.4 An example of the morphological operations to fill the holes in the object for better contour extraction.
Figure 10.5 A 2D shape and its polar coordinate profile (r, θ) plot. Here, min r values are chosen.
Figure 10.6 Overview of the fusion‐based pedestrian‐based methods.
Figure 10.7 Overview of the algorithm flow followed in our implementation.
Figure 10.8 Stages for LiDAR module.
Figure 10.9 Cartesian and polar coordinate representation of a point pi.
Figure 10.10 (a) LiDAR range points in Cartesian plane, all layers grouped together (x‐ and y‐axis as defined in Figure 10.9); (b) the image corresponding to the LiDAR scan (x‐ and y‐axis corresponding to image plane); (c) range points outside a threshold (FOV) in x‐ and y‐axis are discarded; (d) segmentation using the jump distance clustering method; (e) number of data points per segment; (f) the length‐to‐width ratio calculation of each segment; (g) segment labels; (h) ROI in LiDAR plane; (i) corresponding ROI in image plane; and (j) bounding box displaying the ROI in the image.
Figure 10.11 Stages for vision module.
Figure 10.12 Samples of the training data: (a) positive training images and (b) negative training images.
Figure 10.13 An illustration of segment labeling and its corresponding image plane objects. Here, NP, nonpedestrian segment; PL, pedestrian‐like segment.
Figure 10.14 (a) The original frame, (b) cropped FOV, and (c) sliding window in the FOV of a predefined size.
Figure 10.15 Examples of HOG–SVM‐based vision module results for pedestrian detection. Images at the top are the input images and the images at the bottom are the windows identified as containing pedestrians.
Figure 10.16 Examples of fusion scheme–based pedestrian detection. Along with detection, we have also measured the distance for impact.
Chapter 11
Figure 11.1 Lane departure warning systems.
Figure 11.2 LDWS overview.
Figure 11.3 LDWS algorithm structure.
Figure 11.4 Lane detection.
Figure 11.5 First and second derivatives around an edge.
Figure 11.6 Original image.
Figure 11.7 Output from Sobel operator.
Figure 11.8 Output from Laplacian of Gaussian operator.
Figure 11.9 Output from Canny operator.
Figure 11.10 FOM: ideal image: (a) original image and (b) ideal image.
Figure 11.11 Stripes identification via edge distribution function.
Figure 11.12 Edge distribution function.
Figure 11.13 Hough transform: (a) image plane—parameters plane transformation; (b) single‐point transformation; (c) three‐point transformation; and (d) straight‐line transformation.
Figure 11.14 Straight‐line identification via Hough transform.
Figure 11.15 Linear‐parabolic fitting: (a) Near‐ and far‐field image separation and (b) lane boundary region of interest (LBROI).
Figure 11.16 Algorithm implementation for lane detection and tracking.
Figure 11.17 Driving scenario: (a) vehicle position, (b) longitudinal speed, (c) lateral speed, (d) longitudinal acceleration, and (e) lateral acceleration.
Figure 11.18 Orientation of the right and left lane stripes.
Chapter 12
Figure 12.1 Vision‐based driver assistance systems.
Figure 12.2 Lane and vehicle detection techniques when seen in isolation.
Figure 12.3 (a) Sliding window of multiple sizes used across the image and (b) relationship between sliding window and the position of the leading vehicles.
Figure 12.4 Generating the LUT of windows and positions using lane and IPM information.
Figure 12.5 Two‐part‐based vehicle detection method in ELVIS.
Figure 12.6 Improving lane feature extraction using vehicle positions.
Figure 12.7 ROCs for detecting fully visible vehicles using the proposed two‐part‐based method.
Figure 12.8 Sample results of vehicle detection in ELVIS.
Figure 12.9 Comparison of computation costs in terms of number of windows (in log scale) on the y‐axis and the following different techniques on the x‐axis: (1) sliding window approach such as [7] bound by ymax and ymin, (2) sliding window approach on the entire image [7], (3) proposed method considering both parts of all windows are computed, and (4) proposed method considering both parts are computed for 10% of the total windows.
Figure 12.10 The first column shows lane detection on a sequence using the conventional LASeR algorithm. The second column shows the results of lane detection using the proposed integrated approach.
Chapter 13
Figure 13.1 Schematic of a driver monitoring system utilizing cameras, sensors, and vehicle telematics.
Figure 13.2 Haar features used in the Viola–Jones face detector. The sum of pixel values within white rectangles is subtracted from the sum of pixels within gray rectangles: (a) horizontal two‐rectangle feature, (b) vertical two‐rectangle feature, (c) three‐rectangle feature, and (d) four‐rectangle feature.
Figure 13.3 Schematic illustration of AdaBoost. Three weak classifiers h1(⋅), h2(⋅), and h3(⋅) are trained on 2D training samples xi, where + and − denote positive and negative classes, respectively. The first classifier operates only on the vertical dimension, while the second and third classifiers operate on the horizontal dimension. The final strong classifier formed as a weighted combination of the weak classifiers produces a more complex decision boundary, accurately separating the two classes. A desirable property of AdaBoost is that it is robust to overfitting even in the case when the input features are of very high dimensionality and the number of training samples is small.
Figure 13.4 Block diagram of driver eye detection using dual active IR illumination.
Figure 13.5 Embedding of image poses in a manifold in two‐dimensional feature space.
Figure 13.6 Fine‐grain gaze locations supported by the algorithm in Ref. [8]. The fine estimate operates only on frontal poses from the first stage and classifies gaze into one of eight directions that are prevalent in a driving task. The selected directions are left rear mirror (1), road (2), dashboard (3), traffic sign (4), top mirror (5), phone/texting (6), music console (7), and right rear mirror (8).
Figure 13.7 Flow diagram of the algorithm in Ref. [8].
Figure 13.8 Features used for fine‐grain gaze classification by the algorithm in Ref. [8].
Figure 13.9 Generalizability of different classifiers proposed in Ref. [8].
Figure 13.10 Examples of action units in the Facial Action Coding System.
Figure 13.11 Hierarchy of data fusion at various levels of signal abstraction, described by Shivappa et al.
Chapter 14
Figure 14.1 The Swedish traffic signs. (a) Warning signs, (b) prohibitory signs, (c) mandatory signs and (d–f) indicatory and supplementary signs.
Figure 14.2 Potential challenges when working with traffic signs. (a) Faded sign, (b) bad weather condition, (c) bad lighting geometry, (d) obstacles in the scene, (e) similar background colour, (f) damaged sign, (g) distance related size, (h) motion blur, (i) reflection from sign board and (j) stickers.
Figure 14.3 Different countries have different colour standards. (a) The Netherlands and (b) Sweden.
Figure 14.4 Traffic sign pictograms are different in different countries. (a) Austria and (b) Sweden.
Figure 14.5 Block diagram of traffic sign recognition system.
Figure 14.6 Traffic sign tree based on colour and shape information.
Figure 14.7 Effect of aging. (a) A new traffic sign. (b) An old traffic sign.
Figure 14.8 Hue–Saturation plot of new and old traffic signs.
Figure 14.9 Hue–Saturation of traffic sign colours of different countries.
Figure 14.10 Traffic sign reflection model.
Figure 14.11 Dynamic search boundary by the proposed algorithm for the red colour.
Figure 14.12 Colour segmentation results of traffic sign images in different light conditions.
Figure 14.13 Traffic sign images from different countries. (Top left to bottom right) Austria, France, Germany, Japan, The Netherlands, and Poland.
Figure 14.14 Circular Hough transform voting mechanism.
Figure 14.15 Property curves of a circle (a) and a triangle (b).
Figure 14.16 Smoothed property curves by LOWESS regression.
Figure 14.17 Results of rim detection.
Figure 14.18 Binary Haar features of different traffic sign rims.
Figure 14.19 Steps of pictogram extraction.
Figure 14.20 Computing the HOG descriptors of a pictogram.
Figure 14.21 The training dataset includes non‐traffic sign objects and pictograms.
Figure 14.22 Classification errors versus number of features.
Figure 14.23 Classification rate of the Gentle AdaBoost trained with vertically aligned signs versus angle of rotation of traffic signs.
Figure 14.24 Classification rate of the Gentle AdaBoost trained with rotated signs versus angle of rotation of traffic signs.
Figure 14.25 Training and testing time of the classifier with HOG descriptors.
Figure 14.26 Correctly classified speed limit traffic signs.
Chapter 15
Figure 15.1 IR light source and stereo camera inside housing.
Figure 15.2 The camera system software algorithm for road friction estimation.
Figure 15.3 User interface of road condition monitoring system.
Figure 15.4 A mechanical representation of an oscillator in an isotropic material where the negatively charged shell is fastened to a stationary positive nucleus by identical springs.
Figure 15.5 (a) Reflectance spectrum for dry asphalt and asphalt covered with water, ice, and snow. (b) Reflectance spectrum for five different depths of water on asphalt.
Figure 15.6 Basic measuring principle of fog detection system with a laser scanner.
Figure 15.7 Targets as seen in various fog densities in the fog chamber.
Figure 15.8 Algorithm description of the fog detection system.
Figure 15.9 Average visibility distance measured by the laser scanner in meters. Two areas are marked during which there was dense snowfall.
Figure 15.10 IcOR’s stereo camera located inside the test vehicle.
Figure 15.11 Road eye and IcOR measuring principles.
Figure 15.12 Road condition classifications for the Road eye and a benchmarking sensor for different road conditions.
Figure 15.13 An example plot from a PCA of road condition data for different road conditions. The different road conditions can be seen to be grouped, and by looking at more of the principal components the groupings for some of the road conditions can be seen more clearly.
Figure 15.14 A method for developing road condition classifiers. In the laboratory, a dataset is used to train and validate the classifiers. In the field, where the environmental conditions are more variable, two datasets are used: one for training and validation of the classifiers and another for testing. The test dataset from field experiments shows the performance of the classifiers.
Figure 15.15 Sensor data fusion module for merging in‐vehicle inertia and optical measurement data.
Figure 15.16 Architecture of the infrastructure subsystem.
Figure 15.17 Vaisala’s road weather sensors on the roof of the test vehicle.
Figure 15.18 Polarization difference.
Figure 15.19 Graininess levels.
Edited by
Robert P. Loce
Conduent Labs, Webster, NY, USA
Raja Bala
Samsung Research America, Richardson, TX, USA
Mohan Trivedi
University of California, San Diego, CA, USA
This edition first published 2017
© 2017 John Wiley & Sons Ltd
Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging‐in‐Publication data applied for
ISBN: 9781118971604
A catalogue record for this book is available from the British Library.
Cover design: Wiley
Cover image: © Anouar Akrouh / Eyeem / Gettyimages
Raja Bala, Samsung Research America, Richardson, TX, USA
Edgar A. Bernal, United Technologies Research Center, East Hartford, CT, USA
Orhan Bulan, General Motors Technical Center, Warren, MI, USA
Aaron Burry, Conduent Labs, Webster, NY, USA
Yang Cai, Carnegie Mellon University, Pittsburgh, PA, USA
Gianni Cario, Department of Informatics, Modeling, Electronics and System Engineering (DIMES), University of Calabria, Rende, Italy
Alessandro Casavola, Department of Informatics, Modeling, Electronics, and System Engineering (DIMES), University of Calabria, Rende, Italy
Johan Casselgren, Luleå University of Technology, Luleå, Sweden
Zezhi Chen, Kingston University, London, UK
Shashank Deshpande, Carnegie Mellon University, Pittsburgh, PA, USA
Timothy J. Ellis, Kingston University, London, UK
Rodrigo Fernandez, Universidad de los Andes, Santiago, Chile
Hasan Fleyeh, Dalarna University, Falun, Sweden
Patrik Jonsson, Combitech AB, Östersund, Sweden
Vladimir Kozitsky, Conduent Labs, Webster, NY, USA
Matti Kutila, VTT Technical Research Centre of Finland Ltd., Tampere, Finland
Yuriy Lipetski, SLR Engineering GmbH, Graz, Austria
Robert P. Loce, Conduent Labs, Webster, NY, USA
Marco Lupia, Department of Informatics, Modeling, Electronics and System Engineering (DIMES), University of Calabria, Rende, Italy
Vishal Monga, Pennsylvania State University, University Park, PA, USA
Brendan Tran Morris, University of Nevada, Las Vegas, NV, USA
Wiktor Muron, Carnegie Mellon University, Pittsburgh, PA, USA
Peter Paul, Conduent Labs, Webster, NY, USA
Pasi Pyykönen, VTT Technical Research Centre of Finland Ltd., Tampere, Finland
Ravi Satzoda, University of California, San Diego, CA, USA
Mohammad Shokrolah Shirazi, Cleveland State University, Cleveland, OH, USA
Oliver Sidla, SLR Engineering GmbH, Graz, Austria
Mohan Trivedi, University of California, San Diego, CA, USA
Sergio A. Velastin, Universidad Carlos III de Madrid, Madrid, Spain
Wencheng Wu, Conduent Labs, Webster, NY, USA
Beilei Xu, Conduent Labs, Webster, NY, USA
Muhammad Haroon Yousaf, University of Engineering and Technology, Taxila, Pakistan
There is a worldwide effort to develop smart transportation networks that can provide travelers with enhanced safety and comfort, reduced travel time and cost, energy savings, and effective traffic law enforcement. Computer vision and imaging are playing a pivotal role in this transportation evolution. The forefront of this technological evolution can be seen in the growth of scientific publications and conferences produced through substantial university and corporate research laboratory projects. The editors of this book have assembled topics and authors that are representative of the core technologies coming out of these research projects. This text offers the reader a broad, comprehensive exposition of computer vision technologies addressing important roadway transportation problems. Each chapter is authored by world‐renowned authorities and covers a specific transportation application, the practical challenges involved, a broad survey of state‐of‐the‐art approaches, an in‐depth treatment of a few exemplary techniques, and a perspective on future directions. The material is presented in a lucid tutorial style, balancing fundamental theoretical concepts and pragmatic real‐world considerations. Each chapter ends with an abundant collection of references for the reader requiring additional depth.
The book is intended to benefit researchers, engineers, and practitioners of computer vision, digital imaging, automotive, and civil engineering working on intelligent transportation systems. Urban planners, government agencies, and other decision‐ and policy‐making bodies will also benefit from an enhanced awareness of the opportunities and challenges afforded by computer vision in the transportation domain. While each chapter provides the background required to understand a given problem and application, it is helpful for the reader to have some familiarity with the fundamental concepts in image processing and computer vision. For those who are entirely new to this field, appropriate background reading is recommended in Chapter 1. It is hoped that the material presented in the book will not only enhance the reader’s knowledge of today’s state of the art but also prompt new and yet‐unconceived applications and solutions for transportation networks of the future.
The text is organized into Chapter 1, which provides a brief overview of the field, and Chapters 2–15, which are divided into two parts. In Part I, Chapters 2–9 present applications relying upon the infrastructure, that is, cameras installed on roadway structures such as bridges, poles, and gantries. In Part II, Chapters 10–15 discuss techniques to monitor driver and vehicle behavior from cameras and sensors placed within the vehicle.
In Chapter 2, Burry and Kozitsky present the problem of license plate recognition, a fundamental technology that underpins many transportation applications, notably ones pertaining to law enforcement. The basic computer vision pipeline and state‐of‐the‐art solutions for plate recognition are described. Muron, Deshpande, and Cai present automatic vehicle classification (AVC) in Chapter 3. AVC is a method for automatically categorizing motor vehicles based on predominant characteristics such as length, height, axle count, existence of a trailer, and specific contours. AVC is also an important part of intelligent transportation systems (ITS) in applications such as automatic toll collection, management of traffic density, and estimation of road usage and wear.
Chapters 2, 4, 5, and 8 present aspects of law enforcement based on imaging from the infrastructure. Detection of passenger compartment violations is presented in Chapter 4 by Bulan, Xu, Loce, and Paul. The chapter presents imaging systems capable of gathering passenger compartment images and computer vision methods of extracting the desired information from the images. The applications it presents are detection of seat belt usage, detection of mobile phone usage, and occupancy detection for high‐occupancy lane tolling and violation detection. The chapter also covers several approaches, while providing depth on a classification‐based method that is yielding very good results. Detection of moving violations is presented in Chapter 5 by Wu, Bulan, Bernal, and Loce. Two prominent applications—speed detection and stop light/sign enforcement—are covered in detail, while several other violations are briefly reviewed.
A major concern for urban planners is traffic flow analysis and optimization. In Chapter 6, Fernandez, Yousaf, Ellis, Chen, and Velastin present a model for traffic flow from a transportation engineer’s perspective. They consider flow analysis using computer vision techniques, with emphasis given to specific challenges encountered in developing countries. Intersection monitoring is taught by Morris and Shirazi in Chapter 7 for understanding capacity, delay, and safety. Intersections are planned conflict points with complex interactions of vehicles, pedestrians, and bicycles. Vision‐based sensing and computer vision analysis bring a depth of understanding that other sensing modalities alone cannot provide. In Chapter 8, Sidla and Lipetski examine the state of the art in visual parking space monitoring. Automatic parking management is becoming an increasingly essential task: the number of annually produced cars has grown by 55% in the past 7 years, and most large cities suffer from insufficient availability of parking space. Automatic determination of available parking space coupled with a communication network holds great promise in alleviating this urban burden.
While computer vision algorithms can be trained to recognize common patterns in traffic, vehicle, and pedestrian behavior, it is often an unusual event such as an accident or traffic violation that warrants special attention and action. Chapter 9 by Bala and Monga is devoted to the problem of detecting anomalous traffic events from video. A broad survey of state‐of‐the‐art anomaly detection models is followed by an in‐depth treatment of a robust method based on sparse signal representations.
In Part II of the text, attention is turned to in‐vehicle imaging and analysis. Chapters 10–12 focus on technologies that are being applied to driver assistance systems. Chapter 10 by Deshpande and Cai deals with the problem of detecting pedestrians from road‐facing cameras installed on the vehicle. Pedestrian detection is critical to intelligent transportation systems, ranging from autonomous driving to infrastructure surveillance, traffic management, transit safety and efficiency, and law enforcement. In Chapter 11, Casavola, Cario, and Lupia present the lane detection (LD) and lane tracking (LT) problems arising in lane departure warning systems (LDWSs). LDWSs are a specific form of advanced driver assistance system (ADAS) designed to help the driver stay in the lane by providing sufficient advance warning that an imminent and possibly unintentional lane departure is about to take place, so that the necessary corrective measures can be taken. Chapter 12 by Satzoda and Trivedi teaches the technologies associated with vision‐based integrated techniques for collision avoidance systems. The chapter surveys related technologies and focuses on an integrated approach called efficient lane and vehicle detection using integrated synergies (ELVIS), which incorporates lane information to detect vehicles more efficiently in an informed manner using a novel two‐part‐based vehicle detection technique.
Driver inattention is a major cause of traffic fatalities worldwide. Chapter 13 by Bala and Bernal presents an overview of in‐vehicle technologies to proactively monitor driver behavior and provide appropriate feedback and intervention to enhance safety and comfort. A broad survey of the state of the art is complemented with a detailed treatment of a few selected driver monitoring techniques including methods to fuse video with nonvisual data such as motion and bio‐signals.
Traffic sign recognition is presented in Chapter 14 by Fleyeh. Sign recognition is a field concerned with the detection and recognition of traffic signs in traffic scenes as acquired by a vehicle‐mounted camera. Computer vision and artificial intelligence are used to extract the traffic signs from outdoor images taken in uncontrolled lighting conditions, where the signs may be occluded by other objects and may suffer from various problems such as color fading, disorientation, and variations in shape and size. The techniques can be used either to aid the development of an inventory system (for which real‐time recognition is not required) or to aid the development of an in‐car advisory system (for which real‐time recognition is necessary). Road condition monitoring is presented in Chapter 15 by Kutila, Pyykönen, Casselgren, and Jonsson. The chapter reviews proposed measurement principles in the road traction monitoring area and provides examples of sensor solutions that are feasible for vehicle on‐board and road sensing. The chapter also reviews opportunities to improve performance through sensor data fusion and discusses future opportunities. An enhanced eBook is also available with integrated video demonstrations to further explain the concepts discussed in the book.
Robert P. LoceRaja Bala
March 2017
We would like to take this opportunity to thank each and every author for their painstaking efforts in preparing chapters of high quality, depth, and breadth, and also for working collaboratively with the editors to assemble a coherent text in a timely manner.
Don’t forget to visit the companion website for this book:
www.wiley.com/go/loce/ComputerVisionandImaginginITS
There you will find valuable material designed to enhance your learning, including:
Videos
Figures
Raja Bala1 and Robert P. Loce2
1Samsung Research America, Richardson, TX, USA
2Conduent Labs, Webster, NY, USA
With rapid advances in driver assistance features leading ultimately to autonomous vehicle technology, the automobile of the future is increasingly relying on advances in computer vision for greater safety and convenience. At the same time, providers of transportation infrastructure and services are expanding their reliance on computer vision to improve safety and efficiency in transportation and to address a range of problems, including traffic monitoring and control, incident detection and management, road use charging, and road condition monitoring. Computer vision is thus helping to simultaneously solve critical problems at both ends of the transportation spectrum: at the consumer level and at the level of the infrastructure provider. The book aims to provide a comprehensive survey of methods and systems that use both infrastructural and in‐vehicle computer vision technology to address key transportation applications in the following three broad problem domains: (i) law enforcement and security, (ii) efficiency, and (iii) driver safety and comfort. Table 1.1 lists the topics addressed in the text under each of these three domains.
Table 1.1 Taxonomy of problem domains and applications described in the book.
Problem domains | Applications and methods | Imaging system employed
Law enforcement and security | License plate recognition for violations | Infrastructure
Law enforcement and security | Vehicle classification | Infrastructure
Law enforcement and security | Passenger compartment violation detection | Infrastructure, in‐vehicle
Law enforcement and security | Moving violation detection | Infrastructure
Law enforcement and security | Intersection monitoring | Infrastructure
Law enforcement and security | Video anomaly detection | Infrastructure
Efficiency | Traffic flow analysis | Infrastructure
Efficiency | Parking management | Infrastructure, in‐vehicle
Efficiency | License plate recognition for tolling | Infrastructure
Efficiency | Passenger compartment occupancy detection | Infrastructure, in‐vehicle
Driver safety and comfort | Lane departure warning | In‐vehicle
Driver safety and comfort | Collision avoidance | In‐vehicle
Driver safety and comfort | Pedestrian detection | In‐vehicle
Driver safety and comfort | Driver monitoring | In‐vehicle
Driver safety and comfort | Traffic sign recognition | In‐vehicle
Driver safety and comfort | Road condition monitoring | In‐vehicle, infrastructure
This chapter introduces and motivates applications in the three problem domains and establishes a common computer vision framework for addressing problems in these domains.
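As a rough, illustrative sketch of how such a framework might be organized in software, the Python fragment below strings together the typical stages of acquisition, preprocessing, detection, tracking, and application-specific analysis, loosely echoing the pipeline depicted in Figure 1.3. All function bodies, class names, and parameters here are hypothetical placeholders, not an implementation taken from this book.

```python
# Minimal sketch of a generic computer vision pipeline for transportation
# applications. Every stage below is a hypothetical placeholder; a real
# system would plug in detectors, trackers, and analytics suited to the
# application (license plate recognition, speed enforcement, parking, ...).

from dataclasses import dataclass, field
from typing import List

@dataclass
class Detection:
    frame_id: int
    bbox: tuple      # (x, y, width, height) in pixels
    label: str       # e.g., "car", "pedestrian"
    score: float     # detector confidence

@dataclass
class Track:
    track_id: int
    detections: List[Detection] = field(default_factory=list)

def preprocess(frame):
    """Placeholder: denoising, resizing, or illumination correction."""
    return frame

def detect_objects(frame, frame_id: int) -> List[Detection]:
    """Placeholder: background subtraction, HOG + SVM, or a CNN detector."""
    return []

def update_tracks(tracks: List[Track], detections: List[Detection]) -> List[Track]:
    """Placeholder: data association plus, e.g., Kalman filtering."""
    return tracks

def analyze(tracks: List[Track]) -> list:
    """Placeholder: application rules (violations, counts, flow statistics)."""
    return []

def run_pipeline(frames) -> list:
    """Run the full acquisition-to-decision chain over a frame sequence."""
    tracks: List[Track] = []
    events = []
    for frame_id, frame in enumerate(frames):
        processed = preprocess(frame)
        detections = detect_objects(processed, frame_id)
        tracks = update_tracks(tracks, detections)
        events.extend(analyze(tracks))
    return events
```

The point of the sketch is only that the same skeleton is reused across the applications in Table 1.1, with the detection and analysis stages swapped out per domain.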
Law enforcement and security are critical elements in maintaining the well‐being of individuals and the protection of property. Societies rely on law enforcement agencies to provide these elements. Imaging systems and computer vision are means to sense and interpret situations in a manner that can amplify the effectiveness of officers within these agencies. There are several common elements shared by computer vision law enforcement and security applications, such as the detection and identification of events of interest. On the other hand, there are also distinctions that separate a security application from a law enforcement application. For instance, prediction and prevention are important for security applications, while accuracy and evidence are essential for law enforcement. In many cases, modules and components of a security system serve as a front end of a law enforcement system. For example, to enforce a certain traffic violation, it is necessary to first detect and identify the occurrence of that event.
Consider the impact of moving vehicle violations and examples of the benefits enabled by computer vision law enforcement systems. There is a strong relationship between excessive speed and traffic accidents. In the United States in 2012, speeding was a contributing factor in 30% of all fatal crashes (10,219 lives) [1]. The economic cost of speeding‐related crashes was estimated to be $52 billion in 2010 [2]. In an extensive review of international studies, automated speed enforcement was estimated to reduce injury‐related crashes by 20–25% [3]. The most commonly monitored moving violations include speeding, running red lights or stop signs, wrong‐way driving, and illegal turns. Most traffic law enforcement applications in roadway computer vision systems involve analyzing well‐defined trajectories and the speeds within those trajectories, which leads to clearly defined rules and detections. In some cases, the detections are binary, such as in red light enforcement (stopped or passed through). Other applications require increased accuracy and precision, such as detecting speed violations and applying a fine according to the estimated vehicle speed. There are other deployed applications where the violation involves a less definitive criterion, such as reckless driving; a simple sketch of the speed case is given below.
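To make the idea of "well‐defined trajectories and clearly defined rules" concrete, the hedged Python sketch below estimates a vehicle's average speed from a tracked trajectory and compares it against a posted limit. It assumes that image positions have already been mapped to road‐plane coordinates in meters (for example, via a calibrated homography) and that frame timestamps are available; the function names, tolerance value, and example numbers are illustrative assumptions, not figures taken from the chapters that follow.

```python
# Hedged sketch: speed-violation check from a tracked vehicle trajectory.
# Assumes positions are already in road-plane coordinates (meters), e.g.
# obtained through camera calibration or a homography, and that each sample
# carries a timestamp in seconds. Names and tolerances are illustrative.

import math
from typing import List, Tuple

TrackPoint = Tuple[float, float, float]  # (t_seconds, x_meters, y_meters)

def average_speed_kmh(trajectory: List[TrackPoint]) -> float:
    """Average speed over the trajectory, in km/h."""
    if len(trajectory) < 2:
        return 0.0
    distance = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(trajectory, trajectory[1:]):
        distance += math.hypot(x1 - x0, y1 - y0)
    elapsed = trajectory[-1][0] - trajectory[0][0]
    return 3.6 * distance / elapsed if elapsed > 0 else 0.0

def is_speed_violation(trajectory: List[TrackPoint],
                       limit_kmh: float,
                       tolerance_kmh: float = 5.0) -> bool:
    """Flag a violation only when the estimate exceeds the limit plus a margin,
    reflecting the higher accuracy demanded of fine-issuing applications."""
    return average_speed_kmh(trajectory) > limit_kmh + tolerance_kmh

# Example: a vehicle observed over 2 s covering about 40 m (~72 km/h)
# in a hypothetical 50 km/h zone.
track = [(0.0, 0.0, 0.0), (1.0, 20.0, 0.5), (2.0, 40.0, 1.0)]
print(average_speed_kmh(track), is_speed_violation(track, limit_kmh=50.0))
```

The binary red‐light case would replace the speed rule with a simpler test on whether a tracked vehicle crosses the stop line during the red phase.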
Several moving violations require observation into the passenger compartment of a vehicle. Failure to wear a seat belt and operating a handheld cell phone while driving are two common safety‐related passenger compartment violations. The seat belt is the single most effective traffic safety device for preventing death and injury to persons involved in motor vehicle accidents. Cell phone usage alone accounts for roughly 18% of car accidents caused by distracted drivers [4]. In addition, the National Highway Traffic Safety Administration (NHTSA) describes other behaviors resulting in distracted driving, including occupants in the vehicle eating, drinking, smoking, adjusting the radio, adjusting environmental controls, and reaching for an object in the car. The conventional approach to enforcement of passenger compartment violations has been through traffic stops by law enforcement officers. This approach faces many challenges, such as safety, traffic disruption, significant personnel cost, and the difficulty of determining cell phone or seat belt usage at high speed. Imaging technology and computer vision can provide automated or semiautomated enforcement of these violations.
Security of individuals and property is another factor in the monitoring of transportation networks. Video cameras have been widely used for this purpose due to their low cost, ease of installation and maintenance, and ability to provide rich and direct visual information to operators. The use of video cameras enables centralized operations, making it possible for an operator to “coexist” at multiple locations. It is also possible to go back in time and review events of interest. Many additional benefits can be gained by applying computer vision technologies within a camera network. Consider that, traditionally, the output of security cameras has either been viewed and analyzed in real time by human operators or archived for later use if certain events have occurred. The former is error prone and costly, while the latter forgoes critical capabilities such as prediction and prevention. In a medium‐sized city with several thousand roadway cameras, computer vision and video analytics allow a community to fully reap the benefits of analyzing this massive amount of information, highlighting critical events in real time or in later forensic analysis.
In certain security and public safety applications, very rapid analysis of large video databases can aid in a critical life‐or‐death situation. An Amber Alert or a Child Abduction Emergency is an emergency alert system to promptly inform the public when a child has been abducted; it has been successfully implemented in several countries throughout the world. When sufficient information is available about the incident (e.g., a description of the captor’s vehicle, plate number, and color), a search can be conducted across large databases of video acquired from highway, local road, traffic light, and stop sign monitoring to track and find the child. Similar to an Amber Alert, and much more common, is a Silver Alert, which is a notification issued by local authorities when a senior citizen or mentally impaired person is missing. Statistics indicate that it is highly desirable for an Amber/Silver Alert‐related search to be conducted in a very fast and efficient manner, as 75% of the abducted are murdered within the first 3 h. Consider a statement from the US West Virginia code on Amber Alert 15‐3A‐7:
