Image processing plays an important role in our daily lives, with applications in social media (face detection), medical imaging (X-ray, CT scan), security (fingerprint recognition), robotics, and space. This book covers the core of image processing, from concepts to code, using Python.
The book will start from the classical image processing techniques and explore the evolution of image processing algorithms up to the recent advances in image processing or computer vision with deep learning. We will learn how to use image processing libraries such as PIL, scikit-image, and scipy ndimage in Python. This book will enable us to write code snippets in Python 3 and quickly implement complex image processing algorithms such as image enhancement, filtering, segmentation, object detection, and classification. We will be able to use machine learning models using the scikit-learn library and later explore deep CNNs, such as VGG-19 with Keras, and we will also use an end-to-end deep learning model called YOLO for object detection. We will also cover a few advanced problems, such as image inpainting, gradient blending, variational denoising, seam carving, quilting, and morphing.
By the end of this book, we will have learned to implement various algorithms for efficient image processing.
Number of pages: 356
Year of publication: 2018
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Pravin Dhandre
Acquisition Editor: Devika Battike
Content Development Editor: Unnati Guha
Technical Editor: Dinesh Chaudhary
Copy Editor: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Jisha Chirayil
Production Coordinator: Shraddha Falebhai
First published: November 2018
Production reference: 1301118
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-78934-373-1
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Sandipan Dey is a data scientist with a wide range of interests, covering topics such as machine learning, deep learning, image processing, and computer vision. He has worked in numerous data science fields, working with recommender systems, predictive models for the events industry, sensor localization models, sentiment analysis, and device prognostics. He earned his master's degree in computer science from the University of Maryland, Baltimore County, and has published in a few IEEE Data Mining conferences and journals. He has earned certifications from 100+ MOOCs on data science, machine learning, deep learning, image processing, and related courses/specializations. He blogs regularly (sandipanweb) and is a machine learning education enthusiast.
Nikhil Borkar holds a CQF designation and a postgraduate degree in quantitative finance. He also holds the certified financial crime examiner and certified anti-money laundering professional qualifications. He is a registered research analyst with the Securities and Exchange Board of India (SEBI) and has a keen grasp of the Indian regulatory landscape pertaining to securities and investments. He is currently working as an independent FinTech and legal consultant. Prior to this, he worked with Morgan Stanley Capital International (MSCI) as a global RFP project manager.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Image Processing with Python
Dedication
About Packt
Why subscribe?
Packt.com
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Disclaimer
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Image Processing
What is image processing and some applications
What is an image and how it is stored on a computer
What is image processing?
Some applications of image processing
The image processing pipeline
Setting up different image processing libraries in Python
Installing pip
Installing some image processing libraries in Python
Installing the Anaconda distribution
Installing Jupyter Notebook
Image I/O and display with Python
Reading, saving, and displaying an image using PIL
Providing the correct path to the images on the disk
Reading, saving, and displaying an image using Matplotlib
Interpolating while displaying with Matplotlib imshow()
Reading, saving, and displaying an image using scikit-image
Using scikit-image's astronaut dataset
Reading and displaying multiple images at once
Reading, saving, and displaying an image using scipy misc
Using scipy.misc's face dataset
Dealing with different image types and file formats and performing basic image manipulations
Dealing with different image types and file formats
File formats
Converting from one file format to another
Image types (modes)
Converting from one image mode into another
Some color spaces (channels)
Converting from one color space into another
Data structures to store images
Converting image data structures
Basic image manipulations
Image manipulations with numpy array slicing 
Simple image morphing - α-blending of two images using cross-dissolving
Image manipulations with PIL
Cropping an image
Resizing an image
Negating an image
Converting an image into grayscale
Some gray-level transformations
Some geometric transformations
Changing pixel values of an image
Drawing on an image
Drawing text on an image
Creating a thumbnail
Computing the basic statistics of an image
Plotting the histograms of pixel values for the RGB channels of an image
Separating the RGB channels of an image 
Combining multiple channels of an image
α-blending two images
Superimposing two images
Adding two images
Computing the difference between two images
Subtracting two images and superimposing two image negatives
Image manipulations with scikit-image
Inverse warping and geometric transformation using the warp() function
Applying the swirl transform
Adding random Gaussian noise to images
Computing the cumulative distribution function of an image 
Image manipulation with Matplotlib
Drawing contour lines for an image
Image manipulation with the scipy.misc and scipy.ndimage modules
Summary
Questions
Further reading
Sampling, Fourier Transform, and Convolution
Image formation – sampling and quantization
Sampling
Up-sampling
Up-sampling and interpolation 
Down-sampling
Down-sampling and anti-aliasing
Quantization
Quantizing with PIL
Discrete Fourier Transform
Why do we need the DFT?
The Fast Fourier Transform algorithm to compute the DFT
The FFT with the scipy.fftpack module
Plotting the frequency spectrum
The FFT with the numpy.fft module
Computing the magnitude and phase of a DFT
Understanding convolution
Why convolve an image?
Convolution with SciPy signal's convolve2d
Applying convolution to a grayscale image
Convolution modes, pad values, and boundary conditions
Applying convolution to a color (RGB) image
Convolution with SciPy ndimage.convolve
Correlation versus convolution
Template matching with cross-correlation between the image and template
Summary
Questions
Further reading
Convolution and Frequency Domain Filtering
Convolution theorem and frequency domain Gaussian blur
Application of the convolution theorem
Frequency domain Gaussian blur filter with numpy fft
Gaussian kernel in the frequency domain
Frequency domain Gaussian blur filter with scipy signal.fftconvolve()
Comparing the runtimes of SciPy convolve() and fftconvolve() with the Gaussian blur kernel
Filtering in the frequency domain (HPF, LPF, BPF, and notch filters)
What is a filter?
High-Pass Filter (HPF)
How SNR changes with frequency cut-off
Low-pass filter (LPF)
LPF with scipy ndimage and numpy fft
LPF with fourier_gaussian
LPF with scipy fftpack
How SNR changes with frequency cutoff
Band-pass filter (BPF) with DoG
Band-stop (notch) filter
Using a notch filter to remove periodic noise from images
Image restoration
Deconvolution and inverse filtering with FFT
Image deconvolution with the Wiener filter
Image denoising with FFT
Filter in FFT
Reconstructing the final image
Summary
Questions
Further reading
Image Enhancement
Point-wise intensity transformations – pixel transformation
Log transform
Power-law transform
Contrast stretching
Using PIL as a point operation
Using the PIL ImageEnhance module
Thresholding
With a fixed threshold
Half-toning
Floyd-Steinberg dithering with error diffusion
Histogram processing – histogram equalization and matching
Contrast stretching and histogram equalization with scikit-image
Histogram matching
Histogram matching for an RGB image
Linear noise smoothing
Smoothing with PIL
Smoothing with ImageFilter.BLUR
Smoothing by averaging with the box blur kernel
Smoothing with the Gaussian blur filter
Comparing smoothing with box and Gaussian kernels using SciPy ndimage
Nonlinear noise smoothing
Smoothing with PIL
Using the median filter
Using max and min filter
Smoothing (denoising) with scikit-image
Using the bilateral filter
Using non-local means
Smoothing with scipy ndimage
Summary
Questions
Further reading
Image Enhancement Using Derivatives
Image derivatives – Gradient and Laplacian
Derivatives and gradients
Displaying the magnitude and the gradient on the same image
Laplacian
Some notes about the Laplacian
Effects of noise on gradient computation
Sharpening and unsharp masking
Sharpening with Laplacian
Unsharp masking
With the SciPy ndimage module
Edge detection using derivatives and filters (Sobel, Canny, and so on)
With gradient magnitude computed using the partial derivatives
The non-maximum suppression algorithm
Sobel edge detector with scikit-image
Different edge detectors with scikit-image – Prewitt, Roberts, Sobel, Scharr, and Laplace
The Canny edge detector with scikit-image
The LoG and DoG filters
The LoG filter with the SciPy ndimage module
Edge detection with the LoG filter
Edge detection with the Marr and Hildreth's algorithm using the zero-crossing computation
Finding and enhancing edges with PIL
Image pyramids (Gaussian and Laplacian) – blending images
A Gaussian pyramid with scikit-image transform pyramid module
A Laplacian pyramid with scikit-image transform's pyramid module
Constructing the Gaussian Pyramid
Reconstructing an image only from its Laplacian pyramid
Blending images with pyramids
Summary
Questions
Further reading
Morphological Image Processing
The scikit-image morphology module
Binary operations
Erosion
Dilation
Opening and closing
Skeletonizing
Computing the convex hull
Removing small objects
White and black top-hats
Extracting the boundary 
Fingerprint cleaning with opening and closing
Grayscale operations
The scikit-image filter.rank module
Morphological contrast enhancement
Noise removal with the median filter
Computing the local entropy
The SciPy ndimage.morphology module
Filling holes in binary objects
Using opening and closing to remove noise
Computing the morphological Beucher gradient
Computing the morphological Laplace
Summary
Questions
Further reading
Extracting Image Features and Descriptors
Feature detectors versus descriptors
Harris Corner Detector
With scikit-image
With sub-pixel accuracy
An application – image matching
Robust image matching using the RANSAC algorithm and Harris Corner features
Blob detectors with LoG, DoG, and DoH
Laplacian of Gaussian (LoG)
Difference of Gaussian (DoG)
Determinant of Hessian (DoH)
Histogram of Oriented Gradients
Algorithm to compute HOG descriptors
Compute HOG descriptors with scikit-image
Scale-invariant feature transform
Algorithm to compute SIFT descriptors
With opencv and opencv-contrib
Application – matching images with BRIEF, SIFT, and ORB
Matching images with BRIEF binary descriptors with scikit-image
Matching with ORB feature detector and binary descriptor using scikit-image
Matching with ORB features using brute-force matching with python-opencv
Brute-force matching with SIFT descriptors and ratio test with OpenCV
Haar-like features
Haar-like feature descriptor with scikit-image
Application – face detection with Haar-like features
Face/eye detection with OpenCV using pre-trained classifiers with Haar-cascade features
Summary
Questions
Further reading
Image Segmentation
What is image segmentation?
Hough transform – detecting lines and circles
Thresholding and Otsu's segmentation
Edges-based/region-based segmentation
Edge-based segmentation
Region-based segmentation
Morphological watershed algorithm
Felzenszwalb, SLIC, QuickShift, and Compact Watershed algorithms 
Felzenszwalb's efficient graph-based image segmentation
SLIC
RAG merging
QuickShift
Compact Watershed
Region growing with SimpleITK 
Active contours, morphological snakes, and GrabCut algorithms
Active contours
Morphological snakes
GrabCut with OpenCV
Summary
Questions
Further reading
Classical Machine Learning Methods in Image Processing
Supervised versus unsupervised learning
Unsupervised machine learning – clustering, PCA, and eigenfaces
K-means clustering for image segmentation with color quantization
Spectral clustering for image segmentation
PCA and eigenfaces 
Dimension reduction and visualization with PCA
2D projection and visualization
Eigenfaces with PCA
Eigenfaces
Reconstruction
Eigen decomposition
Supervised machine learning – image classification
Downloading the MNIST (handwritten digits) dataset
Visualizing the dataset
Training kNN, Gaussian Bayes, and SVM models to classify MNIST 
k-nearest neighbors (KNN) classifier
Squared Euclidean distance
Computing the nearest neighbors
Evaluating the performance of the classifier
Bayes classifier (Gaussian generative model)
Training the generative model – computing the MLE of the Gaussian parameters
Computing the posterior probabilities to make predictions on test data and model evaluation
SVM classifier
Supervised machine learning – object detection
Face detection with Haar-like features and cascade classifiers with AdaBoost – Viola-Jones
Face classification using the Haar-like feature descriptor
Finding the most important Haar-like features for face classification with the random forest ensemble classifier
Detecting objects with SVM using HOG features
HOG training
Classification with the SVM model
Computing BoundingBoxes with HOG-SVM
Non-max suppression
Summary
Questions
Further reading
Deep Learning in Image Processing - Image Classification
Deep learning in image processing
What is deep learning?
Classical versus deep learning
Why deep learning?
CNNs
Conv or pooling or FC layers – CNN architecture and how it works
Convolutional layer
Pooling layer
Non-linearity – ReLU layer
FC layer
Dropout
Image classification with TensorFlow or Keras
Classification with TF
Classification with dense FC layers with Keras
Visualizing the network
Visualizing the weights in the intermediate layers 
CNN for classification with Keras
Classifying MNIST
Visualizing the intermediate layers 
Some popular deep CNNs
VGG-16/19
Classifying cat/dog images with VGG-16 in Keras
Training phase
Testing (prediction) phase
InceptionNet
ResNet
Summary
Questions
Further reading
Deep Learning in Image Processing - Object Detection, and more
Introducing YOLO v2 
Classifying and localizing images and detecting objects
Proposing and detecting objects using CNNs
Using YOLO v2 
Using a pre-trained YOLO model for object detection
Deep semantic segmentation with DeepLab V3+
Semantic segmentation
DeepLab V3+
DeepLab v3 architecture
Steps you must follow to use DeepLab V3+ model for semantic segmentation
Transfer learning – what it is, and when to use it
Transfer learning with Keras
Neural style transfers with cv2 using a pre-trained torch model
Understanding the NST algorithm
Implementation of NST with transfer learning
Ensuring NST with content loss
Computing the style cost
Computing the overall loss
Neural style transfer with Python and OpenCV
Summary
Questions
Further reading
Additional Problems in Image Processing
Seam carving
Content-aware image resizing with seam carving
Object removal with seam carving
Seamless cloning and Poisson image editing
Image inpainting
Variational image processing
Total Variation Denoising
Creating flat-texture cartoonish images with total variation denoising
Image quilting
Texture synthesis
Texture transfer
Face morphing
Summary
Questions
Further reading
Other Books You May Enjoy
Leave a review - let other readers know what you think
This book covers how to solve image processing problems using popular Python image processing libraries (such as PIL, scikit-image, python-opencv, scipy ndimage, and SimpleITK), machine learning libraries (scikit-learn), and deep learning libraries (TensorFlow, Keras). It will enable the reader to write code snippets to implement complex image processing algorithms, such as image enhancement, filtering, restoration, segmentation, classification, and object detection. The reader will also be able to use machine learning and deep learning models to solve complex image processing problems.
The book will start with the basics and guide the reader to an advanced level by providing reproducible Python implementations throughout. It will start from the classical image processing techniques and follow the evolution of image processing algorithms through to the recent advances in image processing/computer vision with deep learning. Readers will learn how to use image processing libraries such as PIL, scikit-image, and scipy ndimage in Python, which will enable them to write code snippets in Python 3 and quickly implement complex image processing algorithms, such as image enhancement, filtering, segmentation, object detection, and classification. The reader will learn how to use machine learning models with the scikit-learn library and later explore deep CNNs such as VGG-19 with TensorFlow/Keras, use the end-to-end deep learning YOLO model for object detection, and use DeepLab V3+ for semantic segmentation as well as neural style transfer models. The reader will also learn to solve a few advanced problems, such as image inpainting, gradient blending, variational denoising, seam carving, quilting, and morphing. By the end of this book, the reader will have learned to implement various algorithms for efficient image processing.
This book follows a highly practical approach that will take its readers through a set of image processing concepts/algorithms and help them learn, in detail, how to use leading Python library functions to implement these algorithms.
The images used in this book as inputs and the outputs can be found at https://www.packtpub.com/sites/default/files/downloads/9781789343731_ColorImages.pdf.
This book is for engineers/applied researchers, and also for software engineers interested in computer vision, image processing, machine learning, and deep learning, especially for readers who are adept at Python programming and who want to explore various topics on image processing in detail and solve a range of complex problems, starting from concept through to implementation. A math and programming background, along with some basic knowledge of machine learning, are prerequisites.
Chapter 1, Getting Started with Image Processing, covers image processing and its applications, different Python libraries, image input/output, data structures, file formats, and basic image manipulations.
Chapter 2, Sampling, Fourier Transform, and Convolution, covers the 2D Fourier transform, sampling, quantization, the discrete Fourier transform, 1D and 2D convolution, and filtering in the frequency domain, and shows how to implement them with Python using examples. You will learn the simple signal processing tools that are needed in order to understand the following chapters.
Chapter 3, Convolution and Frequency Domain Filtering, demonstrates how convolution is carried out on images using Python. Topics such as filtering in the frequency domain are also covered.
Chapter 4, Image Enhancement, covers some of the most basic tools in image processing, such as mean/median filtering and histogram equalization, which are still among the most powerful. We will describe these and provide a modern interpretation of these basic tools.
Chapter 5, Image Enhancement using Derivatives, covers further topics associated with image enhancement, in other words, the problem of improving the appearance or usefulness of an image. Topics covered include edge detection with derivatives and Laplacian, sharpening, and pseudo coloring. All the concepts will be described with the help of examples involving Python.
Chapter 6, Morphological Image Processing, covers binary operations and the use of the filter.rank module to perform operations such as morphological contrast enhancement, noise removal, and computing local entropy. We will also see how the morphology module is used.
Chapter 7, Extracting Image Features and Descriptors, describes several techniques for extracting features from images and computing image descriptors.
Chapter 8, Image Segmentation, outlines the basic techniques for partitioning an image, from a simple threshold to more advanced graph cuts.
Chapter 9, Classical Machine Learning Methods in Image Processing, introduces a number of different machine learning techniques for image classification and object detection/recognition.
Chapter 10, Deep Learning in Image Processing – Image Classification, describes why the image processing/computer vision community gradually transitioned from the classical feature-based machine learning models to deep learning models.
Chapter 11, Deep Learning in Image Processing - Object Detection, and more, describes a number of remarkable applications of the CNNs for object detection, semantic segmentation, and image style transfer. A few popular models, such as YOLO and object proposals, will be demonstrated. How to use transfer learning to avoid learning a very deep neural net from scratch will also be outlined.
Chapter 12, Additional Problems in Image Processing, describes a number of additional image processing problems and various algorithms for solving them. Problems include seam carving (for content-aware image resizing), image quilting (for image resizing with non-parametric sampling and texture transfer), Poisson (gradient) image editing (blending) to seamlessly blend one image within another, image morphing (to transform one image to another), image inpainting (to restore a degraded image), and some variational image processing techniques (for image denoising, for example).
A basic knowledge of Python is required to run the code, along with access to image datasets and the GitHub link.
A basic math background is also needed to understand the concepts.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Image-Processing-with-Python. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789343731_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
As the name suggests, image processing can simply be defined as the processing (analyzing and manipulating) of images with algorithms in a computer (through code). It has a few different aspects, such as storage, representation, information extraction, manipulation, enhancement, restoration, and interpretation of images. In this chapter, we are going to give a basic introduction to all of these different aspects of image processing, along with an introduction to hands-on image processing with Python libraries. We are going to use Python 3 for all of the code samples in this book.
We will start by defining what image processing is and what its applications are. Then we will learn about the basic image processing pipeline, in other words, the general steps needed to process an image on a computer. Then, we will learn about different Python libraries available for image processing and how to install them in Python 3. Next, we will learn how to write Python code to read and write (store) images on a computer using different libraries. After that, we will learn the data structures that are used to represent an image in Python and how to display an image. We will also learn different image types and different image file formats, and, finally, how to do basic image manipulations in Python.
By the end of this chapter, we should be able to conceptualize image processing, its different steps, and its different applications. We should be able to import and call functions from different image processing libraries in Python. We should be able to understand the data structures used to store different types of images in Python, read/write image files using different Python libraries, and write Python code to do basic image manipulations. The topics to be covered in this chapter are as follows:
What image processing is and some image processing applications
The image processing pipeline
Setting up different image processing libraries in Python
Image I/O and display with Python
Image types, file formats, and basic image manipulations
Let's start by defining what an image is, how it is stored on a computer, and how we are going to process it with Python.
Conceptually, an image in its simplest form (single-channel; for example, binary or monochrome, grayscale or black-and-white images) is a two-dimensional function f(x,y) that maps a coordinate pair to an integer/real value, which is related to the intensity/color of the point. Each point is called a pixel or pel (picture element). An image can have multiple channels too (for example, colored RGB images, where a color is represented using three channels: red, green, and blue). For a colored RGB image, each pixel at the (x,y) coordinate can be represented by a three-tuple $(r_{x,y}, g_{x,y}, b_{x,y})$.
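The idea of an image as a function of pixel coordinates can be illustrated with a small sketch. It assumes NumPy arrays as the container (the array values and the helper name `f` are made up for illustration) and relies on the fact that NumPy indexes the row (y) before the column (x):

```python
import numpy as np

# A tiny grayscale image with 2 rows (height) and 3 columns (width).
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)


def f(image, x, y):
    """Return the value stored at column x, row y.

    Illustrative helper: NumPy arrays are indexed as [row, column],
    so the (x, y) coordinate pair maps to image[y, x].
    """
    return image[y, x]


# An RGB image on the same grid: three channel values per pixel.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[1, 2] = (255, 0, 0)  # make the pixel at x=2, y=1 pure red

gray_value = int(f(gray, x=2, y=1))           # a single intensity
r, g, b = (int(v) for v in f(rgb, x=2, y=1))  # a three-tuple (r, g, b)
print(gray_value, (r, g, b))
```

For the single-channel array the lookup yields one intensity value, while for the three-channel array the same lookup yields one value per channel.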
In order to be able to process it on a computer, an image f(x,y) needs to be digitized both spatially and in amplitude. Digitization of the spatial coordinates (x,y) is called image sampling. Amplitude digitization is called gray-level quantization. In a computer, a pixel value corresponding to a channel is generally represented as an integer value in the range 0-255 or a floating-point value in the range 0-1. An image is stored as a file, and there can be many different types (formats) of files. Each file generally has some metadata and some data that can be extracted as multi-dimensional arrays (for example, 2-D arrays for binary or gray-level images and 3-D arrays for RGB and YUV colored images). The following figure shows how the image data is stored as matrices for different types of image. As shown, for a grayscale image, a matrix (2-D array) of width x height suffices to store the image, whereas an RGB image requires a 3-D array of dimension width x height x 3:
The next figure shows example binary, grayscale, and RGB images:
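Sampling, quantization, and the array shapes described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the book's own code: the function name `sample_and_quantize`, the smooth test pattern, and the threshold of 128 are all hypothetical choices. Note that NumPy reports shapes with the number of rows (height) first:

```python
import numpy as np


def sample_and_quantize(width, height, levels=256):
    """Sample a smooth intensity pattern on a grid, then quantize it.

    Hypothetical helper; levels must not exceed 256 so that the
    result fits into an 8-bit unsigned integer.
    """
    # Sampling: evaluate the continuous function only at grid points.
    x = np.linspace(0.0, 1.0, width)
    y = np.linspace(0.0, 1.0, height)
    xx, yy = np.meshgrid(x, y)        # both arrays have shape (height, width)
    continuous = (xx + yy) / 2.0      # real-valued amplitudes in [0, 1]
    # Quantization: map real amplitudes to a finite set of integer levels.
    return np.round(continuous * (levels - 1)).astype(np.uint8)


gray = sample_and_quantize(width=4, height=3)  # integers in 0-255
as_float = gray.astype(np.float64) / 255.0     # same image as floats in 0-1
binary = gray >= 128                           # binary image as a boolean mask
rgb = np.stack([gray, gray, gray], axis=-1)    # three identical channels

print(gray.shape, rgb.shape)
print(gray.dtype, as_float.dtype, binary.dtype)
```

Here the grayscale and binary images are 2-D arrays of shape (3, 4), and the RGB image is a 3-D array of shape (3, 4, 3), with the channel axis last.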
In this book, we shall focus on processing image data and will use Python libraries to extract the data from images for us, as well as run different algorithms for different image processing tasks on the image data. Sample images are taken from the internet, from the Berkeley Segmentation Dataset and Benchmark (https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/BSDS300/html/dataset/images.html), and the USC-SIPI Image Database (http://sipi.usc.edu/database/), and many of them are standard images used for image processing.
Image processing refers to the automatic processing, manipulation, analysis, and interpretation of images using algorithms and code on a computer. It has applications in many disciplines and fields in science and technology such as television, photography, robotics, remote sensing, medical diagnosis, and industrial inspection. Social networking sites such as Facebook and Instagram, which we have grown used to in our daily lives and where we upload tons of images every day, are typical examples of the industries that need to use/innovate many image processing algorithms to process the images we upload.
In this book, we are going to use a few Python packages to process an image. First, we shall use a set of libraries to do classical image processing: right from extracting image data, transforming the data with some algorithms using library functions to pre-process, enhance, restore, represent (with descriptors), segment, classify, and detect and recognize (objects) to analyze, understand, and interpret the data better. Next, we shall use another set of libraries to do image processing based on deep learning, a technology that has become very popular in the last few years.
Some typical applications of image processing include medical/biological fields (for example, X-rays and CT scans), computational photography (Photoshop), fingerprint authentication, face recognition, and so on.
The basic steps in the image processing pipeline are as follows:
Acquisition and storage: The image needs to be captured (using a camera, for example) and stored on some device (such as a hard disk) as a file (for example, a JPEG file).
Load into memory and save to disk: The image needs to be read from the disk into memory and stored using some data structure (for example, numpy ndarray), and the data structure needs to be serialized into an image file later, possibly after running some algorithms on the image.
Manipulation, enhancement, and restoration: We need to run some pre-processing algorithms to do the following:
  Run a few transformations on the image (sampling and manipulation; for example, grayscale conversion)
  Enhance the quality of the image (filtering; for example, deblurring)
  Restore the image from noise degradation
Segmentation: The image needs to be segmented in order to extract the objects of interest.
Information extraction/representation: The image needs to be represented in some alternative form; for example, one of the following:
  Some hand-crafted feature-descriptor can be computed (for example, HOG descriptors, with classical image processing) from the image
  Some features can be automatically learned from the image (for example, the weights and bias values learned in the hidden layers of a neural net with deep learning)
  The image is going to be represented using that alternative representation
Image understanding/interpretation: This representation will be used to understand the image better with the following:
  Image classification (for example, whether an image contains a human object or not)
  Object recognition (for example, finding the location of the car objects in an image with a bounding box)
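The steps above can be sketched end to end on a tiny synthetic image using only NumPy. This is a toy illustration under stated assumptions, not the book's implementation: the synthetic scene, the 3 x 3 mean filter, the threshold of 0.5, and the area-based rule are all choices made for this sketch, and the disk I/O step is skipped by generating the image in memory.

```python
import numpy as np

# 1. Acquisition: a synthetic 32 x 32 RGB image stands in for a captured photo
rng = np.random.default_rng(0)
image = np.zeros((32, 32, 3), dtype=np.float64)
image[8:24, 10:20, :] = 0.9                    # a bright rectangular "object"
image += rng.normal(0.0, 0.05, image.shape)    # additive noise
image = np.clip(image, 0.0, 1.0)

# 2. Manipulation: RGB to grayscale using the common luma weights
gray = image @ np.array([0.299, 0.587, 0.114])

# 3. Enhancement/restoration: 3 x 3 mean filter to suppress the noise
padded = np.pad(gray, 1, mode="edge")
smooth = sum(
    padded[r:r + 32, c:c + 32] for r in range(3) for c in range(3)
) / 9.0

# 4. Segmentation: a global threshold separates object from background
mask = smooth > 0.5

# 5. Representation: simple hand-crafted features of the segmented region
rows, cols = np.nonzero(mask)
features = {
    "area": int(mask.sum()),
    "centroid": (float(rows.mean()), float(cols.mean())),
}

# 6. Understanding: a toy rule-based decision on the features
contains_object = features["area"] > 50

print(features, contains_object)
```

In practice, each stage would be replaced by a library routine (for example, a proper denoising filter or a learned classifier), but the flow of data from raw pixels to a decision stays the same.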
The following diagram describes the different steps in image processing:
The next figure represents different modules that we are going to use for different image processing tasks:
Apart from these libraries, we are going to use the following:
scipy.ndimage and opencv for different image processing tasks
scikit-learn for classical machine learning
tensorflow and keras for deep learning
The next few paragraphs describe how to install different image processing libraries and set up the environment for writing code to process images using classical image processing techniques in Python. In the last few chapters of this book, we will need to use a different setup when we use deep-learning-based methods.
We are going to use the pip (or pip3) tool to install the libraries, so we need to install pip first if it isn't already installed. As mentioned here (https://pip.pypa.io/en/stable/installing/#do-i-need-to-install-pip), pip is already installed if we are using Python 3 >=3.4 downloaded from python.org, or if we are working in a Virtual Environment (https://packaging.python.org/tutorials/installing-packages/#creating-and-using-virtual-environments) created by virtualenv (https://packaging.python.org/key_projects/#virtualenv) or pyvenv (https://packaging.python.org/key_projects/#venv). We just need to make sure to upgrade pip (https://pip.pypa.io/en/stable/installing/#upgrading-pip). Instructions for installing pip on different OSes or platforms can be found here: https://stackoverflow.com/questions/6587507/how-to-install-pip-with-python-3.
In Python, there are many libraries that we can use for image processing. The ones we are going to use are: NumPy, SciPy, scikit-image, PIL (Pillow), OpenCV, scikit-learn, SimpleITK, and Matplotlib.
The matplotlib library will primarily be used for display purposes, whereas numpy will be used for storing an image. The scikit-learn library will be used for building machine-learning models for image processing, and scipy will be used mainly for image enhancements. The scikit-image, mahotas, and opencv libraries will be used for different image processing algorithms.
The following code block shows how the libraries that we are going to use can be downloaded and installed with pip. Note that these commands are run from the operating system's command prompt (terminal), not from the Python interactive prompt:
pip install numpy
pip install scipy
pip install scikit-image
pip install scikit-learn
pip install pillow
pip install SimpleITK
pip install opencv-python
pip install matplotlib
There may be some additional installation instructions, depending on the OS platform you are going to use. We suggest that the reader go through the documentation site of each library to get detailed platform-specific installation instructions. For example, for the scikit-image library, detailed installation instructions for different OS platforms can be found here: http://scikit-image.org/docs/stable/install.html. Also, the reader should be familiar with websites such as Stack Overflow to resolve platform-dependent installation issues for different libraries.
Finally, we can verify whether a library is properly installed by importing it from the Python prompt. If the library is imported successfully (no error message is thrown), then we don't have any installation issue. We can also check which version is installed by printing the library's __version__ attribute to the console.
The following code block shows the versions for the scikit-image, PIL, and NumPy Python libraries:
>>> import skimage, PIL, numpy
>>> print(skimage.__version__)
# 0.14.0
>>> PIL.__version__
# 5.1.0
>>> numpy.__version__
# 1.14.5
Let us ensure that we have the latest versions of all of the libraries.
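The same check can be scripted for all of the libraries at once. The following is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical, and the list contains the import names (which differ from the pip package names for scikit-image, scikit-learn, Pillow, and opencv-python). Any failure during import is treated as "not installed" for the purposes of this sketch.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # any failure during import counts as "not usable" here
        return None
    return getattr(module, "__version__", "unknown")


# Import names of the libraries listed in this chapter
for name in ["numpy", "scipy", "skimage", "sklearn", "PIL",
             "SimpleITK", "cv2", "matplotlib"]:
    version = installed_version(name)
    print(f"{name:12s} {version if version else 'NOT INSTALLED'}")
```

A library reported as not installed can then be installed with pip as described earlier.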
We also recommend downloading and installing the latest version of the Anaconda distribution; this will eliminate the need for explicit installation of many Python packages.
We are going to use Jupyter notebooks to write our Python code. So, we need to install the jupyter package first from a command prompt with pip install jupyter, and then launch the Jupyter Notebook app in the browser using jupyter notebook. From there, we can create new Python notebooks and choose a kernel. If we use Anaconda, we do not need to install Jupyter explicitly; the latest Anaconda distribution comes with Jupyter.
We can even install a Python package from inside a notebook cell; for example, we can install scipy with the !pip install scipy command.
Images are stored as files on the disk, so reading and writing images from the files are disk I/O operations. These can be done in many ways using different libraries; some of them are shown in this section. Let us first start by importing all of the required packages:
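When several Python installations exist on one machine, a plain pip command may target a different interpreter than the one running the notebook. A hedged sketch of one way around this is to invoke pip as a module of the running interpreter; the helper name pip_install and its dry_run parameter are hypothetical and exist only for this illustration.

```python
import subprocess
import sys


def pip_install(package, dry_run=True):
    """Build (and optionally run) a pip command for the current interpreter."""
    command = [sys.executable, "-m", "pip", "install", package]
    if not dry_run:
        # Raises subprocess.CalledProcessError if the installation fails
        subprocess.check_call(command)
    return command


# Dry run: only shows the command that would be executed
print(" ".join(pip_install("scipy")))
```

Calling pip_install("scipy", dry_run=False) would actually run the installation for the interpreter that is executing the notebook.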
# for inline image display inside notebook
# % matplotlib inline
import numpy as np
from PIL import Image, ImageFont, ImageDraw
from PIL.ImageChops import add, subtract, multiply, difference, screen
import PIL.ImageStat as stat
from skimage.io import (imread, imsave, imshow, show,
                        imread_collection, imshow_collection)
from skimage import color, viewer, exposure, img_as_float, data
from skimage.transform import SimilarityTransform, warp, swirl
from skimage.util import invert, random_noise, montage
import matplotlib.image as mpimg
import matplotlib.pylab as plt
from scipy.ndimage import affine_transform, zoom
from scipy import misc
We recommend creating a folder (sub-directory) to store the images to be used for processing (for example, the Python code samples use images stored inside a folder named images) and then providing the path to that folder when accessing an image, to avoid a file-not-found exception.
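One way to follow this recommendation is to build every image path from a single folder variable and to fail early with a clear message when a file is missing. The sketch below uses only the standard library; the helper name image_path and the file name lena.jpg are hypothetical and serve only as an illustration.

```python
from pathlib import Path

IMAGE_DIR = Path("images")  # folder name used by the book's code samples


def image_path(filename, folder=IMAGE_DIR):
    """Return the path to an image file, failing early with a clear message."""
    path = Path(folder) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path.resolve()}")
    return path


# Usage: resolve a (hypothetical) file name before passing it to a reader
try:
    print(image_path("lena.jpg"))
except FileNotFoundError as error:
    print(error)
```

The returned path can be passed to any of the image-reading functions imported above, and the error message shows the absolute location that was checked.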
The imshow() function from Matplotlib provides many different types of interpolation methods to plot an image. These methods can be particularly useful when the image to be plotted is small. Let us use the small 50 x 50 lena
