Get to grips with traditional computer vision algorithms and deep learning approaches, and build real-world applications with OpenCV and other machine learning frameworks
Key Features
Book Description
OpenCV is a native cross-platform C++ library for computer vision, machine learning, and image processing, and it is increasingly being adopted for development in Python. This book will get you hands-on with a wide range of intermediate to advanced projects using the latest versions of the framework and language, OpenCV 4 and Python 3.8, instead of only covering the core concepts of OpenCV in theoretical lessons. This updated second edition will guide you through working on independent hands-on projects that focus on essential OpenCV concepts such as image processing, object detection, image manipulation, object tracking, and 3D scene reconstruction, in addition to statistical learning and neural networks.
You'll begin with concepts such as image filters, Kinect depth sensor, and feature matching. As you advance, you'll not only get hands-on with reconstructing and visualizing a scene in 3D but also learn to track visually salient objects. The book will help you further build on your skills by demonstrating how to recognize traffic signs and emotions on faces. Later, you'll understand how to align images, and detect and track objects using neural networks.
By the end of this OpenCV Python book, you'll have gained hands-on experience and become proficient at developing advanced computer vision apps according to specific business needs.
What you will learn
Who this book is for
This book is for intermediate-level OpenCV users who are looking to enhance their skills by developing advanced applications. Familiarity with OpenCV concepts and Python libraries, and basic knowledge of the Python programming language are assumed.
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Richa Tripathi
Acquisition Editor: Denim Pinto
Content Development Editor: Rosal Colaco
Senior Editor: Afshaan Khan
Technical Editor: Ketan Kamble
Copy Editor: Safis Editing
Project Coordinator: Francy Puthiry
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Production Designer: Aparna Bhagat
First published: October 2015
Second edition: March 2020
Production reference: 1190320
Published by Packt Publishing Ltd.
Livery Place, 35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78980-181-1
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Dr. Menua Gevorgyan is an experienced researcher with a demonstrated history of working in the information technology and services industry. He is skilled in computer vision, deep learning, machine learning, and data science, and has extensive experience with OpenCV and Python programming. He is interested in machine perception and machine understanding problems, and wonders whether it is possible to make a machine perceive the world as a human does.
Arsen Mamikonyan is an experienced machine learning specialist with demonstrated work experience in Silicon Valley and London, and teaching experience at the American University of Armenia. He is skilled in applied machine learning and data science and has built real-life applications using Python and OpenCV, among others. He holds a master's degree in engineering (MEng) with a concentration on artificial intelligence from the Massachusetts Institute of Technology.
Michael Beyeler is a postdoctoral fellow in neuroengineering and data science at the University of Washington, where he is working on computational models of bionic vision in order to improve the perceptual experience of blind patients implanted with a retinal prosthesis (bionic eye).
His work lies at the intersection of neuroscience, computer engineering, computer vision, and machine learning. He is also an active contributor to several open source software projects, and has professional programming experience in Python, C/C++, CUDA, MATLAB, and Android. Michael received a PhD in computer science from the University of California, Irvine, and an MSc in biomedical engineering and a BSc in electrical engineering from ETH Zurich, Switzerland.
Sri Manikanta Palakollu is an undergraduate student pursuing his bachelor's degree in computer science and engineering at SICET under JNTUH. He is a founder of the OpenStack Developer Community at his college.
He started his journey as a competitive programmer. He loves to solve problems related to the data science field. His interests include data science, app development, web development, cybersecurity, and technical writing. He has published many articles on data science, machine learning, programming, and cybersecurity with publications such as Hacker Noon, freeCodeCamp, Towards Data Science, and DDI.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
OpenCV 4 with Python Blueprints Second Edition
About Packt
Why subscribe?
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Code in Action
Download the color images
Conventions used
Get in touch
Reviews
Fun with Filters
Getting started
Planning the app
Creating a black-and-white pencil sketch
Understanding approaches for using dodging and burning techniques
Implementing a Gaussian blur with two-dimensional convolution
Applying pencil sketch transformation
Using an optimized version of a Gaussian blur
Generating a warming and cooling filter
Using color manipulation via curve shifting
Implementing a curve filter using lookup tables
Designing the warming and cooling effect
Cartoonizing an image
Using a bilateral filter for edge-aware smoothing
Detecting and emphasizing prominent edges
Combining colors and outlines to produce a cartoon
Putting it all together
Running the app
Mapping the GUI base class
Understanding the GUI constructor
Learning about a basic GUI layout
Handling video streams
Drafting a custom filter layout
Summary
Attributions
Hand Gesture Recognition Using a Kinect Depth Sensor
Getting started
Planning the app
Setting up the app
Accessing the Kinect 3D sensor
Utilizing OpenNI-compatible sensors
Running the app and main function routine
Tracking hand gestures in real time
Understanding hand region segmentation
Finding the most prominent depth of the image center region
Applying morphological closing for smoothening
Finding connected components in a segmentation mask
Performing hand shape analysis
Determining the contour of the segmented hand region
Finding the convex hull of a contour area
Finding the convexity defects of a convex hull
Performing hand gesture recognition
Distinguishing between different causes of convexity defects
Classifying hand gestures based on the number of extended fingers
Summary
Finding Objects via Feature Matching and Perspective Transforms
Getting started
Listing the tasks performed by the app
Planning the app
Setting up the app
Running the app – the main() function routine
Displaying results
Understanding the process flow
Learning feature extraction
Looking at feature detection
Detecting features in an image with SURF
Obtaining feature descriptors with SURF
Understanding feature matching
Matching features across images with FLANN
Testing the ratio for outlier removal
Visualizing feature matches
Mapping homography estimation
Warping the image
Learning feature tracking
Understanding early outlier detection and rejection
Seeing the algorithm in action
Summary
Attributions
3D Scene Reconstruction Using Structure from Motion
Getting started
Planning the app
Learning about camera calibration
Understanding the pinhole camera model
Estimating the intrinsic camera parameters
Defining the camera calibration GUI
Initializing the algorithm
Collecting image and object points
Finding the camera matrix
Setting up the app
Understanding the main routine function
Implementing the SceneReconstruction3D class
Estimating the camera motion from a pair of images
Applying point matching with rich feature descriptors
Using point matching with optic flow
Finding the camera matrices
Applying image rectification
Reconstructing the scene
Understanding 3D point cloud visualization
Learning about structure from motion
Summary
Using Computational Photography with OpenCV
Getting started
Planning the app
Understanding the 8-bit problem
Learning about RAW images
Using gamma correction
Understanding high-dynamic-range imaging
Exploring ways to vary exposure
Shutter speed
Aperture
ISO speed
Generating HDR images using multiple exposure images
Extracting exposure strength from images
Estimating the camera response function
Writing an HDR script using OpenCV
Displaying HDR images
Understanding panorama stitching
Writing script arguments and filtering images
Figuring out relative positions and the final picture size
Finding camera parameters
Creating the canvas for the panorama
Blending the images together
Improving panorama stitching
Summary
Further reading
Attributions
Tracking Visually Salient Objects
Getting started
Understanding visual saliency
Planning the app
Setting up the app
Implementing the main function 
Understanding the MultiObjectTracker class
Mapping visual saliency
Learning about Fourier analysis
Understanding the natural scene statistics
Generating a saliency map with the spectral residual approach
Detecting proto-objects in a scene
Understanding mean-shift tracking
Automatically tracking all players on a soccer field
Learning about the OpenCV Tracking API
Putting it all together
Summary
Dataset attribution
Learning to Recognize Traffic Signs
Getting started
Planning the app
Briefing on supervised learning concepts
The training procedure
The testing procedure
Understanding the GTSRB dataset
Parsing the dataset
Learning about dataset feature extraction
Understanding common preprocessing
Learning about grayscale features
Understanding color spaces
Using SURF descriptor
Mapping HOG descriptor
Learning about SVMs
Using SVMs for multiclass classification
Training the SVM
Testing the SVM
Accuracy
Confusion matrix
Precision
Recall
Putting it all together
Improving results with neural networks
Summary
Dataset attribution
Learning to Recognize Facial Emotions
Getting started
Planning the app
Learning about face detection
Learning about Haar-based cascade classifiers
Understanding pre-trained cascade classifiers
Using a pre-trained cascade classifier
Understanding the FaceDetector class
Detecting faces in grayscale images
Preprocessing detected faces
Detecting the eyes
Transforming the face
Collecting data
Assembling a training dataset
Running the application
Implementing the data collector GUI
Augmenting the basic layout
Processing the current frame
Storing the data
Understanding facial emotion recognition
Processing the dataset
Learning about PCA
Understanding MLPs
Understanding a perceptron
Knowing about deep architectures
Crafting an MLP for facial expression recognition
Training the MLP
Testing the MLP
Running the script
Putting it all together
Summary
Further reading
Attributions
Learning to Classify and Localize Objects
Getting started
Planning the app
Preparing an inference script
Preparing the dataset
Downloading and parsing the dataset
Creating a TensorFlow dataset 
Classifying with CNNs
Understanding CNNs
Learning about transfer learning
Preparing the pet type and breed classifier
Training and evaluating the classifier
Localizing with CNNs
Preparing the model
Understanding backpropagation
Training the model
Seeing inference in action
Summary
Dataset attribution
Learning to Detect and Track Objects
Getting started
Planning the app
Preparing the main script 
Detecting objects with SSD
Using other detectors
Understanding object detectors
The single-object detector
The sliding-window approach
Single-pass detectors
Learning about Intersection over Union
Training SSD- and YOLO-like networks 
Tracking detected objects
Implementing a Sort tracker
Understanding the Kalman filter
Using a box tracker with the Kalman filter
Converting boundary boxes to observations
Implementing a Kalman filter
Associating detections with trackers
Defining the main class of the tracker
Seeing the app in action
Summary
Profiling and Accelerating Your Apps
Accelerating with Numba
Accelerating with the CPU
Understanding Numba, CUDA, and GPU acceleration
Setting Up a Docker Container
Defining a Dockerfile
Working with a GPU
Other Books You May Enjoy
Leave a review - let other readers know what you think
The goal of this book is to get you hands-on with a wide range of intermediate to advanced projects using the latest version of the OpenCV 4 framework and the Python 3.8 language instead of only covering the core concepts of computer vision in theoretical lessons.
This updated second edition has increased the depth of the concepts we tackle with OpenCV. It will guide you through working on independent hands-on projects that focus on essential computer vision concepts such as image processing, 3D scene reconstruction, object detection, and object tracking. It will also cover, with real-life examples, statistical learning and deep neural networks.
You will begin by understanding concepts such as image filters and feature matching, as well as using custom sensors such as the Kinect depth sensor. You will also learn how to reconstruct and visualize a scene in 3D, how to align images, and how to combine multiple images into a single one. As you advance through the book, you will learn how to recognize traffic signs and emotions on faces and detect and track objects in video streams using neural networks, even if they disappear for short periods of time.
By the end of this OpenCV and Python book, you will have hands-on experience and be proficient at developing your own advanced computer vision applications according to specific business needs. Throughout the book, you will explore multiple machine learning and computer vision models, such as Support Vector Machines (SVMs) and convolutional neural networks.
This book is aimed at computer vision enthusiasts who want to master their skills by developing advanced practical applications using OpenCV and other machine learning libraries.
Basic programming skills and Python programming knowledge are assumed.
Chapter 1, Fun with Filters, explores a number of interesting image filters (such as a black-and-white pencil sketch, warming/cooling filters, and a cartoonizer effect), and we'll apply them to the video stream of a webcam in real time.
Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, helps you develop an app to detect and track simple hand gestures in real time using the output of a depth sensor, such as Microsoft Kinect 3D Sensor or Asus Xtion.
Chapter 3, Finding Objects via Feature Matching and Perspective Transforms, helps you develop an app to detect an arbitrary object of interest in the video stream of a webcam, even if the object is viewed from different angles or distances, or under partial occlusion.
Chapter 4, 3D Scene Reconstruction Using Structure from Motion, shows you how to reconstruct and visualize a scene in 3D by inferring its geometrical features from camera motion.
Chapter 5, Using Computational Photography with OpenCV, helps you develop command-line scripts that take images as input and produce panoramas or High Dynamic Range (HDR) images. The scripts will either align the images so that there is a pixel-to-pixel correspondence, or stitch them into a panorama, which is an interesting application of image alignment. In a panorama, the images show a 3D scene rather than a plane, and aligning images of a 3D scene generally requires depth information. However, when the images are taken by rotating the camera about its optical axis, as is the case for panoramas, they can be aligned without it.
Chapter 6, Tracking Visually Salient Objects, helps you develop an app to track multiple visually salient objects in a video sequence (such as all the players on the field during a soccer match) at once.
Chapter 7, Learning to Recognize Traffic Signs, shows you how to train a support vector machine to recognize traffic signs from the German Traffic Sign Recognition Benchmark (GTSRB) dataset.
Chapter 8, Learning to Recognize Facial Emotions, helps you develop an app that is able to both detect faces and recognize their emotional expressions in the video stream of a webcam in real time.
Chapter 9, Learning to Classify and Localize Objects, walks you through developing an app for real-time object classification with deep convolutional neural networks. You will modify a classifier network to train on a custom dataset with custom classes, and learn how to train a Keras model on a dataset, serialize it, and save it to disk. You will then see how to classify new input images using your loaded Keras model, and train a convolutional neural network on your own image data to obtain a classifier with high accuracy.
Chapter 10, Learning to Detect and Track Objects, guides you as you develop an app for real-time object detection with deep neural networks, connecting it to a tracker. You will learn how object detectors work and how they are trained. You will implement a Kalman filter-based tracker, which will use object position and velocity to predict where it is likely to be. After completing this chapter, you will be able to build your own real-time object detection and tracking applications.
Appendix A, Profiling and Accelerating Your Apps, covers how to find bottlenecks in an app and achieve CPU- and CUDA-based GPU acceleration of existing code with Numba.
Appendix B, Setting Up a Docker Container, walks you through replicating the environment that we have used to run the code in this book.
All of our code uses Python 3.8, which is available for a variety of operating systems, such as Windows, GNU/Linux, and macOS. We have made an effort to use only libraries that are available on these three operating systems. We will go over the exact versions of each of the dependencies we have used, which can be installed using pip (Python's dependency management system). If you have trouble getting any of these to work, we have Dockerfiles available with which we have tested all the code in this book; we cover these in Appendix B, Setting Up a Docker Container.
Here is a list of dependencies that we have used, with the chapters they were used in:
Software required | Version | Chapter number | Download links to the software
Python | 3.8 | All | https://www.python.org/downloads/
OpenCV | 4.2 | All | https://opencv.org/releases/
NumPy | 1.18.1 | All | http://www.scipy.org/scipylib/download.html
wxPython | 4.0 | 1, 4, 8 | http://www.wxpython.org/download.php
matplotlib | 3.1 | 4, 5, 6, 7 | http://matplotlib.org/downloads.html
SciPy | 1.4 | 1, 10 | http://www.scipy.org/scipylib/download.html
rawpy | 0.14 | 5 | https://pypi.org/project/rawpy/
ExifRead | 2.1.2 | 5 | https://pypi.org/project/ExifRead/
TensorFlow | 2.0 | 7, 9 | https://www.tensorflow.org/install
To run the code, you will need a regular laptop or personal computer (PC). Some chapters require a webcam, which can be either an embedded laptop camera or an external one. Chapter 2, Hand Gesture Recognition Using a Kinect Depth Sensor, also requires a depth sensor: either a Microsoft Kinect 3D sensor or any other sensor supported by the libfreenect library or OpenCV, such as the Asus Xtion.
We have tested the code using Python 3.8 and Python 3.7, on Ubuntu 18.04.
If you already have Python on your computer, you can just get going with running the following on your terminal:
$ pip install -r requirements.txt
Here, requirements.txt is provided in the GitHub repository of the project and lists the dependencies from the preceding table:
wxPython==4.0.5
numpy==1.18.1
scipy==1.4.1
matplotlib==3.1.2
requests==2.22.0
opencv-contrib-python==4.2.0.32
opencv-python==4.2.0.32
rawpy==0.14.0
ExifRead==2.1.2
tensorflow==2.0.1
Alternatively, you can follow the instructions in Appendix B, Setting Up a Docker Container, to get everything working with a Docker container.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/OpenCV-4-with-Python-Blueprints-Second-Edition. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Code in Action videos for this book can be viewed at http://bit.ly/2xcjKdS.
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://static.packt-cdn.com/downloads/9781789801811_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "We will use argparse as we want our script to accept arguments."
A block of code is set as follows:
import argparse
import cv2
import numpy as np
from classes import CLASSES_90
from sort import Sort
Any command-line input or output is written as follows:
$ python chapter8.py collect
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
The goal of this chapter is to develop a number of image processing filters and then apply them to the video stream of a webcam in real time. These filters will rely on various OpenCV functions to manipulate matrices through splitting, merging, arithmetic operations, and applying lookup tables for complex functions.
We will cover the following three effects, which will help familiarize you with OpenCV, and we will build on these effects in future chapters of this book:
Warming and cooling filters: We will implement our own curve filters using a lookup table.
Black-and-white pencil sketch: We will make use of two image-blending techniques, known as dodging and burning.
Cartoonizer: We will combine a bilateral filter, a median filter, and adaptive thresholding.
OpenCV is an advanced toolchain, so the question is often not how to implement something from scratch, but which precanned implementation to choose for your needs. Generating complex effects is not hard if you have a lot of computing resources to spare. The challenge usually lies in finding an approach that not only gets the job done, but also gets it done in time.
Instead of teaching the basic concepts of image manipulation through theoretical lessons, we will take a practical approach and develop a single end-to-end app that integrates a number of image filtering techniques. We will apply our theoretical knowledge to arrive at a solution that not only works but also speeds up seemingly complex effects so that a laptop can produce them in real time.
In this chapter, you will learn how to do the following using OpenCV:
Creating a black-and-white pencil sketch
Applying pencil sketch transformation
Generating a warming and cooling filter
Cartoonizing an image
Putting it all together
Learning this will familiarize you with loading images into OpenCV and applying different transformations to them. This chapter will help you learn the basics of how OpenCV operates, so that we can focus on the internals of the algorithms in the following chapters.
Now, let's take a look at how to get everything up and running.
All of the code in this book is targeted for OpenCV 4.2 and has been tested on Ubuntu 18.04. Throughout this book, we will make extensive use of the NumPy package (http://www.numpy.org).
Additionally, this chapter requires the UnivariateSpline module of the SciPy package (http://www.scipy.org) and wxPython 4.0 (http://www.wxpython.org/download.php), a cross-platform Graphical User Interface (GUI) toolkit. We will try to avoid further dependencies where possible.
For the remaining book-level dependencies, see Appendix A, Profiling and Accelerating Your Apps, and Appendix B, Setting Up a Docker Container.
You can find the code that we present in this chapter at our GitHub repository here: https://github.com/PacktPublishing/OpenCV-4-with-Python-Blueprints-Second-Edition/tree/master/chapter1.
Let's begin by planning the application we are going to create in this chapter.
The final app must consist of the following modules and scripts:
wx_gui.py: This module is our implementation of a basic GUI using wxPython. We will make extensive use of this file throughout the book. This module includes the following layouts:
wx_gui.BaseLayout: This is a generic layout class from which more complicated layouts can be built.
chapter1.py: This is the main script for this chapter. It contains the following functions and classes:
chapter1.FilterLayout: This is a custom layout based on wx_gui.BaseLayout, which displays the camera feed and a row of radio buttons that allows the user to select from the available image filters to be applied to each frame of the camera feed.
chapter1.main: This is the main routine function for starting the GUI application and accessing the webcam.
tools.py: This is a Python module and has a lot of helper functions that we use in this chapter, which you can reuse for your projects.
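To give a feel for how these pieces fit together, here is a plausible shape for chapter1.main. This is only a sketch: the wxPython startup pattern and the OpenCV webcam access are standard, but in chapter1.py the frame would be the FilterLayout class described above, whose constructor we have not seen yet; a plain wx.Frame stands in for it here so the snippet runs on its own.

import cv2
import wx

def main():
    # Access the webcam via OpenCV (device 0 is the default camera).
    capture = cv2.VideoCapture(0)
    if not capture.isOpened():
        raise RuntimeError('Could not open the webcam')

    # Standard wxPython startup: create the app object, show the main
    # frame, and hand control over to the event loop. In the real
    # script, FilterLayout (built on wx_gui.BaseLayout) replaces the
    # plain wx.Frame used here as a stand-in.
    app = wx.App()
    frame = wx.Frame(None, title='Fun with Filters')
    frame.Show(True)
    app.MainLoop()
    capture.release()

if __name__ == '__main__':
    main()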
The next section demonstrates how to create a black-and-white pencil sketch.
In order to obtain a pencil sketch (that is, a black-and-white drawing) of the camera frame, we will make use of two image-blending techniques, known as dodging and burning. These terms refer to techniques employed during the printing process in traditional photography; here, photographers would manipulate the exposure time of a certain area of a darkroom print in order to lighten or darken it. Dodging lightens an image, whereas burning darkens it. Areas that were not supposed to undergo changes were protected with a mask.
Today, modern image editing programs, such as Photoshop and Gimp, offer ways to mimic these effects in digital images. For example, masks are still used to mimic the effect of changing the exposure time of an image, wherein areas of a mask with relatively intense values will expose the image more, thus lightening the image. OpenCV does not offer a native function to implement these techniques; however, with a little insight and a few tricks, we will arrive at our own efficient implementation that can be used to produce a beautiful pencil sketch effect.
If you search on the internet, you might stumble upon the following common procedure to achieve a pencil sketch from an RGB (red, green, and blue) color image:
1. First, convert the color image to grayscale.
2. Then, invert the grayscale image to get a negative.
3. Apply a Gaussian blur to the negative from step 2.
4. Blend the grayscale image (from step 1) with the blurred negative (from step 3) by using color dodge.
Whereas steps 1 to 3 are straightforward, step 4 can be a little tricky. Let's get that one out of the way first.
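For orientation, here is a minimal sketch of the whole recipe, assuming an RGB input image as in the steps above. The (21, 21) kernel size is an illustrative choice, and using cv2.divide to approximate the color dodge blend is one common trick, not necessarily the final form we will settle on:

import cv2

def pencil_sketch(rgb_image):
    # Step 1: convert the color image to grayscale.
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2GRAY)
    # Step 2: invert the grayscale image to get a negative.
    negative = 255 - gray
    # Step 3: apply a Gaussian blur to the negative.
    blurred = cv2.GaussianBlur(negative, (21, 21), 0)
    # Step 4: color dodge blends the grayscale image with the blurred
    # negative: result = gray * 255 / (255 - blurred). cv2.divide with
    # scale=256 approximates this and saturates at 255.
    return cv2.divide(gray, 255 - blurred, scale=256)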
The next section shows you how to implement dodging and burning in OpenCV.
A Gaussian blur is implemented by convolving the image with a kernel of Gaussian values. Two-dimensional convolution is used very widely in image processing. Usually, we have a big picture (let's look at a 5 x 5 subsection of that particular image), and we have a kernel (or filter), which is another matrix of a smaller size (in our example, 3 x 3).
In order to get the convolution values, let's suppose that we want to get the value at location (2, 3). We place the kernel centered at location (2, 3), calculate the pointwise product of the overlaid matrix (the highlighted area, shown in red in the following image) with the kernel, and take the overall sum. The resulting value (that is, 158.4) is the value we write into the other matrix at location (2, 3).
We repeat this process for all elements, and the resulting matrix (the matrix on the right) is the convolution of the kernel with the image. In the following diagram, on the left, you can see the original image with the pixel values in the boxes (values higher than 100). We also see an orange filter with values in the bottom right of each cell (a collection of 0.1 or 0.2 that sum to 1). In the matrix on the right, you see the values when the filter is applied to the image on the left:
Note that, for points on the boundaries, the kernel is not aligned with the matrix, so we have to figure out a strategy to give values for those points. There is no single good strategy that works for everything; some of the approaches are to either extend the border with zeros or extend with border values.
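To make the multiply-and-sum concrete, here is a small sketch using a 3 x 3 kernel whose entries sum to 1 (a mix of 0.1 and 0.2 values, like the filter described above). cv2.filter2D performs the same operation over the whole image, and its borderType argument selects one of the boundary strategies just mentioned. The filename is a placeholder:

import cv2
import numpy as np

# A 3 x 3 averaging-style kernel; the nine entries sum to 1,
# so the overall brightness of the image is preserved.
kernel = np.array([[0.1, 0.1, 0.1],
                   [0.1, 0.2, 0.1],
                   [0.1, 0.1, 0.1]], dtype=np.float32)

img = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Manual convolution value at a single location (row, col): the
# pointwise product of the 3 x 3 patch with the kernel, then the sum.
row, col = 2, 3
patch = img[row - 1:row + 2, col - 1:col + 2]
value = np.sum(patch * kernel)

# cv2.filter2D repeats this for every pixel; borderType chooses the
# strategy for boundary pixels (here, replicating the border values).
blurred = cv2.filter2D(img, -1, kernel, borderType=cv2.BORDER_REPLICATE)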
Let's take a look at how to transform a normal picture into a pencil sketch.
When we perceive images, our brain picks up on a number of subtle clues to infer important details about the scene. For example, in broad daylight, highlights may have a slightly yellowish tint because they are in direct sunlight, whereas shadows may appear slightly bluish because of the ambient light of the blue sky. When we view an image with such color properties, we might immediately think of a sunny day.
This effect is not a mystery to photographers, who sometimes purposely manipulate the white balance of an image to convey a certain mood. Warm colors are generally perceived as more pleasant, whereas cool colors are associated with night and drabness.
To manipulate the perceived color temperature of an image, we will implement a curve filter. These filters control how color transitions appear between different regions of an image, allowing us to subtly shift the color spectrum without adding an unnatural-looking overall tint to the image.
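As a preview of what follows, one common way to realize such a curve filter is to fit a smooth spline through a handful of hand-picked control points and bake the curve into a 256-entry lookup table, which cv2.LUT can then apply to a channel in a single call. The control points and the filename below are illustrative, not the exact values we will settle on:

import cv2
import numpy as np
from scipy.interpolate import UnivariateSpline

def create_lut(x, y):
    # Fit a smooth curve through the control points (x, y) and
    # sample it at all 256 possible 8-bit intensity values.
    spline = UnivariateSpline(x, y)
    return np.clip(spline(np.arange(256)), 0, 255).astype(np.uint8)

# Illustrative control points: lift the mid-tones, pin the endpoints.
incr_lut = create_lut([0, 64, 128, 192, 256], [0, 70, 140, 210, 256])

img = cv2.imread('input.jpg')          # placeholder filename
b, g, r = cv2.split(img)
r = cv2.LUT(r, incr_lut)               # warming: boost the red channel
warmer = cv2.merge((b, g, r))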
In the next section, we'll look at how to manipulate color using curve shifting.
Over the past few years, professional cartoonizer software has popped up all over the place. In order to achieve a basic cartoon effect, all we need is a bilateral filter and some edge detection.
The bilateral filter will reduce the color palette, that is, the number of colors that are used in the image. This mimics a cartoon drawing, wherein a cartoonist typically has few colors to work with. Then, we can apply edge detection to the resulting image to generate bold silhouettes. The real challenge, however, lies in the computational cost of bilateral filters. We will, therefore, use some tricks to produce an acceptable cartoon effect in real time.
We will adhere to the following procedure to transform an RGB color image into a cartoon:
1. First, apply a bilateral filter to reduce the color palette of the image.
2. Then, convert the original color image into grayscale.
3. After that, apply a median blur to reduce image noise.
4. Use adaptive thresholding to detect and emphasize the edges in an edge mask.
5. Finally, combine the color image from step 1 with the edge mask from step 4.
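Putting the five steps together, a minimal sketch of the pipeline might look as follows. The parameter values are illustrative starting points, not the tuned values (or the speed tricks) we will develop in the coming sections:

import cv2

def cartoonize(bgr_frame, num_bilateral=5):
    # Step 1: repeatedly apply a small bilateral filter (cheaper than
    # one large one) to flatten the color palette while keeping edges.
    color = bgr_frame.copy()
    for _ in range(num_bilateral):
        color = cv2.bilateralFilter(color, d=9, sigmaColor=9, sigmaSpace=7)
    # Step 2: convert the original color image into grayscale.
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    # Step 3: apply a median blur to reduce image noise.
    blurred = cv2.medianBlur(gray, 7)
    # Step 4: adaptive thresholding yields a bold edge mask.
    edges = cv2.adaptiveThreshold(blurred, 255,
                                  cv2.ADAPTIVE_THRESH_MEAN_C,
                                  cv2.THRESH_BINARY,
                                  blockSize=9, C=2)
    # Step 5: combine the reduced-palette image with the edge mask.
    return cv2.bitwise_and(color, color, mask=edges)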
In the upcoming sections, we will learn about the previously mentioned steps in detail. First, we'll learn how to use a bilateral filter for edge-aware smoothing.
