Create image processing, object detection and face recognition apps by leveraging the power of machine learning and deep learning with OpenCV 4 and Qt 5
Book Description
OpenCV and Qt have proven to be a winning combination for developing cross-platform computer vision applications. By leveraging their power, you can create robust applications with both an intuitive graphical user interface (GUI) and high-performance capabilities. This book will help you learn through a variety of real-world projects on image processing, face and text recognition, object detection, and high-performance computing. You'll be able to progressively build on your skills by working on projects of increasing complexity.
You'll begin by creating an image viewer application, building a user interface from scratch by adding menus, performing actions based on key-presses, and applying other functions. As you progress, the book will guide you through using OpenCV image processing and modification functions to edit an image with filters and transformation features. In addition to this, you'll explore the complex motion analysis and facial landmark detection algorithms, which you can use to build security and face detection applications. Finally, you'll learn to use pretrained deep learning models in OpenCV and GPUs to filter images quickly.
By the end of this book, you will have learned how to effectively develop full-fledged computer vision applications with OpenCV and Qt.
Who this book is for
This book is for engineers and developers who are familiar with both the Qt and OpenCV frameworks and can create simple projects with them, but want to build their skills to create professional-level projects. Familiarity with the C++ language is a must to follow the example source code in this book.
Page count: 395
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Richa Tripathi
Acquisition Editor: Sandeep Mishra
Content Development Editor: Digvijay Bagul
Senior Editor: Afshaan Khan
Technical Editor: Abin Sebastian
Copy Editor: Safis Editing
Project Coordinator: Carol Lewis
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designers: Aparna Bhagat, Alishon Mendonsa
First published: June 2019
Production reference: 1190619
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78953-258-6
www.packtpub.com
To the one who keeps me motivated.
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Zhuo Qingliang (also known as KDr2 online) is presently working at Beijing Paoding Technology Co. Ltd., a start-up Fintech company in China that is dedicated to improving the financial industry by using artificial intelligence technologies. He has over 10 years’ experience in Linux, C, C++, Python, Perl, and Java development. He is interested in programming, doing consulting work, and participating in and contributing to the open source community (including, of course, the Julia community).
Christian Stehno studied computer science, receiving his diploma from Oldenburg University, Germany. Since 2000, he has worked in computer science, first as a researcher in theoretical computer science at Oldenburg University, later switching to embedded system design at the OFFIS institute. In 2010, he started his own company, CoSynth, which develops embedded sensor and camera systems for industrial automation. In addition, he is a long-time member of the Irrlicht 3D engine developer team.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Qt 5 and OpenCV 4 Computer Vision Projects
Dedication
About Packt
Why subscribe?
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Code in Action
Conventions used
Get in touch
Reviews
Building an Image Viewer
Technical requirements
Designing the user interface
Starting the project from scratch
Setting up the full user interface
Implementing the functions for the actions
The Exit action
Opening an image
Zooming in and out
Saving a copy
Navigating in the folder
Responding to hotkeys
Summary
Questions
Editing Images Like a Pro
Technical requirements
The ImageEditor application
Blurring images using OpenCV
Adding the blur action
Building and installing OpenCV from the source
Blurring images
QPixmap, QImage, and Mat
QPixmap
QImage
Mat
Adding features using Qt's plugin mechanism
The plugin interface
Eroding images with ErodePlugin
Loading the plugin into our application
Editing images like a pro
Sharpening images
Cartoon effect
Rotating images
Affine transformation
Summary
Questions
Home Security Applications
Technical requirements
The Gazer application
Starting the project and setting up the UI
Accessing cameras
Listing cameras with Qt
Capturing and playing
Threading and the performance of real-time video processing
Capturing and playing with Qt
Calculating the FPS
Saving videos
Motion analysis with OpenCV
Motion detection with OpenCV
Sending notifications to our mobile phone
Summary
Questions
Fun with Faces
Technical requirements
The Facetious application
From Gazer to Facetious
Taking photos
Detecting faces using cascade classifiers
Detecting facial landmarks
Applying masks to faces
Loading images with the Qt resource system
Drawing masks on the faces
Selecting masks on the UI
Summary
Questions
Optical Character Recognition
Technical requirements
Creating Literacy
Designing the UI
Setting up the UI
OCR with Tesseract
Building Tesseract from the source
Recognizing characters in Literacy
Detecting text areas with OpenCV
Recognizing characters on the screen
Summary
Questions
Object Detection in Real Time
Technical requirements
Detecting objects using OpenCV
Detecting objects using a cascade classifier
Training a cascade classifier
The no-entry traffic sign
The faces of Boston Bulls
Detecting objects using deep learning models
About real time
Summary
Questions
Real-Time Car Detection and Distance Measurement
Technical requirements
Car detection in real time
Distance measurement
Measuring the distance between cars or between the car and the camera
Measuring the distance between cars in a bird's eye view
Measuring the distance between a car and the camera in the eye-level view
Switching between view modes
Summary
Questions
Using OpenGL for the High-Speed Filtering of Images
Technical requirements
Hello OpenGL
OpenGL in Qt
Filtering images with OpenGL
Drawing images with OpenGL
Filtering images in the fragment shader
Saving the filtered images
Using OpenGL with OpenCV
Summary
Further reading
Assessments
Chapter 1, Building an Image Viewer
Chapter 2, Editing Images Like a Pro
Chapter 3, Home Security Applications
Chapter 4, Fun with Faces
Chapter 5, Optical Character Recognition
Chapter 6, Object Detection in Real Time
Chapter 7, Real-Time Car Detection and Distance Measurement
Other Books You May Enjoy
Leave a review - let other readers know what you think
We are entering the age of intelligence. Today, more and more digital devices and applications deliver features facilitated by artificial intelligence technology. Computer vision technology is an important part of artificial intelligence technology, while the OpenCV library is one of the most comprehensive and mature libraries for computer vision technology. OpenCV goes beyond traditional computer vision technology; it incorporates many other technologies, such as DNN, CUDA, OpenGL, and OpenCL, and is evolving into a more powerful library. But, at the same time, its GUI facility, which is not the main feature of the library, isn't evolving very much. Meanwhile, among the many GUI development libraries and frameworks, there is one that is the best at crossing platforms, has the best ease of use, and has the greatest widget diversity—the Qt library. The goal of this book is to combine OpenCV and Qt to develop fully-fledged applications providing many interesting features.
This book is a practical guide to the OpenCV library and GUI application development with Qt. We'll develop a complete application in each chapter. In each of those applications, a number of computer vision algorithms, Qt widgets, and other facilities will be covered, and a well-designed user interface with functional features will be created.
This book is the result of months of hard work, and it would not have been possible without the invaluable help of the Packt team and the technical reviewer.
This book is designed for all those developers who want to know how to use the OpenCV library to process images and videos, for those who want to learn GUI development with Qt, for those who want to know how to use deep learning in the computer vision domain, and for those who are interested in developing fully-fledged computer vision applications.
Chapter 1, Building an Image Viewer, covers building our first application with Qt. We will build an image viewer, with which we can browse images in a folder. We'll also be able to zoom in or out of the image while viewing it.
Chapter 2, Editing Images Like a Pro, combines the Qt library and the OpenCV library to build a new application, an image editor. In this chapter, we will start by blurring an image to learn how to edit an image. Then, we will learn how to use many other editing effects, such as eroding, sharpening, cartoon effects, and geometric transformation. Each of these features will be incorporated as a Qt plugin, so the plugin mechanism of the Qt library will also be covered.
Chapter 3, Home Security Applications, covers building an application for home security. With a webcam, this application can detect motion and send notifications to a mobile phone upon motion being detected. We will learn how to deal with cameras and videos, how to analyze motion and detect movement with OpenCV, and how to send notifications via IFTTT in this chapter.
Chapter 4, Fun with Faces, explores how to detect faces and facial landmarks with OpenCV. We will build an application to detect faces and facial landmarks in the video in real time in this chapter, and, with the facial landmarks detected, we will apply some funny masks to the faces.
Chapter 5, Optical Character Recognition, introduces the Tesseract library to you. With the help of this library, we will extract text from images such as photos of book pages and scanned documents. In order to extract text from photos of common scenes, we will use a deep learning model named EAST to detect the text areas in photos, and then pass those areas to the Tesseract library. In order to extract text on the screen conveniently, we will also learn how to grab the screen as an image with the Qt library.
Chapter 6, Object Detection in Real Time, shows how to use cascade classifiers to detect objects. Besides using pretrained classifiers, we will also learn how to train classifiers by ourselves. Then, we will introduce how to detect objects by using deep learning models, and a model named YOLOv3 will be used to demonstrate the usage of this approach.
Chapter 7, Real-Time Car Detection and Distance Measurement, covers creating an application to detect cars and measure distances. In the application, we will learn how to measure distances between objects from a bird's eye view and how to measure distances between objects and the camera at eye level view.
Chapter 8, Using OpenGL for High-Speed Filtering of Images, the final chapter of the book, introduces an approach to heterogeneous computing. In this chapter, we start with a brief introduction to the OpenGL specification, and then use it to filter images on the GPU. This is neither a typical use of OpenGL nor a typical way to do heterogeneous computing, so refer to OpenCL or CUDA if you want to do heterogeneous computing in a more mature way.
Appendix A, Assessments, contains answers to all the assessment questions.
In order to achieve the overall outcome of this book, the following are the prerequisites:
You need to have some basic knowledge of the C++ and C programming languages.
You need to have Qt v5.0 or above installed.
You need to have a webcam attached to your computer.
Many libraries, such as OpenCV and Tesseract, are also required. The instructions to install them are included in the chapter in which each library is first used.
A knowledge of deep learning and heterogeneous computing will be an advantage in helping to understand some chapters.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Qt-5-and-OpenCV-4-Computer-Vision-Projects. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789532586_ColorImages.pdf.
Visit the following link to check out videos of the code being run: http://bit.ly/2FfYSDS.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, class or type names. Here is an example: "The Qt project file, ImageViewer.pro, should be renamed ImageEditor.pro. You can do this in your file manager or in a Terminal."
A block of code is set as follows:
QMenu *editMenu;
QToolBar *editToolBar;
QAction *blurAction;
When we wish to draw your attention to a particular part of a code block, a comment will be added to the relevant lines:
// for editing
void blurImage();
Any command-line input or output is written as follows:
$ mkdir Chapter-02
$ cp -r Chapter-01/ImageViewer/ Chapter-02/ImageEditor
$ ls Chapter-02
ImageEditor
$ cd Chapter-02/ImageEditor
$ make clean
$ rm -f ImageViewer
The $ symbol is the shell prompt, and the text after it is a command. The lines that don't start with a $ are the output of the preceding command.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
Computer vision is the technology that enables computers to achieve a high-level understanding of digital images and videos, rather than only treating them as bytes or pixels. It is widely used for scene reconstruction, event detection, video tracking, object recognition, 3D pose estimation, motion estimation, and image restoration.
OpenCV (open source computer vision) is a library that implements almost all computer vision methods and algorithms. Qt is a cross-platform application framework and widget toolkit for creating applications with graphical user interfaces that can run on all major desktop platforms, most embedded platforms, and even mobile platforms.
These two powerful libraries are used together by many developers to create professional software with a solid GUI in industries that benefit from computer vision technology. In this book, we will demonstrate how to build these types of functional application with Qt 5 and OpenCV 4, which has friendly graphical user interfaces and several functions associated with computer vision technology.
In this first chapter, we will start by building a simple GUI application for image viewing with Qt 5.
The following topics will be covered in this chapter:
Designing the user interface
Reading and displaying images with Qt
Zooming in and out of images
Saving a copy of images in any supported format
Responding to hotkeys in a Qt application
Ensure that you at least have Qt version 5 installed and have some basic knowledge of C++ and Qt programming. A compatible C++ compiler is also required, that is, GCC 5 or later on Linux, Clang 7.0 or later on macOS, and MSVC 2015 or later on Microsoft Windows.
Since some pertinent basic knowledge is required as a prerequisite, the Qt installation and compiler environment setup are not included in this book. There are many books, online documents, or tutorials available (for example, GUI Programming with C++ and Qt5, by Lee Zhi Eng, as well as the official Qt library documentation) to help teach these basic configuration processes step by step; users can refer to these by themselves if necessary.
With all of these prerequisites in place, let's start the development of our first application—the simple image viewer.
All the code for this chapter can be found in our code repository at https://github.com/PacktPublishing/Qt-5-and-OpenCV-4-Computer-Vision-Projects/tree/master/Chapter-01.
Check out the following video to see the code in action: http://bit.ly/2KoYWFx
The first part of building an application is to define what the application will do. In this chapter, we will develop an image viewer app. The features it should have are as follows:
Open an image from our hard disk
Zoom in/out
View the previous or next image within the same folder
Save a copy of the current image as another file (with a different path or filename) in another format
There are many image viewer applications that we can follow, such as gThumb on Linux and the Preview app on macOS. However, our application will be simpler than those, and we have done some preplanning for it. This involved using Pencil to draw the wireframe of the application prototype.
The following is a wireframe showing our application prototype:
As you can see in the preceding diagram, we have four areas in the main window: the MenuBar, the ToolBar, the Main Area, and the Status Bar.
The menu bar has two menus on it: the File and View menus. Each menu will have its own set of actions. The File menu consists of the following three actions:
Open: This option opens an image from the hard disk.
Save as: This option saves a copy of the current image as another file (with a different path or filename) in any supported format.
Exit: This option exits the application.
The View menu consists of four actions as follows:
Zoom in: This option zooms in to the image.
Zoom out: This option zooms out of the image.
Prev: This option opens the previous image in the current folder.
Next: This option opens the next image in the current folder.
The toolbar consists of several buttons that can also be found in the menu options. We place them on the toolbar to give the users shortcuts to trigger these actions. So, it is necessary to include all frequently used actions, including the following:
Open
Zoom in
Zoom out
Previous image
Next image
The main area is used to show the image that is opened by the application.
The status bar is used to show some information pertaining to the image that we are viewing, such as its path, dimensions, and its size in bytes.
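As a rough illustration of the status bar text described above, the following sketch joins the three pieces of information into one line. The function name statusText and the exact layout of the string are assumptions made for this example, not the application's actual code, and plain standard C++ strings are used here instead of Qt's string classes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a one-line summary of an image from its
// path, its dimensions in pixels, and its file size in bytes.
// The "path, WxH, N Bytes" layout is an assumption for illustration.
std::string statusText(const std::string &path, int width, int height,
                       std::size_t bytes)
{
    return path + ", " +
           std::to_string(width) + "x" + std::to_string(height) + ", " +
           std::to_string(bytes) + " Bytes";
}
```

In a Qt application, a string like this would typically be converted to a QString and shown in a label placed on the status bar.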
You can find the source file for this design in our code repository on GitHub: https://github.com/PacktPublishing/Qt-5-and-OpenCV-4-Computer-Vision-Projects. The file, named WireFrames.epgz, resides in the root directory of the repository. Don't forget that it should be opened using the Pencil application.
In the previous section, we added several actions to the menu and toolbar. However, if we click on these actions, nothing happens. That's because we have not written any handlers for them yet. Qt uses a signal and slot connection mechanism to establish the relationship between events and their handlers. When users perform an operation on a widget, a signal of that widget will be emitted. Then, Qt will check whether any slot is connected to that signal, and will call each slot it finds. In this section, we will create slots for the actions we have created in the preceding sections and connect the signals of the actions to their respective slots. Also, we will set up some hotkeys for frequently used actions.
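To make the idea of signals and slots more concrete, here is a minimal sketch of the pattern in plain standard C++. It is not how Qt implements the mechanism (Qt relies on its meta-object system); the Signal and Recorder types and their member names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a signal: it stores the handlers
// (slots) connected to it and calls each of them when it is emitted.
class Signal
{
public:
    // Connect a slot (any callable taking no arguments) to this signal.
    void connect(std::function<void()> slot)
    {
        handlers.push_back(std::move(slot));
    }

    // Emit the signal: every connected slot is called in connection order.
    // If nothing is connected, nothing happens.
    void emitSignal() const
    {
        for (const auto &handler : handlers) {
            handler();
        }
    }

private:
    std::vector<std::function<void()>> handlers;
};

// Hypothetical receiver that records which of its slots were called.
struct Recorder
{
    std::vector<std::string> calls;
    void onOpen() { calls.push_back("open"); }
    void onExit() { calls.push_back("exit"); }
};
```

Emitting a signal with no connected slot does nothing, which mirrors what we saw with the actions that have no handlers yet. Once a slot is connected, for example with triggered.connect([&r] { r.onExit(); }) for a Signal named triggered and a Recorder named r, each emission calls it.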
Take the Exit action as an example. If users click it in the File menu, a signal named triggered will be emitted. So, let's connect this signal to a slot of our application instance in the MainWindow class's member function, createActions:
connect(exitAction, SIGNAL(triggered(bool)), QApplication::instance(), SLOT(quit()));
The connect method takes four parameters: the signal sender, the signal, the receiver, and the slot. Once the connection is made, the slot on the receiver will be called as soon as the signal of the sender is emitted. Here, we connect the triggered signal of the Exit action with the quit slot of the application instance to enable the application to exit when we click on the Exit action.
Now, compile and run the application, and then click the Exit item in the File menu. The application will exit as we expect if everything goes well.
OK. We have successfully displayed the image. Now, let's scale it. Here, we take zooming in as an example. With the experience from the preceding actions, we should have a clear idea as to how to do that. First, we declare a private slot, which is named zoomIn, and give its implementation as shown in the following code:
void MainWindow::zoomIn()
{
    imageView->scale(1.2, 1.2);
}
Easy, right? Just call the scale method of imageView with a scale rate for the width and a scale rate for the height. Then, we connect the triggered signal of zoomInAction to this slot in the createActions method of the MainWindow class:
connect(zoomInAction, SIGNAL(triggered(bool)), this, SLOT(zoomIn()));
Compile and run the application, open an image with it, and click on the Zoom in button on the toolbar. You will find that the image enlarges to 120% of its current size on each click.
Zooming out just entails scaling the imageView with a rate of less than 1.0. Please try to implement it by yourself. If you find it difficult, you can refer to our code repository on GitHub (https://github.com/PacktPublishing/Qt-5-and-OpenCV-4-Computer-Vision-Projects/tree/master/Chapter-01).
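Because each call scales the view relative to its current size, the zoom steps multiply together. The following sketch works through that arithmetic in plain C++, with no Qt involved. The function names are hypothetical, and the assumption is simply that every step multiplies the current scale by a fixed rate.

```cpp
#include <cassert>
#include <cmath>

// Cumulative scale factor after a number of zoom steps, assuming each
// step multiplies the current scale by the same fixed rate.
double cumulativeScale(double rate, int steps)
{
    return std::pow(rate, steps);
}

// The zoom-out rate that exactly undoes one zoom-in step of the given rate.
double inverseRate(double zoomInRate)
{
    return 1.0 / zoomInRate;
}
```

With a rate of 1.2, three clicks on Zoom in give a factor of 1.2 × 1.2 × 1.2 = 1.728. Any rate below 1.0 zooms out, but the choice matters if one zoom out should undo one zoom in: a rate of 0.8, for instance, leaves the image at 1.2 × 0.8 = 0.96 of its original size, whereas 1/1.2 (about 0.833) restores it exactly.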
With our application, we can now open an image and scale it for viewing. Next, we will implement the function of the saveAsAction action.
