C++ can make your machine learning models run faster and more efficiently. This handy guide will help you learn the fundamentals of machine learning (ML), showing you how to use C++ libraries to get the most out of your data. The book's example-based approach makes machine learning with C++ easy for beginners, demonstrating how to implement supervised and unsupervised ML algorithms through real-world examples.
This book will get you hands-on with tuning and optimizing a model for different use cases, assisting you with model selection and the measurement of performance. You’ll cover techniques such as product recommendations, ensemble learning, and anomaly detection using modern C++ libraries such as PyTorch C++ API, Caffe2, Shogun, Shark-ML, mlpack, and dlib. Next, you’ll explore neural networks and deep learning using examples such as image classification and sentiment analysis, which will help you solve various problems. Later, you’ll learn how to handle production and deployment challenges on mobile and cloud platforms, before discovering how to export and import models using the ONNX format.
By the end of this C++ book, you will have real-world machine learning and C++ knowledge, as well as the skills to use C++ to build powerful ML systems.
Page count: 609
Year of publication: 2020
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Yogesh Deokar
Content Development Editor: Sean Lobo
Senior Editor: Roshan Kumar
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Language Support Editors: Jack Cummings and Martin Whittemore
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Production Designer: Aparna Bhagat
First published: May 2020
Production reference: 1140520
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78995-533-0
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Kirill Kolodiazhnyi is a seasoned software engineer with expertise in custom software development. He has several years of experience in building machine learning models and data products using C++. He holds a bachelor's degree in computer science from Kharkiv National University of Radio Electronics. He currently works in Kharkiv, Ukraine, where he lives with his wife and daughter.
Davor Lozić is a university lecturer living in Croatia. He likes working on algorithmic/mathematical problems, and reviewing books for Packt makes him read new IT books. He has also worked on Data Analysis with R – Second Edition; Mastering Predictive Analytics with R, Second Edition; R Data Analysis Cookbook, Second Edition; R Deep Learning Projects; Mastering Linux Network Administration; R Machine Learning Projects; Learning Ext JS, Fourth Edition; and R Statistics Cookbook. Davor is a meme master and an Uno master, and he likes cats.
Dr. Ashwin Nanjappa works at NVIDIA on deep learning inference acceleration on GPUs. He has a Ph.D. from the National University of Singapore, where he invented the fastest 3D Delaunay computational geometry algorithms for GPUs. He was a postdoctoral research fellow at the BioInformatics Institute (Singapore), inventing machine learning algorithms for hand and rodent pose estimation using depth cameras. He also worked at Visenze (Singapore) developing computer vision deep learning models for the largest e-commerce portals in the world. He is a published author of two books: Caffe2 Quick Start Guide and Instant GLEW.
Ryan Riley has been involved in the futures and derivatives industry for almost 20 years. He received a bachelor's degree and a master's degree from DePaul University in applied statistics. Doing his coursework in math meant that he had to teach himself how to program, forcing him to read more technical books on programming than he would otherwise have done. Ryan has worked with numerous AI libraries in various languages and is currently using the Caffe2 C++ library to develop and implement futures and derivatives trading strategies at PNT Financial.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Machine Learning with C++
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Overview of Machine Learning
Introduction to Machine Learning with C++
Understanding the fundamentals of ML
Venturing into the techniques of ML
Supervised learning
Unsupervised learning
Dealing with ML models
Model parameter estimation
An overview of linear algebra
Learning the concepts of linear algebra
Basic linear algebra operations
Tensor representation in computing
Linear algebra API samples
Using Eigen
Using xtensor
Using Shark-ML
Using Dlib
An overview of linear regression
Solving linear regression tasks with different libraries
Solving linear regression tasks with Eigen
Solving linear regression tasks with Shogun
Solving linear regression tasks with Shark-ML
Linear regression with Dlib
Summary
Further reading
Data Processing
Technical requirements
Parsing data formats to C++ data structures
Reading CSV files with the Fast-CPP-CSV-Parser library
Preprocessing CSV files
Reading CSV files with the Shark-ML library
Reading CSV files with the Shogun library
Reading CSV files with the Dlib library
Reading JSON files with the RapidJSON library
Writing and reading HDF5 files with the HighFive library
Initializing matrix and tensor objects from C++ data structures
Eigen
Shark-ML
Dlib
Shogun
Manipulating images with the OpenCV and Dlib libraries
Using OpenCV
Using Dlib
Transforming images into matrix or tensor objects of various libraries
Deinterleaving in OpenCV
Deinterleaving in Dlib
Normalizing data
Normalizing with Eigen
Normalizing with Shogun
Normalizing with Dlib
Normalizing with Shark-ML
Summary
Further reading
Measuring Performance and Selecting Models
Technical requirements
Performance metrics for ML models
Regression metrics
Mean squared error and root mean squared error
Mean absolute error
R squared
Adjusted R squared
Classification metrics
Accuracy
Precision and recall
F-score
AUC–ROC
Log-Loss
Understanding the bias and variance characteristics
Bias
Variance
Normal training
Regularization
L1 regularization – Lasso
L2 regularization – Ridge
Data augmentation
Early stopping
Regularization for neural networks
Model selection with the grid search technique
Cross-validation
K-fold cross-validation
Grid search
Shogun example
Shark-ML example
Dlib example
Summary
Further reading
Section 2: Machine Learning Algorithms
Clustering
Technical requirements
Measuring distance in clustering
Euclidean distance
Squared Euclidean distance
Manhattan distance
Chebyshev distance
Types of clustering algorithms
Partition-based clustering algorithms
Distance-based clustering algorithms
Graph theory-based clustering algorithms
Spectral clustering algorithms
Hierarchical clustering algorithms
Density-based clustering algorithms
Model-based clustering algorithms
Examples of using the Shogun library for dealing with the clustering task samples
GMM with Shogun
K-means clustering with Shogun
Hierarchical clustering with Shogun
Examples of using the Shark-ML library for dealing with the clustering task samples
Hierarchical clustering with Shark-ML
K-means clustering with Shark-ML
Examples of using the Dlib library for dealing with the clustering task samples
K-means clustering with Dlib
Spectral clustering with Dlib
Hierarchical clustering with Dlib
Newman modularity-based graph clustering algorithm with Dlib
Chinese Whispers – graph clustering algorithm with Dlib
Plotting data with C++
Summary
Further reading
Anomaly Detection
Technical requirements
Exploring the applications of anomaly detection
Learning approaches for anomaly detection
Detecting anomalies with statistical tests
Detecting anomalies with the Local Outlier Factor method
Detecting anomalies with isolation forest
Detecting anomalies with One-Class SVM (OCSVM)
Density estimation approach (multivariate Gaussian distribution) for anomaly detection
Examples of using different C++ libraries for anomaly detection
C++ implementation of the isolation forest algorithm for anomaly detection
Using the Dlib library for anomaly detection
One-Class SVM with Dlib
Multivariate Gaussian model with Dlib
OCSVM with Shogun
OCSVM with Shark-ML
Summary
Further reading
Dimensionality Reduction
Technical requirements
An overview of dimension reduction methods
Feature selection methods
Dimensionality reduction methods
Exploring linear methods for dimension reduction
Principal component analysis
Singular value decomposition
Independent component analysis
Linear discriminant analysis
Factor analysis
Multidimensional scaling
Exploring non-linear methods for dimension reduction
Kernel PCA
IsoMap
Sammon mapping
Distributed stochastic neighbor embedding
Autoencoders
Understanding dimension reduction algorithms with various C++ libraries
Using the Dlib library
PCA
Data compression with PCA
LDA
Sammon mapping
Using the Shogun library
PCA
Kernel PCA
MDS
IsoMap
ICA
Factor analysis
t-SNE
Using the Shark-ML library
PCA
LDA
Summary
Further reading
Classification
Technical requirements
An overview of classification methods
Exploring various classification methods
Logistic regression
KRR
SVM
kNN method
Multi-class classification
Examples of using C++ libraries for dealing with the classification task
Using the Shogun library
With logistic regression
With SVMs
With the kNN algorithm
Using the Dlib library
With KRR
With SVM
Using the Shark-ML library
With logistic regression
With SVM
With the kNN algorithm
Summary
Further reading
Recommender Systems
Technical requirements
An overview of recommender system algorithms
Non-personalized recommendations
Content-based recommendations
User-based collaborative filtering
Item-based collaborative filtering
Factorization algorithms
Similarity or preferences correlation
Pearson's correlation coefficient
Spearman's correlation
Cosine distance
Data scaling and standardization
Cold start problem
Relevance of recommendations
Assessing system quality
Understanding collaborative filtering method details
Examples of item-based collaborative filtering with C++
Using the Eigen library
Using the mlpack library
Summary
Further reading
Ensemble Learning
Technical requirements
An overview of ensemble learning
Using a bagging approach for creating ensembles
Using a gradient boosting method for creating ensembles
Using a stacking approach for creating ensembles
Using the random forest method for creating ensembles
Decision tree algorithm overview
Random forest method overview
Examples of using C++ libraries for creating ensembles
Ensembles with Shogun
Using gradient boosting with Shogun
Using random forest with Shogun
Ensembles with Shark-ML
Using random forest with Shark-ML
Using a stacking ensemble with Shark-ML
Summary
Further reading
Section 3: Advanced Examples
Neural Networks for Image Classification
Technical requirements
An overview of neural networks
Neurons
The perceptron and neural networks
Training with the backpropagation method
Backpropagation method modes
Stochastic mode
Batch mode
Mini-batch mode
Backpropagation method problems
The backpropagation method – an example
Loss functions
Activation functions
The stepwise activation function
The linear activation function
The sigmoid activation function
The hyperbolic tangent
Activation function properties
Regularization in neural networks
Different methods for regularization
Neural network initialization
Xavier initialization method
He initialization method
Delving into convolutional networks
Convolution operator
Pooling operation
Receptive field
Convolution network architecture
What is deep learning?
Examples of using C++ libraries to create neural networks
Simple network example for the regression task
Dlib
Shogun
Shark-ML
Architecture definition
Loss function definition
Network initialization
Optimizer configuration
Network training
The complete programming sample
Understanding image classification using the LeNet architecture
Reading the training dataset
Reading dataset files
Reading the image file
Neural network definition
Network training
Summary
Further reading
Sentiment Analysis with Recurrent Neural Networks
Technical requirements
An overview of the RNN concept
Training RNNs using the concept of backpropagation through time
Exploring RNN architectures
LSTM
GRUs
Bidirectional RNN
Multilayer RNN
Understanding natural language processing with RNNs
Word2Vec
GloVe
Sentiment analysis example with an RNN
Summary
Further reading
Section 4: Production and Deployment Challenges
Exporting and Importing Models
Technical requirements
ML model serialization APIs in C++ libraries
Model serialization with Dlib
Model serialization with Shogun
Model serialization with Shark-ML
Model serialization with PyTorch
Neural network initialization
Using the torch::save and torch::load functions
Using PyTorch archive objects
Delving into ONNX format
Loading images into Caffe2 tensors
Reading the class definition file
Summary
Further reading
Deploying Models on Mobile and Cloud Platforms
Technical requirements
Image classification on Android mobile
The mobile version of the PyTorch framework
Using TorchScript for a model snapshot
The Android Studio project
The UI and Java part of the project
The C++ native part of the project
Machine learning in the cloud – using Google Compute Engine
The server
The client
Service deployment
Summary
Further reading
Other Books You May Enjoy
Leave a review - let other readers know what you think
Machine learning (ML) is a popular approach to solve different kinds of problems. ML allows you to deal with various tasks without knowing a direct algorithm to solve them. The key feature of ML algorithms is their ability to learn solutions by using a set of training samples, or even without them. Nowadays, ML is a widespread approach used in various areas of industry. Examples of areas where ML outperforms classical direct algorithms include computer vision, natural language processing, and recommender systems.
This book is a handy guide to help you learn the fundamentals of ML, showing you how to use C++ libraries to get the most out of data. C++ can make your ML models run faster and more efficiently compared to other approaches that use interpreted languages, such as Python. Also, C++ allows you to significantly reduce the negative performance impact of data conversion between different languages used in the ML model because you have direct access to core algorithms and raw data.
You will find this book useful if you want to get started with ML algorithms and techniques using the widespread C++ language. This book also appeals to data analysts, data scientists, and ML developers who are looking to implement different ML models in production using native development toolsets such as the GCC or Clang ecosystems. Working knowledge of the C++ programming language is mandatory to get started with this book.
Hands-On Machine Learning with C++'s example-based approach will show you how to implement supervised and unsupervised ML algorithms with the help of real-world examples. The book also gives you hands-on experience of tuning and optimizing a model for different use cases, helping you with performance measurement and model selection. You'll then cover techniques such as object classification and clustering, product recommendations, ensemble learning, and anomaly detection using modern C++ libraries such as the PyTorch C++ API, Caffe2, Shogun, Shark-ML, mlpack, and dlib. Moving ahead, the chapters will take you through neural networks and deep learning using examples such as image classification and sentiment analysis, which will help you solve a wide range of problems.
Later, you'll learn how to handle production and deployment challenges on mobile and cloud platforms, before discovering how to export and import models using the ONNX format. By the end of this book, you'll have learned how to leverage C++ to build powerful ML systems.
Chapter 1, Introduction to Machine Learning with C++, will guide you through the necessary fundamentals of ML, including linear algebra concepts, ML algorithm types, and their building blocks.
Chapter 2, Data Processing, will show you how to load data from different file formats for ML model training and how to initialize dataset objects in various C++ libraries.
Chapter 3, Measuring Performance and Selecting Models, will show you how to measure the performance of various types of ML models, how to select the best set of hyperparameters to achieve better model performance, and how to use the grid search method in various C++ libraries for model selection.
Chapter 4, Clustering, will discuss algorithms for grouping objects by their essential characteristics, show why we usually use unsupervised algorithms for solving such types of tasks, and lastly, will outline the various types of clustering algorithms, along with their implementations and usage in different C++ libraries.
Chapter 5, Anomaly Detection, will discuss the basics of anomaly and novelty detection tasks and guide you through the different types of anomaly detection algorithms, their implementation, and their usage in various C++ libraries.
Chapter 6, Dimensionality Reduction, will discuss various algorithms for dimensionality reduction that preserve the essential characteristics of data, along with their implementation and usage in various C++ libraries.
Chapter 7, Classification, will show you what a classification task is and how it differs from a clustering task. You will be guided through various classification algorithms, their implementation, and their usage in various C++ libraries.
Chapter 8, Recommender Systems, will give you familiarity with recommender system concepts. You will be shown the different approaches to deal with recommendation tasks, and you will see how to solve such types of tasks using the C++ language.
Chapter 9, Ensemble Learning, will discuss various methods of combining several ML models to get better accuracy and to deal with learning problems. You will encounter ensemble implementations with the usage of different C++ libraries.
Chapter 10, Neural Networks for Image Classification, will give you familiarity with the fundamentals of artificial neural networks. You will encounter the essential building blocks, the required math concepts, and learning algorithms. You will be guided through different C++ libraries that provide functionality for neural network implementations. Also, this chapter will show you the implementation of a deep convolutional network for image classification with the PyTorch library.
Chapter 11, Sentiment Analysis with Recurrent Neural Networks, will guide you through the fundamentals of recurrent neural networks. You will learn about the different types of network cells, the required math concepts, and the differences of this learning algorithm compared to feedforward networks. Also, in this chapter, we will develop a recurrent neural network for sentiment analysis with the PyTorch library.
Chapter 12, Exporting and Importing Models, will show you how to save and load model parameters and architectures using various C++ libraries. Also, you will see how to use the ONNX format to load and use a pre-trained model with the C++ API of the Caffe2 library.
Chapter 13, Deploying Models on Mobile and Cloud Platforms, will guide you through the development of applications for image classification using neural networks for the Android and Google Compute Engine platforms.
To be able to compile and run the examples included in this book, you will need to configure a particular development environment. All code examples have been tested with the Arch and Ubuntu 18.04 Linux distributions. The following list outlines the packages you'll need to install on the Ubuntu platform:
build-essential
unzip
git
cmake
cmake-curses-gui
python
python-pip
libblas-dev
libopenblas-dev
libatlas-base-dev
liblapack-dev
libboost-all-dev
libopencv-core3.2
libopencv-imgproc3.2
libopencv-dev
libopencv-highgui3.2
libopencv-highgui-dev
protobuf-compiler
libprotobuf-dev
libhdf5-dev
libjson-c-dev
libx11-dev
openjdk-8-jdk
wget
ninja-build
Also, you need to install the following additional packages for Python:
pyyaml
typing
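The two lists above can be installed in one pass on Ubuntu 18.04 with commands along the following lines. This is a sketch, not an exact script from the book: the package names are taken verbatim from the lists, and exact names may differ on other Ubuntu releases.

```shell
# Install the Ubuntu packages listed above (names as given; Ubuntu 18.04).
sudo apt-get update
sudo apt-get install -y \
    build-essential unzip git cmake cmake-curses-gui python python-pip \
    libblas-dev libopenblas-dev libatlas-base-dev liblapack-dev \
    libboost-all-dev libopencv-core3.2 libopencv-imgproc3.2 libopencv-dev \
    libopencv-highgui3.2 libopencv-highgui-dev protobuf-compiler \
    libprotobuf-dev libhdf5-dev libjson-c-dev libx11-dev openjdk-8-jdk \
    wget ninja-build

# Install the additional Python packages listed above.
python -m pip install --user pyyaml typing
```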
Besides the development environment, you'll have to check out the source code of the requisite third-party libraries and build them. Most of these libraries are actively developed and don't have strict releases, so it's easier to check out a particular commit from the development tree and build it than to download the latest official release. The following table shows the libraries you have to check out, their repository URLs, and the hash of the commit to check out:
Library repository | Branch name | Commit
https://github.com/shogun-toolbox/shogun | master | f7255cf2cc6b5116e50840816d70d21e7cc039bb
https://github.com/Shark-ML/Shark | master | 221c1f2e8abfffadbf3c5ef7cf324bc6dc9b4315
https://gitlab.com/conradsnicta/armadillo-code | 9.500.x | 442d52ba052115b32035a6e7dc6587bb6a462dec
https://github.com/davisking/dlib | v19.15 | 929c630b381d444bbf5d7aa622e3decc7785ddb2
https://github.com/eigenteam/eigen-git-mirror | 3.3.7 | cf794d3b741a6278df169e58461f8529f43bce5d
https://github.com/mlpack/mlpack | master | e2f696cfd5b7ccda2d3af1c7c728483ea6591718
https://github.com/Kolkir/plotcpp | master | c86bd4f5d9029986f0d5f368450d79f0dd32c7e4
https://github.com/pytorch/pytorch | v1.2.0 | 8554416a199c4cec01c60c7015d8301d2bb39b64
https://github.com/xtensor-stack/xtensor | master | 02d8039a58828db1ffdd2c60fb9b378131c295a2
https://github.com/xtensor-stack/xtensor-blas | master | 89d9df93ff7306c32997e8bb8b1ff02534d7df2e
https://github.com/xtensor-stack/xtl | master | 03a6827c9e402736506f3ded754e890b3ea28a98
https://github.com/opencv/opencv_contrib/releases/tag/3.3.0 | 3.3.0 | (release tag; no commit pinned)
https://github.com/ben-strasser/fast-cpp-csv-parser | master | 3b439a664090681931c6ace78dcedac6d3a3907e
https://github.com/Tencent/rapidjson | master | 73063f5002612c6bf64fe24f851cd5cc0d83eef9
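The checkout procedure is the same for every row of the table: clone the repository and check out the listed commit hash. The following self-contained sketch demonstrates the pinning operation on a throwaway local repository (the /tmp/pin-demo path and file contents are illustrative only); in practice, you substitute the repository URL, branch, and commit hash from the table.

```shell
# Demonstration of pinning a working tree to an exact commit hash --
# the operation you perform for each library in the table above.
set -e
rm -rf /tmp/pin-demo && mkdir -p /tmp/pin-demo && cd /tmp/pin-demo
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo v1 > file.txt && git add file.txt && git commit -qm "first"
first=$(git rev-parse HEAD)            # the hash you would pin to
echo v2 > file.txt && git commit -qam "second"
git checkout -q "$first"               # detached HEAD at the pinned commit
cat file.txt                           # prints: v1
```

With a real library, the equivalent is `git clone <repository URL>`, followed by `git checkout <commit hash>` inside the clone.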
Also, for the last chapter, you'll have to install the Android Studio IDE. You can download it from the official site at https://developer.android.com/studio. Besides the IDE, you'll also need to install and configure the Android SDK. The respective example in this book was developed and tested with this SDK, which can be downloaded from https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip. To configure this SDK, you have to unzip it and install particular packages. The following script shows how to do it:
mkdir /android
cd /android
wget https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip
unzip sdk-tools-linux-4333796.zip
yes | ./tools/bin/sdkmanager --licenses
yes | ./tools/bin/sdkmanager "platform-tools"
yes | ./tools/bin/sdkmanager "platforms;android-25"
yes | ./tools/bin/sdkmanager "build-tools;25.0.2"
yes | ./tools/bin/sdkmanager "system-images;android-25;google_apis;armeabi-v7a"
yes | ./tools/bin/sdkmanager --install "ndk;20.0.5594570"
export ANDROID_NDK=/android/ndk/20.0.5594570
export ANDROID_ABI='armeabi-v7a'
Another way to configure the development environment is through the use of Docker. Docker allows you to configure a lightweight, isolated container with particular components. You can install Docker from the official Ubuntu package repository. Then, use the scripts provided with this book to automatically configure the environment. You will find the docker folder in the examples package. The following steps show how to use the Docker configuration scripts:
Run the following commands to create the image, run it, and configure the environment:
cd docker
docker build -t buildenv:1.0 .
docker run -it buildenv:1.0 bash
cd /development
./install_env.sh
./install_android.sh
exit
Use the following command to save our Docker container with the configured libraries and packages into a new Docker image:
docker commit [container id]
Use the following command to rename the updated Docker image:
docker tag [image id] [new name]
Use the following command to start a new Docker container and share the book examples sources to it:
docker run -it -v [host_examples_path]:[container_examples_path] [tag name] bash
After running the preceding command, you will be in the command-line environment with the necessary configured packages, compiled third-party libraries, and with access to the programming examples package. You can use this environment to compile and run the code examples in this book. Each programming example is configured to use the CMake build system so you will be able to build them all in the same way. The following script shows a possible scenario of building a code example:
cd [example folder name]
mkdir build
cd build
cmake ..
cmake --build . --target all
Also, you can configure your local machine environment to share X Server with a Docker container to be able to run graphical UI applications from this container. It will allow you to use, for example, the Android Studio IDE or a C++ IDE (such as Qt Creator) from the Docker container, without local installation. The following script shows how to do this:
xhost +local:root
docker run --net=host -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -it -v [host_examples_path]:[container_examples_path] [tag name] bash
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the following section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
To be more comfortable with understanding and building the code examples, we recommend you carefully read the documentation for each third-party library, and take some time to learn the basics of the Docker system and of development for the Android platform. Also, we assume that you have sufficient working knowledge of the C++ language and compilers, and that you are familiar with the CMake build system.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Machine-Learning-with-CPP. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789955330_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "We downloaded a pre-trained model with the torch.hub.load() function."
A block of code is set as follows:
class Network {
 public:
  Network(const std::string& snapshot_path,
          const std::string& synset_path,
          torch::DeviceType device_type);

  std::string Classify(const at::Tensor& image);

 private:
  torch::DeviceType device_type_;
  Classes classes_;
  torch::jit::script::Module model_;
};
Any command-line input or output is written as follows:
cd ~/[DEST_PATH]/server
mkdir build
cd build
cmake .. -DCMAKE_PREFIX_PATH=~/dev/server/third-party/libtorch
cmake --build . --target all
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Start it by clicking the Start button at the top of the page."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this section, we will delve into the basics of machine learning with the help of examples in C++ and various machine learning frameworks. We'll demonstrate how to load data from various file formats and describe model performance measuring techniques and the best model selection approaches.
This section comprises the following chapters:
Chapter 1, Introduction to Machine Learning with C++

Chapter 2, Data Processing

Chapter 3, Measuring Performance and Selecting Models
There are different approaches to making computers solve tasks. One of them is to define an explicit algorithm, and another is to use implicit strategies based on mathematical and statistical methods. Machine Learning (ML) is one of the implicit methods that uses mathematical and statistical approaches to solve tasks. It is an actively growing discipline, and many scientists and researchers consider it one of the best paths toward systems acting as human-level artificial intelligence (AI).
In general, ML approaches are based on the idea of searching for patterns in a given dataset. Consider a recommendation system for a news feed, which provides the user with a personalized feed based on their previous activity or preferences. The software gathers information about the types of news articles the user reads and calculates some statistics; for example, the frequency with which certain topics appear in a set of news articles. Then, it performs some predictive analytics, identifies general patterns, and uses them to populate the user's news feed. Such systems periodically track a user's activity, update the dataset, and calculate new trends for recommendations.
There are many areas where ML has started to play an important role. It is used for solving enterprise business tasks as well as for scientific research. In customer relationship management (CRM) systems, ML models are used to analyze sales team activity, helping them process the most important requests first. ML models are used in business intelligence (BI) and analytics to find essential data points. Human resources (HR) departments use ML models to analyze their employees' characteristics in order to identify the most effective ones and use this information when searching for applicants for open positions.
A fast-growing direction of research is self-driving cars, and deep learning neural networks are used extensively in this area. They are used in computer vision systems for object identification, as well as in the navigation and steering systems that are necessary for driving a car.
Another popular use of ML systems is electronic personal assistants, such as Siri from Apple or Alexa from Amazon. Such products also use deep learning models to analyze natural speech or written text to process users' requests and make a natural response in a relevant context. Such requests can activate music players with preferred songs, as well as update a user's personal schedule or book flight tickets.
This chapter describes what ML is, which tasks can be solved with ML, and the different approaches used in ML. It aims to show the minimum math required to start implementing ML algorithms. It also covers how to perform basic linear algebra operations in libraries such as Eigen, xtensor, Shark-ML, Shogun, and Dlib, and explains the linear regression task as an example.
The following topics will be covered in this chapter:
Understanding the fundamentals of ML
An overview of linear algebra
An overview of a linear regression example
There are different approaches to creating and training ML models. In this section, we show what these approaches are and how they differ. Apart from the approach we use to create an ML model, there are also parameters that control how the model behaves during training and evaluation. Model parameters can be divided into two distinct groups, which should be configured in different ways. The last crucial part of the ML process is the technique we use to train a model. Usually, the training technique uses a numerical optimization algorithm that finds the minimum value of a target function. In ML, the target function is usually called a loss function, and it is used to penalize the training algorithm when it makes errors. We discuss these concepts in more detail in the following sections.
We can divide ML approaches into two techniques, as follows:
Supervised learning is an approach based on the use of labeled data. Labeled data is a set of known data samples with corresponding known target outputs. Such a kind of data is used to build a model that can predict future outputs.
Unsupervised learning is an approach that does not require labeled data and can search for hidden patterns and structures in an arbitrary kind of data.
Let's have a look at each of the techniques in detail.
Supervised ML algorithms usually take a limited set of labeled data and build models that can make reasonable predictions for new data. We can split supervised learning algorithms into two main parts, classification and regression techniques, described as follows:
Classification models predict some finite and distinct types of categories. This could be a label that identifies whether an email is spam or not, or whether an image contains a human face or not. Classification models are applied in speech and text recognition, object identification in images, credit scoring, and others. Typical algorithms for creating classification models are Support Vector Machine (SVM), decision tree approaches, k-nearest neighbors (KNN), logistic regression, Naive Bayes, and neural networks. The following chapters describe the details of some of these algorithms.
Regression models predict continuous responses such as changes in temperature or values of currency exchange rates. Regression models are applied in algorithmic trading, forecasting of electricity load, revenue prediction, and others. Creating a regression model usually makes sense if the outputs of the given labeled data are real numbers. Typical algorithms for creating regression models are linear and multivariate regressions, polynomial regression models, and stepwise regressions. We can use decision tree techniques and neural networks to create regression models too. The following chapters describe the details of some of these algorithms.
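As a concrete illustration of the regression idea, the following is a minimal sketch that fits a line y = w * x + b to a few made-up data points with the closed-form least-squares solution, using only the C++ standard library rather than any of the ML libraries discussed in this book:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Fit y = w * x + b by ordinary least squares (the closed-form solution
// for simple linear regression with a single feature).
std::pair<double, double> FitLine(const std::vector<double>& x,
                                  const std::vector<double>& y) {
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    mean_x += x[i];
    mean_y += y[i];
  }
  mean_x /= x.size();
  mean_y /= y.size();
  // Slope w = covariance(x, y) / variance(x); intercept b = mean_y - w * mean_x.
  double num = 0.0, den = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    num += (x[i] - mean_x) * (y[i] - mean_y);
    den += (x[i] - mean_x) * (x[i] - mean_x);
  }
  double w = num / den;
  double b = mean_y - w * mean_x;
  return {w, b};
}
```

For the points (0, 1), (1, 3), (2, 5), (3, 7), (4, 9), generated from y = 2x + 1, FitLine recovers w = 2 and b = 1.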
Unsupervised learning algorithms do not use labeled datasets. They create models that use intrinsic relations in data to find hidden patterns that they can use for making predictions. The most well-known unsupervised learning technique is clustering. Clustering involves dividing a given set of data into a limited number of groups according to some intrinsic properties of the data items. Clustering is applied in market research, different types of exploratory analysis, deoxyribonucleic acid (DNA) analysis, image segmentation, and object detection. Typical algorithms for creating models for performing clustering are k-means, k-medoids, Gaussian mixture models, hierarchical clustering, and hidden Markov models. Some of these algorithms are explained in the following chapters of this book.
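To make the clustering idea concrete, here is a minimal sketch of the k-means algorithm on one-dimensional data, using only the standard library. The points and initial centroids below are made up, and a production implementation would also handle multidimensional data, empty clusters, and centroid initialization:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// A minimal one-dimensional k-means: repeat the assignment step (attach
// every point to its nearest centroid) and the update step (move every
// centroid to the mean of its points) until assignments stop changing.
std::vector<double> KMeans1D(const std::vector<double>& points,
                             std::vector<double> centroids) {
  std::vector<int> labels(points.size(), -1);
  bool changed = true;
  while (changed) {
    changed = false;
    // Assignment step: label each point with its nearest centroid.
    for (std::size_t i = 0; i < points.size(); ++i) {
      int best = 0;
      double best_dist = std::numeric_limits<double>::max();
      for (std::size_t c = 0; c < centroids.size(); ++c) {
        double d = std::abs(points[i] - centroids[c]);
        if (d < best_dist) {
          best_dist = d;
          best = static_cast<int>(c);
        }
      }
      if (labels[i] != best) {
        labels[i] = best;
        changed = true;
      }
    }
    // Update step: move each centroid to the mean of its assigned points.
    for (std::size_t c = 0; c < centroids.size(); ++c) {
      double sum = 0.0;
      int count = 0;
      for (std::size_t i = 0; i < points.size(); ++i) {
        if (labels[i] == static_cast<int>(c)) {
          sum += points[i];
          ++count;
        }
      }
      if (count > 0) {
        centroids[c] = sum / count;
      }
    }
  }
  return centroids;
}
```

For the points {1, 2, 3, 10, 11, 12} and initial centroids {0, 5}, the algorithm converges to the centroids 2 and 11, splitting the data into its two obvious groups without ever seeing a label.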
We can interpret ML models as functions that take different types of parameters. Such functions provide outputs for given inputs based on the values of these parameters. Developers can configure the behavior of ML models for solving problems by adjusting model parameters. Training an ML model can usually be treated as a process of searching for the best combination of its parameters. We can split an ML model's parameters into two types. The first type consists of parameters internal to the model, and we can estimate their values from the training (input) data. The second type consists of parameters external to the model, and we cannot estimate their values from training data. Parameters that are external to the model are usually called hyperparameters.
Internal parameters have the following characteristics:
They are necessary for making predictions.
They define the quality of the model on the given problem.
We can learn them from training data.
Usually, they are a part of the model.
If the model contains a fixed number of internal parameters, it is called parametric. Otherwise, we can classify it as non-parametric.
Examples of internal parameters are as follows:
Weights of artificial neural networks (ANNs)

Support vector values for SVM models

Polynomial coefficients for linear regression or logistic regression
On the other hand, hyperparameters have the following characteristics:
They are used to configure algorithms that estimate model parameters.
The practitioner usually specifies them.
Their estimation is often based on using heuristics.
They are specific to a concrete modeling problem.
It is hard to know the best values for a model's hyperparameters for a specific problem. Also, practitioners usually need to perform additional research on how to tune required hyperparameters so that a model or a training algorithm behaves in the best way. Practitioners use rules of thumb, copying values from similar projects, as well as special techniques such as grid search for hyperparameter estimation.
Examples of hyperparameters are as follows:
The C and sigma parameters used in the SVM algorithm to configure classification quality
The learning rate parameter that is used in the neural network training process to configure algorithm convergence
The k value that is used in the KNN algorithm to configure the number of neighbors
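The grid search technique mentioned earlier can be sketched with the k value of KNN as the hyperparameter being tuned. The toy one-dimensional classifier and data below are made up for illustration; a real project would use a library implementation and proper cross-validation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Classify one query point with k-nearest neighbors on one-dimensional
// data; labels are 0 or 1 and the prediction is a majority vote.
int KnnPredict(const std::vector<double>& train_x,
               const std::vector<int>& train_y, double query, int k) {
  std::vector<std::pair<double, int>> neighbors;  // (distance, label)
  for (std::size_t i = 0; i < train_x.size(); ++i) {
    neighbors.emplace_back(std::abs(train_x[i] - query), train_y[i]);
  }
  std::sort(neighbors.begin(), neighbors.end());
  int votes = 0;
  for (int i = 0; i < k; ++i) {
    votes += neighbors[i].second;
  }
  return 2 * votes > k ? 1 : 0;
}

// Grid search: try every candidate k and keep the one with the highest
// accuracy on a held-out validation set.
int GridSearchK(const std::vector<double>& train_x,
                const std::vector<int>& train_y,
                const std::vector<double>& val_x,
                const std::vector<int>& val_y,
                const std::vector<int>& candidates) {
  int best_k = candidates.front();
  int best_correct = -1;
  for (int k : candidates) {
    int correct = 0;
    for (std::size_t i = 0; i < val_x.size(); ++i) {
      if (KnnPredict(train_x, train_y, val_x[i], k) == val_y[i]) {
        ++correct;
      }
    }
    if (correct > best_correct) {
      best_correct = correct;
      best_k = k;
    }
  }
  return best_k;
}
```

Ties are resolved in favor of the first candidate tried, so for well-separated data several k values may score equally well; a more careful comparison would average accuracy over several validation folds.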
Model parameter estimation usually uses some optimization algorithm. The speed and quality of the resulting model can significantly depend on the optimization algorithm chosen. Research on optimization algorithms is a popular topic in industry, as well as in academia. ML often uses optimization techniques and algorithms based on the optimization of a loss function. A function that evaluates how well a model predicts on the data is called a loss function. If predictions are very different from the target outputs, the loss function will return a value that can be interpreted as a bad one, usually a large number. In such a way, the loss function penalizes an optimization algorithm when it moves in the wrong direction. So, the general idea is to minimize the value of the loss function to reduce penalties. There is no one universal loss function for optimization algorithms. Different factors determine how to choose a loss function. Examples of such factors are as follows:
Specifics of the given problem—for example, if it is a regression or a classification model
Ease of calculating derivatives
Percentage of outliers in the dataset
In ML, the term optimizer is used to define an algorithm that connects a loss function and a technique for updating model parameters in response to the values of the loss function. So, optimizers tune ML models to predict target values for new data in the most accurate way by fitting model parameters. There are many optimizers: Gradient Descent, Adagrad, RMSProp, Adam, and others. Moreover, developing new optimizers is an active area of research. For example, there is the ML and Optimization research group at Microsoft (located in Redmond) whose research areas include combinatorial optimization, convex and non-convex optimization, and their application in ML and AI. Other companies in the industry also have similar research groups; there are many publications from Facebook Research, Amazon Research, and OpenAI groups.
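As a minimal sketch of how an optimizer ties a loss function to parameter updates, the following fits a single weight w in the model y = w * x with plain gradient descent on the mean squared error loss. The data, learning rate, and step count are chosen purely for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fit y = w * x by gradient descent on the mean squared error loss:
//   L(w)   = (1/n) * sum_i (w * x_i - y_i)^2
//   dL/dw  = (2/n) * sum_i (w * x_i - y_i) * x_i
double FitByGradientDescent(const std::vector<double>& x,
                            const std::vector<double>& y,
                            double learning_rate, int steps) {
  double w = 0.0;  // initial parameter value
  for (int s = 0; s < steps; ++s) {
    double grad = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
      grad += 2.0 * (w * x[i] - y[i]) * x[i];
    }
    grad /= static_cast<double>(x.size());
    // Move against the gradient; the learning rate hyperparameter
    // controls the step size and, therefore, convergence.
    w -= learning_rate * grad;
  }
  return w;
}
```

For the points (1, 3), (2, 6), (3, 9), generated from y = 3x, the procedure converges to w = 3. A learning rate that is too large would make it diverge instead, which is one reason this hyperparameter needs tuning.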
The concepts of linear algebra are essential for understanding the theory behind ML because they help us understand how ML algorithms work under the hood. Also, most ML algorithm definitions use linear algebra terms.
Linear algebra is not only a handy mathematical instrument; its concepts can also be implemented very efficiently on modern computer architectures. The rise of ML, and especially deep learning, began after significant performance improvements in modern Graphics Processing Units (GPUs). GPUs were initially designed to work with linear algebra concepts and the massive parallel computations used in computer games. After that, special frameworks were created for programming GPUs for general computations; examples of such frameworks are CUDA and OpenCL, and an example of a specialized GPU linear algebra library is cuBLAS. Moreover, it became common to use general-purpose graphics processing units (GPGPUs) because these turn the computational power of a modern GPU into a powerful general-purpose computing resource.
Also, Central Processing Units (CPUs) have instruction sets specially designed for simultaneous numerical computations. Such computations are called vectorized, and common vectorized instruction sets are AVX, SSE, and MMX. These instruction sets are also referred to as Single Instruction, Multiple Data (SIMD) instruction sets. Many numeric linear algebra libraries, such as Eigen, xtensor, ViennaCL, and others, use them to improve computational performance.
Linear algebra is a big area. It is the section of algebra that studies objects of a linear nature: vector (or linear) spaces, linear representations, and systems of linear equations. The main tools used in linear algebra are determinants, matrices, conjugation, and tensor calculus.
To understand ML algorithms, we only need a small set of linear algebra concepts. However, to do research on new ML algorithms, a practitioner should have a deep understanding of linear algebra and calculus.
The following list contains the most valuable linear algebra concepts for understanding ML algorithms:
Scalar: This is a single number.

Vector: This is an array of ordered numbers. Each element has a distinct index. The notation for vectors is a bold lowercase typeface for names and an italic typeface with a subscript for elements.

Matrix: This is a two-dimensional array of numbers. Each element has a distinct pair of indices. The notation for matrices is a bold uppercase typeface for names and an italic (but not bold) typeface with a comma-separated list of indices in subscript for elements.

Tensor: This is an array of numbers arranged in a multidimensional regular grid, and it represents a generalization of a matrix. It is like a multidimensional matrix. For example, a tensor A with dimensions 2 x 2 x 2 contains eight elements, each addressed by three indices.
Linear algebra libraries and ML frameworks usually use the concept of a tensor instead of a matrix because they implement general algorithms, and a matrix is just a special case of a tensor with two dimensions. Also, we can consider a vector as a matrix of size n x 1.
We can represent tensor objects in computer memory in different ways. The most obvious method is a simple linear array in computer memory (random-access memory, or RAM). Moreover, the linear array is also the most computationally effective data structure for modern CPUs. There are two standard practices for organizing tensors with a linear array in memory: row-major ordering and column-major ordering. In row-major ordering, we place consecutive elements of a row in linear order one after the other, and each row is also placed after the end of the previous one. In column-major ordering, we do the same but with the column elements. Data layouts have a significant impact on computational performance because the speed of traversing an array relies on modern CPU architectures, which work with sequential data more efficiently than with non-sequential data. CPU caching effects are the reason for this behavior. Also, a contiguous data layout makes it possible to use SIMD vectorized instructions that work with sequential data more efficiently, and we can use them as a type of parallel processing.
Different libraries, even in the same programming language, can use different ordering. For example, Eigen uses column-major ordering, but PyTorch uses row-major ordering. So, developers should be aware of internal tensor representation in libraries they use, and also take care of this when performing data loading or implementing algorithms from scratch.
Consider a 2 x 3 matrix with the elements a11, a12, a13 in the first row and a21, a22, a23 in the second row. In the row-major data layout, the members of the matrix will have the following layout in memory: a11, a12, a13, a21, a22, a23. In the case of the column-major data layout, the order will be as follows: a11, a21, a12, a22, a13, a23.
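The two layouts differ only in how a pair of indices is mapped to an offset in the flat array; for a matrix with given numbers of rows and columns, the mapping can be sketched as follows:

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (row, col) in a flat array for row-major ordering:
// rows are stored one after another, so we skip `row` full rows first.
std::size_t RowMajorIndex(std::size_t row, std::size_t col,
                          std::size_t cols) {
  return row * cols + col;
}

// Offset for column-major ordering: columns are stored one after another,
// so we skip `col` full columns first.
std::size_t ColMajorIndex(std::size_t row, std::size_t col,
                          std::size_t rows) {
  return col * rows + row;
}
```

For a 2 x 3 matrix, the element in row 0, column 1 sits at offset 1 in the row-major layout but at offset 2 in the column-major layout, which is why the loop order used to traverse a matrix matters for cache-friendly performance.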
Let's consider some C++ linear algebra Application Programming Interfaces (APIs), and look at how we can use them to create linear algebra primitives and perform algebra operations with them.
