C++ can make your machine learning models run faster and more efficiently. This handy guide will help you learn the fundamentals of machine learning (ML), showing you how to use C++ libraries to get the most out of your data. The book's example-based approach makes machine learning with C++ easy for beginners, demonstrating how to implement supervised and unsupervised ML algorithms through real-world examples.
This book will get you hands-on with tuning and optimizing a model for different use cases, assisting you with model selection and the measurement of performance. You’ll cover techniques such as product recommendations, ensemble learning, and anomaly detection using modern C++ libraries such as PyTorch C++ API, Caffe2, Shogun, Shark-ML, mlpack, and dlib. Next, you’ll explore neural networks and deep learning using examples such as image classification and sentiment analysis, which will help you solve various problems. Later, you’ll learn how to handle production and deployment challenges on mobile and cloud platforms, before discovering how to export and import models using the ONNX format.
By the end of this C++ book, you will have real-world machine learning and C++ knowledge, as well as the skills to use C++ to build powerful ML systems.
Page count: 609
Year of publication: 2020
Copyright © 2020 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Yogesh Deokar
Content Development Editor: Sean Lobo
Senior Editor: Roshan Kumar
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Language Support Editors: Jack Cummings and Martin Whittemore
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Production Designer: Aparna Bhagat
First published: May 2020
Production reference: 1140520
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78995-533-0
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Kirill Kolodiazhnyi is a seasoned software engineer with expertise in custom software development. He has several years of experience in building machine learning models and data products using C++. He holds a bachelor's degree in computer science from Kharkiv National University of Radio Electronics. He currently works in Kharkiv, Ukraine, where he lives with his wife and daughter.
Davor Lozić is a university lecturer living in Croatia. He likes working on algorithmic/mathematical problems, and reviewing books for Packt makes him read new IT books. He has also worked on Data Analysis with R – Second Edition; Mastering Predictive Analytics with R, Second Edition; R Data Analysis Cookbook, Second Edition; R Deep Learning Projects; Mastering Linux Network Administration; R Machine Learning Projects; Learning Ext JS, Fourth Edition; and R Statistics Cookbook. Davor is a meme master and an Uno master, and he likes cats.
Dr. Ashwin Nanjappa works at NVIDIA on deep learning inference acceleration on GPUs. He has a Ph.D. from the National University of Singapore, where he invented the fastest 3D Delaunay computational geometry algorithms for GPUs. He was a postdoctoral research fellow at the BioInformatics Institute (Singapore), inventing machine learning algorithms for hand and rodent pose estimation using depth cameras. He also worked at Visenze (Singapore) developing computer vision deep learning models for the largest e-commerce portals in the world. He is a published author of two books: Caffe2 Quick Start Guide and Instant GLEW.
Ryan Riley has been involved in the futures and derivatives industry for almost 20 years. He received a bachelor's degree and a master's degree from DePaul University in applied statistics. Doing his coursework in math meant that he had to teach himself how to program, forcing him to read more technical books on programming than he would otherwise have done. Ryan has worked with numerous AI libraries in various languages and is currently using the Caffe2 C++ library to develop and implement futures and derivatives trading strategies at PNT Financial.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Machine Learning with C++
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Overview of Machine Learning
Introduction to Machine Learning with C++
Understanding the fundamentals of ML
Venturing into the techniques of ML
Supervised learning
Unsupervised learning
Dealing with ML models
Model parameter estimation
An overview of linear algebra
Learning the concepts of linear algebra
Basic linear algebra operations
Tensor representation in computing
Linear algebra API samples
Using Eigen
Using xtensor
Using Shark-ML
Using Dlib
An overview of linear regression
Solving linear regression tasks with different libraries
Solving linear regression tasks with Eigen
Solving linear regression tasks with Shogun
Solving linear regression tasks with Shark-ML
Linear regression with Dlib
Summary
Further reading
Data Processing
Technical requirements
Parsing data formats to C++ data structures
Reading CSV files with the Fast-CPP-CSV-Parser library
Preprocessing CSV files
Reading CSV files with the Shark-ML library
Reading CSV files with the Shogun library
Reading CSV files with the Dlib library
Reading JSON files with the RapidJSON library
Writing and reading HDF5 files with the HighFive library
Initializing matrix and tensor objects from C++ data structures
Eigen
Shark-ML
Dlib
Shogun
Manipulating images with the OpenCV and Dlib libraries
Using OpenCV
Using Dlib
Transforming images into matrix or tensor objects of various libraries
Deinterleaving in OpenCV
Deinterleaving in Dlib
Normalizing data
Normalizing with Eigen
Normalizing with Shogun
Normalizing with Dlib
Normalizing with Shark-ML
Summary
Further reading
Measuring Performance and Selecting Models
Technical requirements
Performance metrics for ML models
Regression metrics
Mean squared error and root mean squared error
Mean absolute error
R squared
Adjusted R squared
Classification metrics
Accuracy
Precision and recall
F-score
AUC–ROC
Log-Loss
Understanding the bias and variance characteristics
Bias
Variance
Normal training
Regularization
L1 regularization – Lasso
L2 regularization – Ridge
Data augmentation
Early stopping
Regularization for neural networks
Model selection with the grid search technique
Cross-validation
K-fold cross-validation
Grid search
Shogun example
Shark-ML example
Dlib example
Summary
Further reading
Section 2: Machine Learning Algorithms
Clustering
Technical requirements
Measuring distance in clustering
Euclidean distance
Squared Euclidean distance
Manhattan distance
Chebyshev distance
Types of clustering algorithms
Partition-based clustering algorithms
Distance-based clustering algorithms
Graph theory-based clustering algorithms
Spectral clustering algorithms
Hierarchical clustering algorithms
Density-based clustering algorithms
Model-based clustering algorithms
Examples of using the Shogun library for dealing with the clustering task samples
GMM with Shogun
K-means clustering with Shogun
Hierarchical clustering with Shogun
Examples of using the Shark-ML library for dealing with the clustering task samples
Hierarchical clustering with Shark-ML
K-means clustering with Shark-ML
Examples of using the Dlib library for dealing with the clustering task samples
K-means clustering with Dlib
Spectral clustering with Dlib
Hierarchical clustering with Dlib
Newman modularity-based graph clustering algorithm with Dlib
Chinese Whispers – graph clustering algorithm with Dlib
Plotting data with C++
Summary
Further reading
Anomaly Detection
Technical requirements
Exploring the applications of anomaly detection
Learning approaches for anomaly detection
Detecting anomalies with statistical tests
Detecting anomalies with the Local Outlier Factor method
Detecting anomalies with isolation forest
Detecting anomalies with One-Class SVM (OCSVM)
Density estimation approach (multivariate Gaussian distribution) for anomaly detection
Examples of using different C++ libraries for anomaly detection
C++ implementation of the isolation forest algorithm for anomaly detection
Using the Dlib library for anomaly detection
One-Class SVM with Dlib
Multivariate Gaussian model with Dlib
OCSVM with Shogun
OCSVM with Shark-ML
Summary
Further reading
Dimensionality Reduction
Technical requirements
An overview of dimension reduction methods
Feature selection methods
Dimensionality reduction methods
Exploring linear methods for dimension reduction
Principal component analysis
Singular value decomposition
Independent component analysis
Linear discriminant analysis
Factor analysis
Multidimensional scaling
Exploring non-linear methods for dimension reduction
Kernel PCA
IsoMap
Sammon mapping
Distributed stochastic neighbor embedding
Autoencoders
Understanding dimension reduction algorithms with various C++ libraries
Using the Dlib library
PCA
Data compression with PCA
LDA
Sammon mapping
Using the Shogun library
PCA
Kernel PCA
MDS
IsoMap
ICA
Factor analysis
t-SNE
Using the Shark-ML library
PCA
LDA
Summary
Further reading
Classification
Technical requirements
An overview of classification methods
Exploring various classification methods
Logistic regression
KRR
SVM
kNN method
Multi-class classification
Examples of using C++ libraries for dealing with the classification task
Using the Shogun library
With logistic regression
With SVMs
With the kNN algorithm
Using the Dlib library
With KRR
With SVM
Using the Shark-ML library
With logistic regression
With SVM
With the kNN algorithm
Summary
Further reading
Recommender Systems
Technical requirements
An overview of recommender system algorithms
Non-personalized recommendations
Content-based recommendations
User-based collaborative filtering
Item-based collaborative filtering
Factorization algorithms
Similarity or preferences correlation
Pearson's correlation coefficient
Spearman's correlation
Cosine distance
Data scaling and standardization
Cold start problem
Relevance of recommendations
Assessing system quality
Understanding collaborative filtering method details
Examples of item-based collaborative filtering with C++
Using the Eigen library
Using the mlpack library
Summary
Further reading
Ensemble Learning
Technical requirements
An overview of ensemble learning
Using a bagging approach for creating ensembles
Using a gradient boosting method for creating ensembles
Using a stacking approach for creating ensembles
Using the random forest method for creating ensembles
Decision tree algorithm overview
Random forest method overview
Examples of using C++ libraries for creating ensembles
Ensembles with Shogun
Using gradient boosting with Shogun
Using random forest with Shogun
Ensembles with Shark-ML
Using random forest with Shark-ML
Using a stacking ensemble with Shark-ML
Summary
Further reading
Section 3: Advanced Examples
Neural Networks for Image Classification
Technical requirements
An overview of neural networks
Neurons
The perceptron and neural networks
Training with the backpropagation method
Backpropagation method modes
Stochastic mode
Batch mode
Mini-batch mode
Backpropagation method problems
The backpropagation method – an example
Loss functions
Activation functions
The stepwise activation function
The linear activation function
The sigmoid activation function
The hyperbolic tangent
Activation function properties
Regularization in neural networks
Different methods for regularization
Neural network initialization
Xavier initialization method
He initialization method
Delving into convolutional networks
Convolution operator
Pooling operation
Receptive field
Convolution network architecture
What is deep learning?
Examples of using C++ libraries to create neural networks
Simple network example for the regression task
Dlib
Shogun
Shark-ML
Architecture definition
Loss function definition
Network initialization
Optimizer configuration
Network training
The complete programming sample
Understanding image classification using the LeNet architecture
Reading the training dataset
Reading dataset files
Reading the image file
Neural network definition
Network training
Summary
Further reading
Sentiment Analysis with Recurrent Neural Networks
Technical requirements
An overview of the RNN concept
Training RNNs using the concept of backpropagation through time
Exploring RNN architectures
LSTM
GRUs
Bidirectional RNN
Multilayer RNN
Understanding natural language processing with RNNs
Word2Vec
GloVe
Sentiment analysis example with an RNN
Summary
Further reading
Section 4: Production and Deployment Challenges
Exporting and Importing Models
Technical requirements
ML model serialization APIs in C++ libraries
Model serialization with Dlib
Model serialization with Shogun
Model serialization with Shark-ML
Model serialization with PyTorch
Neural network initialization
Using the torch::save and torch::load functions
Using PyTorch archive objects
Delving into ONNX format
Loading images into Caffe2 tensors
Reading the class definition file
Summary
Further reading
Deploying Models on Mobile and Cloud Platforms
Technical requirements
Image classification on Android mobile
The mobile version of the PyTorch framework
Using TorchScript for a model snapshot
The Android Studio project
The UI and Java part of the project
The C++ native part of the project
Machine learning in the cloud – using Google Compute Engine
The server
The client
Service deployment
Summary
Further reading
Other Books You May Enjoy
Leave a review - let other readers know what you think
Machine learning (ML) is a popular approach to solve different kinds of problems. ML allows you to deal with various tasks without knowing a direct algorithm to solve them. The key feature of ML algorithms is their ability to learn solutions by using a set of training samples, or even without them. Nowadays, ML is a widespread approach used in various areas of industry. Examples of areas where ML outperforms classical direct algorithms include computer vision, natural language processing, and recommender systems.
This book is a handy guide to help you learn the fundamentals of ML, showing you how to use C++ libraries to get the most out of data. C++ can make your ML models run faster and more efficiently compared to other approaches that use interpreted languages, such as Python. Also, C++ allows you to significantly reduce the negative performance impact of data conversion between different languages used in the ML model because you have direct access to core algorithms and raw data.
You will find this book useful if you want to get started with ML algorithms and techniques using the widespread C++ language. This book also appeals to data analysts, data scientists, and ML developers who are looking to implement different ML models in production using native development toolsets such as the GCC or Clang ecosystems. Working knowledge of the C++ programming language is mandatory to get started with this book.
Hands-On Machine Learning with C++'s example-based approach will show you how to implement supervised and unsupervised ML algorithms with the help of real-world examples. The book also gives you hands-on experience of tuning and optimizing a model for different use cases, helping you with performance measurement and model selection. You'll then cover techniques such as object classification and clustering, product recommendations, ensemble learning, and anomaly detection using modern C++ libraries such as the PyTorch C++ API, Caffe2, Shogun, Shark-ML, mlpack, and dlib. Moving ahead, the chapters will take you through neural networks and deep learning using examples such as image classification and sentiment analysis, which will help you solve a wide range of problems.
Later, you'll learn how to handle production and deployment challenges on mobile and cloud platforms, before discovering how to export and import models using the ONNX format. By the end of this book, you'll have learned how to leverage C++ to build powerful ML systems.
Chapter 1, Introduction to Machine Learning with C++, will guide you through the necessary fundamentals of ML, including linear algebra concepts, ML algorithm types, and their building blocks.
Chapter 2, Data Processing, will show you how to load data from different file formats for ML model training and how to initialize dataset objects in various C++ libraries.
Chapter 3, Measuring Performance and Selecting Models, will show you how to measure the performance of various types of ML models, how to select the best set of hyperparameters to achieve better model performance, and how to use the grid search method in various C++ libraries for model selection.
Chapter 4, Clustering, will discuss algorithms for grouping objects by their essential characteristics, show why we usually use unsupervised algorithms for solving such types of tasks, and lastly, will outline the various types of clustering algorithms, along with their implementations and usage in different C++ libraries.
Chapter 5, Anomaly Detection, will discuss the basics of anomaly and novelty detection tasks and guide you through the different types of anomaly detection algorithms, their implementation, and their usage in various C++ libraries.
Chapter 6, Dimensionality Reduction, will discuss various algorithms for dimensionality reduction that preserve the essential characteristics of data, along with their implementation and usage in various C++ libraries.
Chapter 7, Classification, will show you what a classification task is and how it differs from a clustering task. You will be guided through various classification algorithms, their implementation, and their usage in various C++ libraries.
Chapter 8, Recommender Systems, will give you familiarity with recommender system concepts. You will be shown the different approaches to deal with recommendation tasks, and you will see how to solve such types of tasks using the C++ language.
Chapter 9, Ensemble Learning, will discuss various methods of combining several ML models to get better accuracy and to deal with learning problems. You will encounter ensemble implementations with the usage of different C++ libraries.
Chapter 10, Neural Networks for Image Classification, will give you familiarity with the fundamentals of artificial neural networks. You will encounter the essential building blocks, the required math concepts, and learning algorithms. You will be guided through different C++ libraries that provide functionality for neural network implementations. Also, this chapter will show you the implementation of a deep convolutional network for image classification with the PyTorch library.
Chapter 11, Sentiment Analysis with Recurrent Neural Networks, will guide you through the fundamentals of recurrent neural networks. You will learn about the different types of network cells, the required math concepts, and the differences of this learning algorithm compared to feedforward networks. Also, in this chapter, we will develop a recurrent neural network for sentiment analysis with the PyTorch library.
Chapter 12, Exporting and Importing Models, will show you how to save and load model parameters and architectures using various C++ libraries. Also, you will see how to use the ONNX format to load and use a pre-trained model with the C++ API of the Caffe2 library.
Chapter 13, Deploying Models on Mobile and Cloud Platforms, will guide you through the development of applications for image classification using neural networks for the Android and Google Compute Engine platforms.
To be able to compile and run the examples included in this book, you will need to configure a particular development environment. All code examples have been tested with the Arch and Ubuntu 18.04 Linux distributions. The following list outlines the packages you'll need to install on the Ubuntu platform:
build-essential
unzip
git
cmake
cmake-curses-gui
python
python-pip
libblas-dev
libopenblas-dev
libatlas-base-dev
liblapack-dev
libboost-all-dev
libopencv-core3.2
libopencv-imgproc3.2
libopencv-dev
libopencv-highgui3.2
libopencv-highgui-dev
protobuf-compiler
libprotobuf-dev
libhdf5-dev
libjson-c-dev
libx11-dev
openjdk-8-jdk
wget
ninja-build
Also, you need to install the following additional packages for Python:
pyyaml
typing
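The two lists above can be installed in one pass on Ubuntu 18.04 with commands along the following lines. This is a sketch, not an exact script from the book: the package names are taken verbatim from the lists, and exact names may differ on other Ubuntu releases.

```shell
# Install the Ubuntu packages listed above (names as given; Ubuntu 18.04).
sudo apt-get update
sudo apt-get install -y \
    build-essential unzip git cmake cmake-curses-gui python python-pip \
    libblas-dev libopenblas-dev libatlas-base-dev liblapack-dev \
    libboost-all-dev libopencv-core3.2 libopencv-imgproc3.2 libopencv-dev \
    libopencv-highgui3.2 libopencv-highgui-dev protobuf-compiler \
    libprotobuf-dev libhdf5-dev libjson-c-dev libx11-dev openjdk-8-jdk \
    wget ninja-build

# Install the additional Python packages listed above.
python -m pip install --user pyyaml typing
```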
Besides the development environment, you'll have to check out the source code of the requisite third-party libraries and build them. Most of these libraries are actively developed and don't have strict releases, so it's easier to check out a particular commit from the development tree and build it than to download the latest official release. The following table shows the libraries you have to check out, their repository URLs, and the hash of the commit to check out:
Library repository | Branch name | Commit
https://github.com/shogun-toolbox/shogun | master | f7255cf2cc6b5116e50840816d70d21e7cc039bb
https://github.com/Shark-ML/Shark | master | 221c1f2e8abfffadbf3c5ef7cf324bc6dc9b4315
https://gitlab.com/conradsnicta/armadillo-code | 9.500.x | 442d52ba052115b32035a6e7dc6587bb6a462dec
https://github.com/davisking/dlib | v19.15 | 929c630b381d444bbf5d7aa622e3decc7785ddb2
https://github.com/eigenteam/eigen-git-mirror | 3.3.7 | cf794d3b741a6278df169e58461f8529f43bce5d
https://github.com/mlpack/mlpack | master | e2f696cfd5b7ccda2d3af1c7c728483ea6591718
https://github.com/Kolkir/plotcpp | master | c86bd4f5d9029986f0d5f368450d79f0dd32c7e4
https://github.com/pytorch/pytorch | v1.2.0 | 8554416a199c4cec01c60c7015d8301d2bb39b64
https://github.com/xtensor-stack/xtensor | master | 02d8039a58828db1ffdd2c60fb9b378131c295a2
https://github.com/xtensor-stack/xtensor-blas | master | 89d9df93ff7306c32997e8bb8b1ff02534d7df2e
https://github.com/xtensor-stack/xtl | master | 03a6827c9e402736506f3ded754e890b3ea28a98
https://github.com/opencv/opencv_contrib/releases/tag/3.3.0 | 3.3.0 | (release tag; no commit pinned)
https://github.com/ben-strasser/fast-cpp-csv-parser | master | 3b439a664090681931c6ace78dcedac6d3a3907e
https://github.com/Tencent/rapidjson | master | 73063f5002612c6bf64fe24f851cd5cc0d83eef9
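The checkout procedure is the same for every row of the table: clone the repository and check out the listed commit hash. The following self-contained sketch demonstrates the pinning operation on a throwaway local repository (the /tmp/pin-demo path and file contents are illustrative only); in practice, you substitute the repository URL, branch, and commit hash from the table.

```shell
# Demonstration of pinning a working tree to an exact commit hash --
# the operation you perform for each library in the table above.
set -e
rm -rf /tmp/pin-demo && mkdir -p /tmp/pin-demo && cd /tmp/pin-demo
git init -q repo && cd repo
git config user.email dev@example.com
git config user.name dev
echo v1 > file.txt && git add file.txt && git commit -qm "first"
first=$(git rev-parse HEAD)            # the hash you would pin to
echo v2 > file.txt && git commit -qam "second"
git checkout -q "$first"               # detached HEAD at the pinned commit
cat file.txt                           # prints: v1
```

With a real library, the equivalent is `git clone <repository URL>`, followed by `git checkout <commit hash>` inside the clone.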
Also, for the last chapter, you'll have to install the Android Studio IDE. You can download it from the official site at https://developer.android.com/studio. Besides the IDE, you'll also need to install and configure the Android SDK. The respective example in this book was developed and tested with this SDK, which can be downloaded from https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip. To configure this SDK, you have to unzip it and install particular packages. The following script shows how to do it:
mkdir /android
cd /android
wget https://dl.google.com/android/repository/sdk-tools-linux-4333796.zip
unzip sdk-tools-linux-4333796.zip
yes | ./tools/bin/sdkmanager --licenses
yes | ./tools/bin/sdkmanager "platform-tools"
yes | ./tools/bin/sdkmanager "platforms;android-25"
yes | ./tools/bin/sdkmanager "build-tools;25.0.2"
yes | ./tools/bin/sdkmanager "system-images;android-25;google_apis;armeabi-v7a"
yes | ./tools/bin/sdkmanager --install "ndk;20.0.5594570"
export ANDROID_NDK=/android/ndk/20.0.5594570
export ANDROID_ABI='armeabi-v7a'
Another way to configure the development environment is through the use of Docker. Docker allows you to configure a lightweight, isolated container with particular components. You can install Docker from the official Ubuntu package repository. Then, use the scripts provided with this book to automatically configure the environment. You will find the docker folder in the examples package. The following steps show how to use the Docker configuration scripts:
Run the following commands to create the image, run it, and configure the environment:
cd docker
docker build -t buildenv:1.0 .
docker run -it buildenv:1.0 bash
cd /development
./install_env.sh
./install_android.sh
exit
Use the following command to save our Docker container with the configured libraries and packages into a new Docker image:
docker commit [container id]
Use the following command to rename the updated Docker image:
docker tag [image id] [new name]
Use the following command to start a new Docker container and share the book examples sources to it:
docker run -it -v [host_examples_path]:[container_examples_path] [tag name] bash
After running the preceding command, you will be in the command-line environment with the necessary configured packages, compiled third-party libraries, and with access to the programming examples package. You can use this environment to compile and run the code examples in this book. Each programming example is configured to use the CMake build system so you will be able to build them all in the same way. The following script shows a possible scenario of building a code example:
cd [example folder name]
mkdir build
cd build
cmake ..
cmake --build . --target all
Also, you can configure your local machine environment to share X Server with a Docker container to be able to run graphical UI applications from this container. It will allow you to use, for example, the Android Studio IDE or a C++ IDE (such as Qt Creator) from the Docker container, without local installation. The following script shows how to do this:
xhost +local:root
docker run --net=host -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -it -v [host_examples_path]:[container_examples_path] [tag name] bash
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the following section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
To be more comfortable with understanding and building the code examples, we recommend you carefully read the documentation for each third-party library, and take some time to learn the basics of the Docker system and of development for the Android platform. Also, we assume that you have sufficient working knowledge of the C++ language and compilers, and that you are familiar with the CMake build system.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Machine-Learning-with-CPP. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781789955330_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "We downloaded a pre-trained model with the torch.hub.load() function."
A block of code is set as follows:
class Network {
 public:
  Network(const std::string& snapshot_path,
          const std::string& synset_path,
          torch::DeviceType device_type);

  std::string Classify(const at::Tensor& image);

 private:
  torch::DeviceType device_type_;
  Classes classes_;
  torch::jit::script::Module model_;
};
Any command-line input or output is written as follows:
cd ~/[DEST_PATH]/server
mkdir build
cd build
cmake .. -DCMAKE_PREFIX_PATH=~/dev/server/third-party/libtorch
cmake --build . --target all
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Start it by clicking the Start button at the top of the page."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this section, we will delve into the basics of machine learning with the help of examples in C++ and various machine learning frameworks. We'll demonstrate how to load data from various file formats and describe model performance measuring techniques and the best model selection approaches.
This section comprises the following chapters:
Chapter 1, Introduction to Machine Learning with C++

Chapter 2, Data Processing

Chapter 3, Measuring Performance and Selecting Models
There are different approaches to making computers solve tasks. One of them is to define an explicit algorithm, and another is to use implicit strategies based on mathematical and statistical methods. Machine Learning (ML) is one of the implicit methods that uses mathematical and statistical approaches to solve tasks. It is an actively growing discipline, and many scientists and researchers consider it one of the best paths toward systems acting as human-level artificial intelligence (AI).
In general, ML approaches are based on the idea of searching for patterns in a given dataset. Consider a recommendation system for a news feed, which provides the user with a personalized feed based on their previous activity or preferences. The software gathers information about the types of news articles the user reads and calculates some statistics; for example, the frequency with which certain topics appear in a set of news articles. Then, it performs some predictive analytics, identifies general patterns, and uses them to populate the user's news feed. Such systems periodically track a user's activity, update the dataset, and calculate new trends for recommendations.
There are many areas where ML has started to play an important role. It is used for solving enterprise business tasks as well as for scientific research. In customer relationship management (CRM) systems, ML models are used to analyze sales team activity, helping them process the most important requests first. ML models are used in business intelligence (BI) and analytics to find essential data points. Human resources (HR) departments use ML models to analyze their employees' characteristics in order to identify the most effective ones and use this information when searching for applicants for open positions.
A fast-growing direction of research is self-driving cars, and deep learning neural networks are used extensively in this area. They are used in computer vision systems for object identification, as well as in the navigation and steering systems that are necessary for driving a car.
Another popular use of ML systems is electronic personal assistants, such as Siri from Apple or Alexa from Amazon. Such products also use deep learning models to analyze natural speech or written text to process users' requests and make a natural response in a relevant context. Such requests can activate music players with preferred songs, as well as update a user's personal schedule or book flight tickets.
This chapter describes what ML is, which tasks can be solved with ML, and the different approaches used in ML. It aims to show the minimum math required to start implementing ML algorithms. It also covers how to perform basic linear algebra operations in libraries such as Eigen, xtensor, Shark-ML, Shogun, and Dlib, and explains the linear regression task as an example.
The following topics will be covered in this chapter:
Understanding the fundamentals of ML
An overview of linear algebra
An overview of a linear regression example
There are different approaches to creating and training ML models. In this section, we show what these approaches are and how they differ. Apart from the approach we use to create an ML model, there are also parameters that control how the model behaves during training and evaluation. Model parameters can be divided into two distinct groups, which should be configured in different ways. The last crucial part of the ML process is the technique we use to train a model. Usually, the training technique uses a numerical optimization algorithm that finds the minimum value of a target function. In ML, the target function is usually called a loss function, and it is used to penalize the training algorithm when it makes errors. We discuss these concepts in more detail in the following sections.
We can divide ML approaches into two techniques, as follows:
Supervised learning is an approach based on the use of labeled data. Labeled data is a set of known data samples with corresponding known target outputs. Such a kind of data is used to build a model that can predict future outputs.
Unsupervised learning is an approach that does not require labeled data and can search for hidden patterns and structures in an arbitrary kind of data.
Let's have a look at each of the techniques in detail.
Supervised ML algorithms usually take a limited set of labeled data and build models that can make reasonable predictions for new data. We can split supervised learning algorithms into two main parts, classification and regression techniques, described as follows:
Classification models predict some finite and distinct types of categories. This could be a label that identifies whether an email is spam or not, or whether an image contains a human face or not. Classification models are applied in speech and text recognition, object identification in images, credit scoring, and others. Typical algorithms for creating classification models are Support Vector Machine (SVM), decision tree approaches, k-nearest neighbors (KNN), logistic regression, Naive Bayes, and neural networks. The following chapters describe the details of some of these algorithms.
Regression models predict continuous responses such as changes in temperature or values of currency exchange rates. Regression models are applied in algorithmic trading, forecasting of electricity load, revenue prediction, and others. Creating a regression model usually makes sense if the outputs of the given labeled data are real numbers. Typical algorithms for creating regression models are linear and multivariate regressions, polynomial regression models, and stepwise regressions. We can use decision tree techniques and neural networks to create regression models too. The following chapters describe the details of some of these algorithms.
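As a concrete illustration of the regression idea, the following is a minimal sketch that fits a line y = w * x + b to a few made-up data points with the closed-form least-squares solution, using only the C++ standard library rather than any of the ML libraries discussed in this book:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Fit y = w * x + b by ordinary least squares (the closed-form solution
// for simple linear regression with a single feature).
std::pair<double, double> FitLine(const std::vector<double>& x,
                                  const std::vector<double>& y) {
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    mean_x += x[i];
    mean_y += y[i];
  }
  mean_x /= x.size();
  mean_y /= y.size();
  // Slope w = covariance(x, y) / variance(x); intercept b = mean_y - w * mean_x.
  double num = 0.0, den = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    num += (x[i] - mean_x) * (y[i] - mean_y);
    den += (x[i] - mean_x) * (x[i] - mean_x);
  }
  double w = num / den;
  double b = mean_y - w * mean_x;
  return {w, b};
}
```

For the points (0, 1), (1, 3), (2, 5), (3, 7), (4, 9), generated from y = 2x + 1, FitLine recovers w = 2 and b = 1.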
Unsupervised learning algorithms do not use labeled datasets. They create models that use intrinsic relations in data to find hidden patterns that they can use for making predictions. The most well-known unsupervised learning technique is clustering. Clustering involves dividing a given set of data into a limited number of groups according to some intrinsic properties of the data items. Clustering is applied in market research, different types of exploratory analysis, deoxyribonucleic acid (DNA) analysis, image segmentation, and object detection. Typical algorithms for creating models for performing clustering are k-means, k-medoids, Gaussian mixture models, hierarchical clustering, and hidden Markov models. Some of these algorithms are explained in the following chapters of this book.
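To make the clustering idea concrete, here is a minimal sketch of the k-means algorithm on one-dimensional data, using only the standard library. The points and initial centroids below are made up, and a production implementation would also handle multidimensional data, empty clusters, and centroid initialization:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// A minimal one-dimensional k-means: repeat the assignment step (attach
// every point to its nearest centroid) and the update step (move every
// centroid to the mean of its points) until assignments stop changing.
std::vector<double> KMeans1D(const std::vector<double>& points,
                             std::vector<double> centroids) {
  std::vector<int> labels(points.size(), -1);
  bool changed = true;
  while (changed) {
    changed = false;
    // Assignment step: label each point with its nearest centroid.
    for (std::size_t i = 0; i < points.size(); ++i) {
      int best = 0;
      double best_dist = std::numeric_limits<double>::max();
      for (std::size_t c = 0; c < centroids.size(); ++c) {
        double d = std::abs(points[i] - centroids[c]);
        if (d < best_dist) {
          best_dist = d;
          best = static_cast<int>(c);
        }
      }
      if (labels[i] != best) {
        labels[i] = best;
        changed = true;
      }
    }
    // Update step: move each centroid to the mean of its assigned points.
    for (std::size_t c = 0; c < centroids.size(); ++c) {
      double sum = 0.0;
      int count = 0;
      for (std::size_t i = 0; i < points.size(); ++i) {
        if (labels[i] == static_cast<int>(c)) {
          sum += points[i];
          ++count;
        }
      }
      if (count > 0) {
        centroids[c] = sum / count;
      }
    }
  }
  return centroids;
}
```

For the points {1, 2, 3, 10, 11, 12} and initial centroids {0, 5}, the algorithm converges to the centroids 2 and 11, splitting the data into its two obvious groups without ever seeing a label.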
We can interpret ML models as functions that take different types of parameters. Such functions provide outputs for given inputs based on the values of these parameters. Developers can configure the behavior of ML models for solving problems by adjusting model parameters. Training an ML model can usually be treated as a process of searching for the best combination of its parameters. We can split an ML model's parameters into two types. The first type consists of parameters internal to the model, and we can estimate their values from the training (input) data. The second type consists of parameters external to the model, and we cannot estimate their values from training data. Parameters that are external to the model are usually called hyperparameters.
Internal parameters have the following characteristics:
They are necessary for making predictions.
They define the quality of the model on the given problem.
We can learn them from training data.
Usually, they are a part of the model.
If the model contains a fixed number of internal parameters, it is called parametric. Otherwise, we can classify it as non-parametric.
Examples of internal parameters are as follows:
Weights of artificial neural networks (ANNs)

Support vector values for SVM models

Polynomial coefficients for linear regression or logistic regression
On the other hand, hyperparameters have the following characteristics:
They are used to configure algorithms that estimate model parameters.
The practitioner usually specifies them.
Their estimation is often based on using heuristics.
They are specific to a concrete modeling problem.
It is hard to know the best values for a model's hyperparameters for a specific problem. Also, practitioners usually need to perform additional research on how to tune required hyperparameters so that a model or a training algorithm behaves in the best way. Practitioners use rules of thumb, copying values from similar projects, as well as special techniques such as grid search for hyperparameter estimation.
Examples of hyperparameters are as follows:
The C and sigma parameters used in the SVM algorithm to configure classification quality
The learning rate parameter that is used in the neural network training process to configure algorithm convergence
The k value that is used in the KNN algorithm to configure the number of neighbors
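The grid search technique mentioned earlier can be sketched with the k value of KNN as the hyperparameter being tuned. The toy one-dimensional classifier and data below are made up for illustration; a real project would use a library implementation and proper cross-validation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Classify one query point with k-nearest neighbors on one-dimensional
// data; labels are 0 or 1 and the prediction is a majority vote.
int KnnPredict(const std::vector<double>& train_x,
               const std::vector<int>& train_y, double query, int k) {
  std::vector<std::pair<double, int>> neighbors;  // (distance, label)
  for (std::size_t i = 0; i < train_x.size(); ++i) {
    neighbors.emplace_back(std::abs(train_x[i] - query), train_y[i]);
  }
  std::sort(neighbors.begin(), neighbors.end());
  int votes = 0;
  for (int i = 0; i < k; ++i) {
    votes += neighbors[i].second;
  }
  return 2 * votes > k ? 1 : 0;
}

// Grid search: try every candidate k and keep the one with the highest
// accuracy on a held-out validation set.
int GridSearchK(const std::vector<double>& train_x,
                const std::vector<int>& train_y,
                const std::vector<double>& val_x,
                const std::vector<int>& val_y,
                const std::vector<int>& candidates) {
  int best_k = candidates.front();
  int best_correct = -1;
  for (int k : candidates) {
    int correct = 0;
    for (std::size_t i = 0; i < val_x.size(); ++i) {
      if (KnnPredict(train_x, train_y, val_x[i], k) == val_y[i]) {
        ++correct;
      }
    }
    if (correct > best_correct) {
      best_correct = correct;
      best_k = k;
    }
  }
  return best_k;
}
```

Ties are resolved in favor of the first candidate tried, so for well-separated data several k values may score equally well; a more careful comparison would average accuracy over several validation folds.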
Model parameter estimation usually uses some optimization algorithm. The speed and quality of the resulting model can significantly depend on the optimization algorithm chosen. Research on optimization algorithms is a popular topic in industry, as well as in academia. ML often uses optimization techniques and algorithms based on the optimization of a loss function. A function that evaluates how well a model predicts on the data is called a loss function. If predictions are very different from the target outputs, the loss function will return a value that can be interpreted as a bad one, usually a large number. In such a way, the loss function penalizes an optimization algorithm when it moves in the wrong direction. So, the general idea is to minimize the value of the loss function to reduce penalties. There is no one universal loss function for optimization algorithms. Different factors determine how to choose a loss function. Examples of such factors are as follows:
Specifics of the given problem—for example, if it is a regression or a classification model
Ease of calculating derivatives
Percentage of outliers in the dataset
In ML, the term optimizer is used to define an algorithm that connects a loss function and a technique for updating model parameters in response to the values of the loss function. So, optimizers tune ML models to predict target values for new data in the most accurate way by fitting model parameters. There are many optimizers: Gradient Descent, Adagrad, RMSProp, Adam, and others. Moreover, developing new optimizers is an active area of research. For example, there is the ML and Optimization research group at Microsoft (located in Redmond) whose research areas include combinatorial optimization, convex and non-convex optimization, and their application in ML and AI. Other companies in the industry also have similar research groups; there are many publications from Facebook Research, Amazon Research, and OpenAI groups.
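As a minimal sketch of how an optimizer ties a loss function to parameter updates, the following fits a single weight w in the model y = w * x with plain gradient descent on the mean squared error loss. The data, learning rate, and step count are chosen purely for illustration:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fit y = w * x by gradient descent on the mean squared error loss:
//   L(w)   = (1/n) * sum_i (w * x_i - y_i)^2
//   dL/dw  = (2/n) * sum_i (w * x_i - y_i) * x_i
double FitByGradientDescent(const std::vector<double>& x,
                            const std::vector<double>& y,
                            double learning_rate, int steps) {
  double w = 0.0;  // initial parameter value
  for (int s = 0; s < steps; ++s) {
    double grad = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
      grad += 2.0 * (w * x[i] - y[i]) * x[i];
    }
    grad /= static_cast<double>(x.size());
    // Move against the gradient; the learning rate hyperparameter
    // controls the step size and, therefore, convergence.
    w -= learning_rate * grad;
  }
  return w;
}
```

For the points (1, 3), (2, 6), (3, 9), generated from y = 3x, the procedure converges to w = 3. A learning rate that is too large would make it diverge instead, which is one reason this hyperparameter needs tuning.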
The concepts of linear algebra are essential for understanding the theory behind ML because they help us understand how ML algorithms work under the hood. Also, most ML algorithm definitions use linear algebra terms.
Linear algebra is not only a handy mathematical instrument; its concepts can also be implemented very efficiently on modern computer architectures. The rise of ML, and especially deep learning, began after significant performance improvements in modern Graphics Processing Units (GPUs). GPUs were initially designed to work with linear algebra concepts and the massive parallel computations used in computer games. After that, special frameworks were created for programming GPUs for general computations; examples of such frameworks are CUDA and OpenCL, and an example of a specialized GPU linear algebra library is cuBLAS. Moreover, it became common to use general-purpose graphics processing units (GPGPUs) because these turn the computational power of a modern GPU into a powerful general-purpose computing resource.
Also, Central Processing Units (CPUs) have instruction sets specially designed for simultaneous numerical computations. Such computations are called vectorized, and common vectorized instruction sets are AVX, SSE, and MMX. These instruction sets are also referred to as Single Instruction, Multiple Data (SIMD) instruction sets. Many numeric linear algebra libraries, such as Eigen, xtensor, ViennaCL, and others, use them to improve computational performance.
Linear algebra is a big area. It is the section of algebra that studies objects of a linear nature: vector (or linear) spaces, linear representations, and systems of linear equations. The main tools used in linear algebra are determinants, matrices, conjugation, and tensor calculus.
To understand ML algorithms, we only need a small set of linear algebra concepts. However, to do research on new ML algorithms, a practitioner should have a deep understanding of linear algebra and calculus.
The following list contains the most valuable linear algebra concepts for understanding ML algorithms:
Scalar: This is a single number.

Vector: This is an array of ordered numbers. Each element has a distinct index. The notation for vectors is a bold lowercase typeface for names and an italic typeface with a subscript for elements.

Matrix: This is a two-dimensional array of numbers. Each element has a distinct pair of indices. The notation for matrices is a bold uppercase typeface for names and an italic (but not bold) typeface with a comma-separated list of indices in subscript for elements.

Tensor: This is an array of numbers arranged in a multidimensional regular grid, and it represents a generalization of a matrix. It is like a multidimensional matrix. For example, a tensor A with dimensions 2 x 2 x 2 contains eight elements, each addressed by three indices.
Linear algebra libraries and ML frameworks usually use the concept of a tensor instead of a matrix because they implement general algorithms, and a matrix is just a special case of a tensor with two dimensions. Also, we can consider a vector as a matrix of size n x 1.
We can represent tensor objects in computer memory in different ways. The most obvious method is a simple linear array in computer memory (random-access memory, or RAM). Moreover, the linear array is also the most computationally effective data structure for modern CPUs. There are two standard practices for organizing tensors with a linear array in memory: row-major ordering and column-major ordering. In row-major ordering, we place consecutive elements of a row in linear order one after the other, and each row is also placed after the end of the previous one. In column-major ordering, we do the same but with the column elements. Data layouts have a significant impact on computational performance because the speed of traversing an array relies on modern CPU architectures, which work with sequential data more efficiently than with non-sequential data. CPU caching effects are the reason for this behavior. Also, a contiguous data layout makes it possible to use SIMD vectorized instructions that work with sequential data more efficiently, and we can use them as a type of parallel processing.
Different libraries, even in the same programming language, can use different ordering. For example, Eigen uses column-major ordering, but PyTorch uses row-major ordering. So, developers should be aware of internal tensor representation in libraries they use, and also take care of this when performing data loading or implementing algorithms from scratch.
Consider a 2 x 3 matrix with the elements a11, a12, a13 in the first row and a21, a22, a23 in the second row. In the row-major data layout, the members of the matrix will have the following layout in memory: a11, a12, a13, a21, a22, a23. In the case of the column-major data layout, the order will be as follows: a11, a21, a12, a22, a13, a23.
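The two layouts differ only in how a pair of indices is mapped to an offset in the flat array; for a matrix with given numbers of rows and columns, the mapping can be sketched as follows:

```cpp
#include <cassert>
#include <cstddef>

// Offset of element (row, col) in a flat array for row-major ordering:
// rows are stored one after another, so we skip `row` full rows first.
std::size_t RowMajorIndex(std::size_t row, std::size_t col,
                          std::size_t cols) {
  return row * cols + col;
}

// Offset for column-major ordering: columns are stored one after another,
// so we skip `col` full columns first.
std::size_t ColMajorIndex(std::size_t row, std::size_t col,
                          std::size_t rows) {
  return col * rows + row;
}
```

For a 2 x 3 matrix, the element in row 0, column 1 sits at offset 1 in the row-major layout but at offset 2 in the column-major layout, which is why the loop order used to traverse a matrix matters for cache-friendly performance.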
Let's consider some C++ linear algebra Application Programming Interfaces (APIs), and look at how we can use them to create linear algebra primitives and perform algebra operations with them.
