Discover best practices for choosing, building, training, and improving deep learning models using the Keras-R and TensorFlow-R libraries
Key Features
Book Description
Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data. Advanced Deep Learning with R will help you understand popular deep learning architectures and their variants in R, along with providing real-life examples for them.
This deep learning book starts by covering the essential deep learning techniques and concepts for prediction and classification. You will learn about neural networks, deep learning architectures, and the fundamentals for implementing deep learning with R. The book will also take you through using important deep learning libraries, such as Keras-R and TensorFlow-R, to implement deep learning algorithms within applications. You will get up to speed with artificial neural networks, recurrent neural networks, convolutional neural networks, long short-term memory networks, and more using advanced examples. Later, you'll discover how to apply generative adversarial networks (GANs) to generate new images; autoencoder neural networks for image dimension reduction, image denoising, and image correction; and transfer learning to prepare, define, train, and evaluate a deep neural network.
By the end of this book, you will be ready to implement your knowledge and newly acquired skills for applying deep learning algorithms in R through real-world examples.
What you will learn
Who this book is for
This book is for data scientists, machine learning practitioners, deep learning researchers, and AI enthusiasts who want to develop their skills and knowledge to implement deep learning techniques and algorithms using the power of R. A solid understanding of machine learning and a working knowledge of the R programming language are required.
You can read the e-book in Legimi apps or in any app that supports the following format:
Page count: 372
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Sunith Shetty
Acquisition Editor: Reshma Raman
Content Development Editor: Nazia Shaikh
Senior Editor: Ayaan Hoda
Technical Editor: Utkarsha S. Kadam
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Tejal Daruwale Soni
Production Designer: Joshua Misquitta
First published: December 2019
Production reference: 1161219
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-78953-877-9
www.packt.com
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Bharatendra Rai is a chairperson and professor of business analytics, and the director of the Master of Science in Technology Management program at the Charlton College of Business at UMass Dartmouth. He received a Ph.D. in industrial engineering from Wayne State University, Detroit, and a master's in quality, reliability, and OR from the Indian Statistical Institute, India. His current research interests include machine learning and deep learning applications. His deep learning lecture videos on YouTube are watched in over 198 countries. He has over 20 years of consulting and training experience in industries such as software, automotive, electronics, food, and chemicals, in the areas of data science, machine learning, and supply chain management.
Herbert Ssegane is an IT data scientist at Oshkosh Corporation, USA, with extensive experience in machine learning, deep learning, statistical analysis, and environmental modeling. He has worked on multiple projects for The Climate Corporation, Monsanto (now Bayer), Argonne National Laboratory, and the U.S. Forest Service. He holds a Ph.D. in biological and agricultural engineering from the University of Georgia, Athens, USA.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Advanced Deep Learning with R
About Packt
Why subscribe?
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Revisiting Deep Learning Basics
Revisiting Deep Learning Architecture and Techniques
Deep learning with R
Deep learning trend
Versions of key R packages used
Process of developing a deep network model
Preparing the data for a deep network model
Developing a deep learning model architecture
Compiling the model
Fitting the model
Assessing the model performance
Deep learning techniques with R and RStudio
Multi-class classification
Regression problems
Image classification
Convolutional neural networks
Autoencoders
Transfer learning
Generative adversarial networks
Deep network for text classification 
Recurrent neural networks 
Long short-term memory network
Convolutional recurrent networks
Tips, tricks, and best practices
Summary
Section 2: Deep Learning for Prediction and Classification
Deep Neural Networks for Multi-Class Classification
Cardiotocogram dataset
Dataset (medical)
Preparing the data for model building
Normalizing numeric variables
Partitioning the data
One-hot encoding
Creating and fitting a deep neural network model
Developing model architecture
Compiling the model
Fitting the model
Model evaluation and predictions
Loss and accuracy calculation
Confusion matrix
Performance optimization tips and best practices
Experimenting with an additional hidden layer
Experimenting with a higher number of units in the hidden layer
Experimenting using a deeper network with more units in the hidden layer
Experimenting by addressing the class imbalance problem
Saving and reloading a model
Summary
Deep Neural Networks for Regression
Understanding the Boston Housing dataset
Preparing the data
Visualizing the neural network
Data partitioning
Normalization
Creating and fitting a deep neural network model for regression
Calculating the total number of parameters
Compiling the model
Fitting the model
Model evaluation and prediction
Evaluation
Prediction
Improvements
Deeper network architecture
Results
Performance optimization tips and best practices
Log transformation on the output variable
Model performance
Summary
Section 3: Deep Learning for Computer Vision
Image Classification and Recognition
Handling image data
Data preparation
Resizing and reshaping
Training, validation, and test data
One-hot encoding
Creating and fitting the model
Developing the model architecture
Compiling the model
Fitting the model
Model evaluation and prediction
Loss, accuracy, and confusion matrices for training data
Prediction probabilities for training data
Loss, accuracy, and confusion matrices for test data
Prediction probabilities for test data
Performance optimization tips and best practices
Deeper networks
Results
Summary
Image Classification Using Convolutional Neural Networks
Data preparation
Fashion-MNIST data
Train and test data
Reshaping and resizing
One-hot encoding
Layers in the convolutional neural networks
Model architecture and related calculations
Compiling the model
Fitting the model
Accuracy and loss
Model evaluation and prediction
Training data
Test data
20 fashion items from the internet
Performance optimization tips and best practices
Image modification
Changes to the architecture
Summary
Applying Autoencoder Neural Networks Using Keras
Types of autoencoders
Dimension reduction autoencoders
MNIST fashion data
Encoder model
Decoder model
Autoencoder model
Compiling and fitting the model
Reconstructed images
Denoising autoencoders
MNIST data
Data preparation
Adding noise
Encoder model
Decoder model
Autoencoder model
Fitting the model
Image reconstruction
Image correction
Images that need correction
Clean images
Encoder model
Decoder model
Compiling and fitting the model
Reconstructing images from training data
Reconstructing images from new data
Summary
Image Classification for Small Data Using Transfer Learning
Using a pretrained model to identify an image
Reading an image
Preprocessing the input
Top five categories
Working with the CIFAR10 dataset
Sample images
Preprocessing and prediction
Image classification with CNN
Data preparation
CNN model
Model performance
Performance assessment with training data
Performance assessment with test data
Classifying images using the pretrained RESNET50 model
Model architecture
Freezing pretrained network weights
Fitting the model
Model evaluation and prediction
Loss, accuracy, and confusion matrix with the training data
Loss, accuracy, and confusion matrix with the test data
Performance optimization tips and best practices
Experimenting with the adam optimizer
Hyperparameter tuning
Experimenting with VGG16 as a pretrained network
Summary
Creating New Images Using Generative Adversarial Networks
Generative adversarial network overview
Processing MNIST image data
Digit five from the training data
Data processing
Developing the generator network
Network architecture
Summary of the generator network
Developing the discriminator network
Architecture
Summary of the discriminator network
Training the network
Initial setup for saving fake images and loss values
Training process
Reviewing results
Discriminator and GAN losses
Fake images
Performance optimization tips and best practices
Changes in the generator and discriminator network
Impact of these changes on the results
Generating a handwritten image of digit eight
Summary
Section 4: Deep Learning for Natural Language Processing
Deep Networks for Text Classification
Text datasets
The UCI machine learning repository
Text data within Keras
Preparing the data for model building
Tokenization
Converting text into sequences of integers
Padding and truncation
Developing a tweet sentiment classification model
Developing deep neural networks
Obtaining IMDb movie review data
Building a classification model
Compiling the model
Fitting the model
Model evaluation and prediction
Evaluation using training data
Evaluation using test data
Performance optimization tips and best practices
Experimenting with the maximum sequence length and the optimizer
Summary
Text Classification Using Recurrent Neural Networks
Preparing data for model building
Padding sequences
Developing a recurrent neural network model
Calculation of parameters
Compiling the model
Fitting the model
Accuracy and loss
Model evaluation and prediction
Training the data
Testing the data
Performance optimization tips and best practices
Number of units in the simple RNN layer
Using different activation functions in the simple RNN layer
Adding more recurrent layers 
The maximum length for padding sequences
Summary
Text Classification Using a Long Short-Term Memory Network
Why do we use LSTM networks?
Preparing text data for model building
Creating a long short-term memory network model
LSTM network architecture
Compiling the LSTM network model
Fitting the LSTM model
Loss and accuracy plot
Evaluating model performance 
Model evaluation with train data
Model evaluation with test data
Performance optimization tips and best practices
Experimenting with the Adam optimizer
Experimenting with the LSTM network having an additional layer
Experimenting with a bidirectional LSTM layer
Summary
Text Classification Using Convolutional Recurrent Neural Networks
Working with the reuter_50_50 dataset
Reading the training data
Reading the test data
Preparing the data for model building
Tokenization and converting text into a sequence of integers
Changing labels into integers
Padding and truncation of sequences
Data partitioning
One-hot encoding the labels
Developing the model architecture
Compiling and fitting the model
Compiling the model
Fitting the model
Evaluating the model and predicting classes
Model evaluation with training data
Model evaluation with test data
Performance optimization tips and best practices
Experimenting with reduced batch size
Experimenting with batch size, kernel size, and filters in CNNs
Summary
Section 5: The Road Ahead
Tips, Tricks, and the Road Ahead
TensorBoard for training performance visualization
Visualizing deep network models with LIME
Visualizing model training with tfruns
Early stopping of network training
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data. Advanced Deep Learning with R will help you understand popular deep learning architectures and their variants in R and provide real-life examples.
This book will help you apply deep learning algorithms in R using advanced examples. It covers variants of neural network models such as ANN, CNN, RNN, LSTM, and others using expert techniques. In the course of reading this book, you will make use of popular deep learning libraries such as Keras-R, TensorFlow-R, and others to implement AI models.
This book is for data scientists, machine learning practitioners, deep learning researchers, and AI enthusiasts who want to develop their skills and knowledge to implement deep learning techniques and algorithms using the power of R. A solid understanding of machine learning and a working knowledge of the R programming language are required.
Chapter 1, Revisiting Deep Learning Architecture and Techniques, provides an overview of the deep learning techniques that are covered in this book.
Chapter 2, Deep Neural Networks for Multiclass Classification, covers the necessary steps to apply deep learning neural networks to binary and multiclass classification problems. The steps are illustrated using a churn dataset and include data preparation, one-hot encoding, model fitting, model evaluation, and prediction.
Chapter 3, Deep Neural Networks for Regression, illustrates how to develop a prediction model for numeric response. Using the Boston Housing example, this chapter introduces the steps for data preparation, model creation, model fitting, model evaluation, and prediction.
Chapter 4, Image Classification and Recognition, illustrates the use of deep learning neural networks for image classification and recognition using the Keras package with the help of an easy-to-follow example. The steps involved include exploring image data, resizing and reshaping images, one-hot encoding, developing a sequential model, compiling the model, fitting the model, evaluating the model, prediction, and model performance assessment using a confusion matrix.
Chapter 5, Image Classification Using Convolutional Neural Networks, introduces the steps for applying image classification and recognition using convolutional neural networks (CNNs) with an easy-to-follow practical example. CNN is a popular deep neural network and is considered the 'gold standard' for large-scale image classification.
Chapter 6, Applying Autoencoder Neural Networks Using Keras, goes over the steps for applying autoencoder neural networks using Keras. The practical example used illustrates the steps for taking images as input, training them with an autoencoder, and finally, reconstructing images.
Chapter 7, Image Classification for Small Data Using Transfer Learning, illustrates the application of transfer learning to image classification with small datasets. The steps involved include data preparation, defining a deep neural network model in Keras, training the model, and model assessment.
Chapter 8, Creating New Images Using Generative Adversarial Networks, illustrates the application of generative adversarial networks (GANs) to generate new images using a practical example. The steps include processing the image data, developing the generator and discriminator networks, training the network, and reviewing the results.
Chapter 9, Deep Network for Text Classification, provides the steps for applying text classification using deep neural networks and illustrates the process with an easy-to-follow example. Text data, such as customer comments, product reviews, and movie reviews, play an important role in business, and text classification is an important deep learning problem.
Chapter 10, Text Classification Using Recurrent Neural Networks, provides the steps for applying recurrent neural networks to a text classification problem with the help of a practical example. The steps covered include data preparation, defining the recurrent neural network model, training, and finally, the evaluation of the model performance.
Chapter 11, Text Classification Using a Long Short-Term Memory Network, illustrates the steps for using a long short-term memory (LSTM) neural network for sentiment classification. The steps involved include text data preparation, creating an LSTM model, training the model, and assessing the model.
Chapter 12, Text Classification Using Convolutional Recurrent Networks, illustrates the application of recurrent convolutional networks for news classification. The steps involved include text data preparation, defining a recurrent convolutional network model in Keras, training the model, and model assessment.
Chapter 13, Tips, Tricks, and the Road Ahead, discusses the road ahead in terms of putting deep learning into action and best practices.
The following are a few ideas for how you can get the most out of this book:
All examples in this book use R code. So, before getting started, you should have a good foundation in the R language. As Confucius said, "I hear and I forget. I see and I remember. I do and I understand." This is true for this book, too. A hands-on approach, working through the code as you read the chapters, will be very useful for understanding the deep learning models.
All the code in this book was successfully run on a Mac computer with 8 GB of RAM. However, if you are working with a much larger dataset than those used in this book for illustration purposes, more powerful computing resources may be required to develop deep learning models. It will also be helpful to have a good foundation in statistical methods.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packt.com.
2. Select the Support tab.
3. Click on Code Downloads.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Advanced-Deep-Learning-with-R. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781789538779_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
This section contains a chapter that serves as an introduction to deep learning with R. It provides an overview of the process for developing deep networks and reviews popular deep learning techniques.
This section contains the following chapter:
Chapter 1, Revisiting Deep Learning Architecture and Techniques
Deep learning is part of the broader machine learning and artificial intelligence field that uses artificial neural networks. One of the main advantages of deep learning methods is that they help to capture complex relationships and patterns contained in data. When the relationships and patterns are not very complex, traditional machine learning methods may work well. However, with the availability of technologies that help to generate and process more and more unstructured data, such as images, text, and videos, deep learning methods have become increasingly popular, as they are almost the default choice for dealing with such data. Computer vision and natural language processing (NLP) are two areas that are seeing interesting applications in a wide variety of fields, such as driverless cars, language translation, computer games, and even creating new artwork.
Within the deep learning toolkit, we now have an increasing array of neural network techniques that can be applied to a specific type of task. For example, when developing image classification models, a special type of deep network called a convolutional neural network (CNN) has proved to be effective in capturing unique patterns that exist in image-related data. Similarly, another popular deep learning network called recurrent neural networks (RNNs) and its variants have been found useful in dealing with data involving sequences of words or integers. Another popular and interesting deep learning network called a generative adversarial network (GAN) has the capability to generate new images, speech, music, or artwork.
In this book, we will work with these and other popular deep learning networks using R. Each chapter presents a complete example that has been specifically developed to run on a regular laptop or desktop computer. The main idea is to avoid getting bogged down, at the first stage of applying deep learning methods, by huge amounts of data that require advanced computing resources. You will be able to work through all the steps using the illustrated examples in this book. The examples also include best practices for each topic. You will find this hands-on, applied approach helpful for quickly seeing the big picture when replicating these deep learning methods on a new problem.
This chapter provides an overview of the deep learning methods with R that are covered in this book. We will go over the following topics in this chapter:
Deep learning with R
The process of developing a deep network model
Popular deep learning techniques with R and RStudio
We will start by looking at the popularity of deep learning networks and also take a look at a version of some of the important R packages used in this book.
Deep learning techniques make use of neural network-based models and have seen increasing interest in the last few years. Google Trends provides the following plot for the search term deep learning:
In the preceding plot, 100 represents the peak popularity of the search term, and the other values are relative to that peak. It can be observed that interest in the term deep learning has gradually increased since around 2014, and for the last two years it has enjoyed peak popularity. One of the reasons for the popularity of deep learning networks is the availability of the free and open source libraries TensorFlow and Keras.
In this book, we will use the Keras R package that uses TensorFlow as a backend for building deep learning networks. An output from a typical R session, used for the examples illustrated in this book, providing various version-related information, is provided in the following code:
# Information from a Keras R session
sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.15

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

Random number generation:
 RNG: Mersenne-Twister
 Normal: Inversion
 Sample: Rounding

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] keras_2.2.4.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2      lattice_0.20-38 lubridate_1.7.4 zeallot_0.1.0
 [5] grid_3.6.0      R6_2.4.0        jsonlite_1.6    magrittr_1.5
 [9] tfruns_1.4      stringi_1.4.3   whisker_0.4     Matrix_1.2-17
[13] reticulate_1.13 generics_0.0.2  tools_3.6.0     stringr_1.4.0
[17] compiler_3.6.0  base64enc_0.1-3 tensorflow_1.14.0
As seen previously, for this book we have used the 3.6 version of R that was released in April 2019. The nickname for this R version is Planting of a Tree. The version used for the Keras package is 2.2.4.1. In addition, all the application examples illustrated in the book have been run on a Mac computer with 8 GB of RAM. The main reason for using this specification is that it will allow a reader to go through all the examples without needing advanced computing resources to get started with any deep learning network covered in the book.
In the next section, we will go over the process of developing a deep network model that is broken down into five general steps.
Developing a deep learning network model can be broken down into five key steps shown in the following flowchart:
Each step mentioned in the preceding flowchart can have varying requirements based on the type of data used, the type of deep learning network being developed, and also the main objective of developing a model. We will go over each step to develop a general idea about what is involved.
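As a rough sketch only, and assuming the keras package with a TensorFlow backend is installed, the five steps map onto Keras R code along the following lines. The data here is made-up dummy data (100 samples, 8 features, 3 classes), not a dataset from the book:

```r
library(keras)

# Step 1: Prepare the data (dummy example for illustration)
x <- matrix(rnorm(800), nrow = 100, ncol = 8)
y <- to_categorical(sample(0:2, 100, replace = TRUE), num_classes = 3)

# Step 2: Develop the model architecture
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(8)) %>%
  layer_dense(units = 3, activation = "softmax")

# Step 3: Compile the model
model %>% compile(optimizer = "adam",
                  loss = "categorical_crossentropy",
                  metrics = "accuracy")

# Step 4: Fit the model
history <- model %>% fit(x, y, epochs = 10, batch_size = 16, verbose = 0)

# Step 5: Assess the model performance
model %>% evaluate(x, y, verbose = 0)
```

The layer sizes, optimizer, and epoch count here are arbitrary placeholders; later chapters show how these choices are made for real datasets.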
Developing deep learning neural network models requires the variables to be in a certain format. Independent variables often come on very different scales, with some values in decimals and others in the thousands. Training a network on such differently scaled variables is not very efficient. Before developing deep learning networks, we therefore transform the variables so that they have similar scales. The process used for achieving this is called normalization.
Two commonly used methods for normalization are z-score normalization and min-max normalization. In z-score normalization, we subtract the mean from each value and divide by the standard deviation. This centers the values at a mean of 0 with a standard deviation of 1; for approximately normally distributed data, most transformed values then lie between -3 and +3. In min-max normalization, we subtract the minimum value from each data point and then divide by the range. This transformation converts the data to values between zero and one.
As an example, see the following plots, where we have obtained 10,000 data points randomly from a normal distribution with a mean of 35 and a standard deviation of 5:
From the preceding plots, we can observe that after z-score normalization, the data points mostly lie between -3 and +3. Similarly, after min-max normalization, the range of values changes to data points between 0 and 1. However, the overall pattern seen in the original data is retained after both types of normalization.
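The two transformations can be sketched in base R as follows, using a small hypothetical vector (for the z-score version, the built-in scale() function gives the same result):

```r
x <- c(10, 20, 30, 40, 50)   # hypothetical raw values

# Z-score normalization: subtract the mean and divide by the standard deviation
z <- (x - mean(x)) / sd(x)

# Min-max normalization: subtract the minimum and divide by the range
mm <- (x - min(x)) / (max(x) - min(x))

round(z, 3)   # centered at 0 with standard deviation 1
mm            # 0.00 0.25 0.50 0.75 1.00
```

Note that both transformations preserve the ordering and relative spacing of the original values, which is why the overall pattern of the data is retained.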
Another important step in preparing data when using a categorical response variable is to carry out one-hot encoding. One-hot encoding converts a categorical variable to a new binary format that has values containing either 0 or 1. This is achieved very easily by using the to_categorical() function available in Keras.
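To make the encoding concrete, here is a small base-R sketch that produces the same kind of 0/1 matrix as keras::to_categorical(). The labels are hypothetical, and, as with to_categorical(), the class labels are assumed to be integers starting at 0:

```r
# Hypothetical class labels for four observations, classes 0, 1, and 2
labels <- c(0, 2, 1, 0)

# Base-R equivalent of keras::to_categorical(labels, num_classes = 3)
one_hot <- function(y, num_classes) {
  m <- matrix(0, nrow = length(y), ncol = num_classes)
  m[cbind(seq_along(y), y + 1)] <- 1   # set the column matching each label to 1
  m
}

one_hot(labels, 3)
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    0    1
# [3,]    0    1    0
# [4,]    1    0    0
```

In practice, you would simply call to_categorical(); the manual version is shown only to make clear what the function does.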
Typically, the data-preparation steps for unstructured data, such as images or text, are more involved than those for structured data. In addition, the nature of the data preparation can vary from one type of data to another. For example, the way we prepare image data for a deep learning classification model is likely to be very different from the way we prepare text data for a movie review sentiment classification model. One important thing to note, however, is that before we can develop deep learning models from unstructured data, it first needs to be converted into a structured format. An example of converting unstructured image data into a structured format is shown in the following screenshot, using a picture of the handwritten digit five:
As can be observed from the preceding screenshot, when we read an image file containing a black and white handwritten digit five with 28 x 28 dimensions in R, it gets converted to numbers in rows and columns, giving it a structured format. The right-hand side of the screenshot shows data with 28 rows and 28 columns. The numbers in the body of the table are pixel values that range from 0 to 255, where a value of zero represents the black color and 255 represents the white color in the picture. When developing deep learning models, we make use of some forms of such structured data that are derived from image data.
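A small base-R sketch of this idea is shown below. The pixel values are randomly generated stand-ins for a real image, but the structure (a 28 x 28 numeric matrix with values from 0 to 255, rescaled to the 0-1 range before modeling) matches what is described above:

```r
# Hypothetical 28 x 28 grayscale image:
# pixel values between 0 (black) and 255 (white)
set.seed(123)
img <- matrix(sample(0:255, 28 * 28, replace = TRUE), nrow = 28, ncol = 28)

dim(img)     # 28 rows and 28 columns: a structured format
range(img)   # pixel values within 0..255

# Rescale pixel values to the 0-1 range before model fitting
img_scaled <- img / 255
```

Dividing by 255 is itself a min-max normalization, since the minimum and maximum possible pixel values are known to be 0 and 255.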
Once the data for developing the model is prepared in the required format, we can then develop the model architecture.
Assessing the performance of a deep learning classification model requires developing a confusion matrix that summarizes predictions for actual and predicted classes. Consider an example where a classification model is developed to classify graduate school applicants in one of two categories where class 0 refers to applications that have not been accepted, and class 1 refers to accepted applications. An example of a confusion matrix for this situation explaining the key concepts is provided as follows:
In the preceding confusion matrix, there are 208 applicants who were actually not accepted and whom the model correctly predicts as not accepted. This cell in the confusion matrix is called the true negative. Similarly, there are 29 applicants who were actually accepted and whom the model correctly predicts as accepted. This cell in the confusion matrix is called the true positive. We also have cells with numbers indicating the incorrect classification of applicants by the model. There are 15 applicants who were actually not accepted, but whom the model incorrectly predicts as accepted; this cell is called a false positive.
Incorrectly classifying category 0 as belonging to category 1 in this way is also known as a type-1 error. Finally, there are 73 applicants who were actually accepted but whom the model incorrectly predicts as not accepted; this cell is called a false negative. Such an incorrect classification is also known as a type-2 error.
From the confusion matrix, we can calculate the classification accuracy by adding the numbers on the diagonal and dividing by the total. So, the accuracy based on the preceding matrix is (208 + 29)/(208 + 29 + 73 + 15), or 72.92%. Apart from the accuracy, we can also assess the model's performance in correctly classifying each category. The accuracy of correctly classifying category 1, also called sensitivity, is 29/(29 + 73), or 28.4%. Similarly, the accuracy of correctly classifying category 0, also called specificity, is 208/(208 + 15), or 93.3%.
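These calculations can be reproduced in base R using the numbers from the example above:

```r
# Confusion matrix from the text: rows = actual class, columns = predicted class
#             predicted 0   predicted 1
# actual 0          208           15
# actual 1           73           29
cm <- matrix(c(208, 15,
                73, 29), nrow = 2, byrow = TRUE)

accuracy    <- sum(diag(cm)) / sum(cm)   # (208 + 29) / 325
sensitivity <- cm[2, 2] / sum(cm[2, ])   # 29 / (29 + 73)
specificity <- cm[1, 1] / sum(cm[1, ])   # 208 / (208 + 15)

round(100 * c(accuracy, sensitivity, specificity), 1)
# 72.9 28.4 93.3
```

Using sum(diag(cm)) makes the same calculation work for confusion matrices with more than two classes.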
Note that the confusion matrix can be used when developing a classification model. However, other situations may call for other suitable ways of assessing a deep learning network.
We can now briefly go over the deep learning techniques covered in this book.
The term deep in deep learning refers to a neural network model having several layers, with the learning taking place with the help of data. Based on the type of data used, deep learning may be categorized into two major categories, as shown in the following diagram:
As shown in the preceding diagram, the type of data used for developing a deep neural network model can be of a structured or unstructured type. In Chapter 2, Deep Neural Networks for Multi-Class Classification
