BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, nor its dealers and distributors, will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: September 2017
Production reference: 1220917
ISBN 978-1-78839-787-2
www.packtpub.com
Authors
Giuseppe Ciaburro
Balaji Venkateswaran
Copy Editors
Safis Editing
Alpha Singh
Vikrant Phadkay
Reviewer
Juan Tomás Oliva Ramos
Project Coordinator
Nidhi Joshi
Commissioning Editor
Sunith Shetty
Proofreader
Safis Editing
Acquisition Editor
Varsha Shetty
Indexer
Mariammal Chettiyar
Content Development Editor
Cheryl Dsa
Graphics
Tania Dutta
Technical Editor
Suwarna Patil
Production Coordinator
Arvindkumar Gupta
Giuseppe Ciaburro holds a master's degree in chemical engineering from Università degli Studi di Napoli Federico II, and a master's degree in acoustics and noise control from Seconda Università degli Studi di Napoli. He works at the Built Environment Control Laboratory of Università degli Studi della Campania "Luigi Vanvitelli".
He has over 15 years of work experience in programming, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in Python and R, and he has extensive experience of working with MATLAB. An expert in acoustics and noise control, Giuseppe has wide experience (about 15 years) in teaching professional computer courses and has worked with e-learning as an author. He has several publications to his credit: monographs, articles in scientific journals, and papers at thematic conferences. He is currently researching machine learning applications in acoustics and noise control.
Balaji Venkateswaran is an AI expert, data scientist, machine learning practitioner, and database architect. He has more than 17 years of experience in investment banking, payment processing, telecom billing, and project management. He has worked for major companies such as ADP, Goldman Sachs, MasterCard, and Wipro. Balaji is a trainer in data science, Hadoop, and Tableau. He holds a postgraduate degree in business analytics from Great Lakes Institute of Management, Chennai.
Balaji has expertise relating to statistics, classification, regression, pattern recognition, time series forecasting, and unstructured data analysis using text mining procedures. His main interests are neural networks and deep learning.
Balaji holds various certifications in IBM SPSS, IBM Watson, IBM Big Data Architect, Cloud Architect, CEH, Splunk, Salesforce, Agile CSM, and AWS.
If you have any questions, don't hesitate to message him on LinkedIn (linkedin.com/in/balvenkateswaran); he will be more than glad to help fellow data scientists.
Juan Tomás Oliva Ramos is an environmental engineer from the University of Guanajuato, Mexico, with a master's degree in administrative engineering and quality. He has more than 5 years of experience in the management and development of patents, technological innovation projects, and the development of technological solutions through the statistical control of processes. He has been a teacher of statistics, entrepreneurship, and the technological development of projects since 2011. He became an entrepreneurship mentor and started a new department of technology management and entrepreneurship at Instituto Tecnológico Superior de Purísima del Rincón.
Juan is an Alfaomega reviewer and has worked on the book Wearable Designs for Smart Watches, Smart TVs and Android Mobile Devices.
He has developed prototypes through programming and automation technologies for the improvement of operations, which have been registered for patents.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review.
If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
Neural Network and Artificial Intelligence Concepts
Introduction
Inspiration for neural networks
How do neural networks work?
Layered approach
Weights and biases
Training neural networks
Supervised learning
Unsupervised learning
Epoch
Activation functions
Different activation functions
Linear function
Unit step activation function
Sigmoid
Hyperbolic tangent
Rectified Linear Unit
Which activation functions to use?
Perceptron and multilayer architectures
Forward and backpropagation
Step-by-step illustration of a neuralnet and an activation function
Feed-forward and feedback networks
Gradient descent
Taxonomy of neural networks
Simple example using R neural net library - neuralnet()
Let us go through the code line-by-line
Implementation using nnet() library
Let us go through the code line-by-line
Deep learning
Pros and cons of neural networks
Pros
Cons
Best practices in neural network implementations
Quick note on GPU processing
Summary
Learning Process in Neural Networks
What is machine learning?
Supervised learning
Unsupervised learning
Reinforcement learning
Training and testing the model
The data cycle
Evaluation metrics
Confusion matrix
True Positive Rate
True Negative Rate
Accuracy
Precision and recall
F-score
Receiver Operating Characteristic curve
Learning in neural networks
Back to backpropagation
Neural network learning algorithm optimization
Supervised learning in neural networks
Boston dataset
Neural network regression with the Boston dataset
Unsupervised learning in neural networks
Competitive learning
Kohonen SOM
Summary
Deep Learning Using Multilayer Neural Networks
Introduction to DNNs
R for DNNs
Multilayer neural networks with neuralnet
Training and modeling a DNN using H2O
Deep autoencoders using H2O
Summary
Perceptron Neural Network Modeling – Basic Models
Perceptrons and their applications
Simple perceptron – a linear separable classifier
Linear separation
The perceptron function in R
Multi-Layer Perceptron
MLP R implementation using RSNNS
Summary
Training and Visualizing a Neural Network in R
Data fitting with neural network
Exploratory analysis
Neural network model
Classifying breast cancer with a neural network
Exploratory analysis
Neural network model
The network training phase
Testing the network
Early stopping in neural network training
Avoiding overfitting in the model
Generalization of neural networks
Scaling of data in neural network models
Ensemble predictions using neural networks
Summary
Recurrent and Convolutional Neural Networks
Recurrent Neural Network
The rnn package in R
LSTM model
Convolutional Neural Networks
Step #1 – filtering
Step #2 – pooling
Step #3 – ReLU for normalization
Step #4 – voting and classification in the fully connected layer
Common CNN architecture - LeNet
Humidity forecast using RNN
Summary
Use Cases of Neural Networks – Advanced Topics
TensorFlow integration with R
Keras integration with R
MNIST HWR using R
LSTM using the iris dataset
Working with autoencoders
PCA using H2O
Autoencoders using H2O
Breast cancer detection using darch
Summary
Neural networks are one of the most fascinating machine learning models for solving complex computational problems efficiently. Neural networks are used to solve a wide range of problems in different areas of AI and machine learning.
This book explains the niche aspects of neural networking and provides you with the foundation to get started with advanced topics. The book begins with neural network design using the neuralnet package; then you'll build solid knowledge of how a neural network learns from data and the principles behind it. This book covers various types of neural networks, including recurrent neural networks and convolutional neural networks. You will not only learn how to train neural networks but also explore generalization of these networks. Later, we will delve into combining different neural network models and work with real-world use cases.
By the end of this book, you will be able to implement neural network models in your applications, with the help of the practical examples in the book.
Chapter 1, Neural Network and Artificial Intelligence Concepts, introduces the basic theoretical concepts of Artificial Neural Networks (ANN) and Artificial Intelligence (AI). It presents simple applications of ANNs and AI using mathematical concepts. An introduction to the ANN functions available in R is also covered.
Chapter 2, Learning Process in Neural Networks, covers the machine learning concepts behind training a neural network: supervised, unsupervised, and reinforcement learning; training and testing a model; and evaluation metrics such as the confusion matrix and the ROC curve. It then looks at backpropagation and learning algorithm optimization, and works through supervised learning with the Boston dataset and unsupervised learning with competitive learning and the Kohonen SOM.
Chapter 3, Deep Learning Using Multilayer Neural Networks, is about understanding deep learning and the use of neural networks in deep learning. It goes through the details of the implementation using R packages, covers the setup of the many hidden layers needed for deep learning, and uses practical datasets to help you understand the implementation.
Chapter 4, Perceptron Neural Network Modeling – Basic Models, helps you understand what a perceptron is and the applications that can be built using it. This chapter covers an implementation of perceptrons using R.
Chapter 5, Training and Visualizing a Neural Network in R, covers another example of training a neural network with a dataset. It also gives a better understanding of neural networks with a graphical representation of input, hidden, and output layers using the plot() function in R.
Chapter 6, Recurrent and Convolutional Neural Networks, introduces Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN) with their implementation in R. Several examples are proposed to understand the basic concepts.
Chapter 7, Use Cases of Neural Networks – Advanced Topics, presents neural network applications from different fields and how neural networks can be used in the AI world. This will help the reader understand the practical usage of neural network algorithms. The reader can enhance his or her skills further by taking different datasets and running the R code.
This book is focused on neural networks in an R environment. We have used R version 3.4.1 to build various applications and the open source and enterprise-ready professional software for R, RStudio version 1.0.153. We focus on how to utilize various R libraries in the best possible way to build real-world applications. In that spirit, we have tried to keep all the code as friendly and readable as possible. We feel that this will enable our readers to easily understand the code and readily use it in different scenarios.
This book is intended for anyone who has a statistics background with knowledge in R and wants to work with neural networks to get better results from complex data. If you are interested in artificial intelligence and deep learning and want to level up, then this book is what you need!
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The line in R includes the neuralnet() library in our program."
Any command-line input or output is written as follows:
mydata=read.csv('Squares.csv',sep=",",header=TRUE)
mydata
attach(mydata)
names(mydata)
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "A reference page in the Help browser."
Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you. You can download the code files by following these steps:
Log in or register to our website using your email address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Neural-Networks-with-R. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to the list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
From the scientific and philosophical studies conducted over the centuries, special mechanisms have been identified that are the basis of human intelligence. Taking inspiration from their operations, it was possible to create machines that imitate part of these mechanisms. The problem is that they have not yet succeeded in imitating and integrating all of them, so the Artificial Intelligence (AI) systems we have are largely incomplete.
A decisive step in the improvement of such machines came from the use of so-called Artificial Neural Networks (ANNs) which, starting from the mechanisms regulating natural neural networks, aim to simulate human thinking. Software can now imitate the mechanisms needed to win a chess match or to translate text into a different language in accordance with its grammatical rules.
This chapter introduces the basic theoretical concepts of ANN and AI. Fundamental understanding of the following is expected:
Basic high school mathematics: differential calculus and functions such as the sigmoid
R programming and usage of R libraries
We will go through the basics of neural networks and try out one model using R. This chapter is a foundation for neural networks and all the subsequent chapters.
We will cover the following topics in this chapter:
ANN concepts
Neurons, perceptron, and multilayered neural networks
Bias, weights, activation functions, and hidden layers
Forward and backpropagation methods
Brief overview of the Graphics Processing Unit (GPU)
By the end of the chapter, you will be able to recognize the different neural network algorithms and the tools R provides to handle them.
The brain is the most important organ of the human body. It is the central processing unit for all the functions performed by us. Weighing only 1.5 kilos, it has around 86 billion neurons. A neuron is defined as a cell that transmits nerve impulses or electrochemical signals. The brain is a complex network of neurons that process information through a system of several interconnected neurons. It has always been challenging to understand brain functions; however, due to advancements in computing technologies, we can now program neural networks artificially.
The discipline of ANNs arose from the idea of mimicking the functioning of the same human brain that was trying to solve the problem. In well-defined technical environments, this approach has overcome the drawbacks of conventional approaches in successive applications.
AI, or machine intelligence, is a field of study that aims to give cognitive powers to computers by programming them to learn and solve problems. Its objective is to simulate human intelligence in computers. AI cannot imitate human intelligence completely; computers can only be programmed to mimic some aspects of the human brain.
Machine learning is a branch of AI which helps computers to program themselves based on the input data. Machine learning gives AI the ability to do data-based problem solving. ANNs are an example of machine learning algorithms.
Deep learning (DL) is a complex set of neural networks with more layers of processing, which develop high levels of abstraction. They are typically used for complex tasks, such as image recognition, image classification, and handwriting identification.
Many people think that neural networks are difficult to learn and use them as a black box. This book intends to open that black box and help you learn the internals, with implementations in R. With this working knowledge, you can recognize the many use cases where neural networks can be tremendously useful, as seen in the following image:
Neural networks are inspired by the way the human brain works. A human brain can process huge amounts of information using data sent by the human senses (especially vision). The processing is done by neurons, which work on electrical signals passing through them, applying flip-flop logic, like the opening and closing of gates for a signal to transmit through. The following image shows the structure of a neuron:
The major components of each neuron are:
Dendrites: The entry points of each neuron, which take input from other neurons in the network in the form of electrical impulses
Cell body: Generates inferences from the dendrite inputs and decides what action to take
Axon terminals: Transmit outputs, in the form of electrical impulses, to the next neuron
Each neuron processes a signal only if it exceeds a certain threshold. A neuron either fires or does not fire; it is either 0 or 1.
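To make this fire-or-not behavior concrete, here is a minimal R sketch (not from the book's code bundle; the function name and all values are hypothetical):
neuron_fires <- function(inputs, weights, threshold) {
  signal <- sum(inputs * weights)  # aggregate the incoming impulses
  as.numeric(signal > threshold)   # fire (1) only above the threshold; 0 otherwise
}
neuron_fires(inputs = c(0.8, 0.2, 0.5),
             weights = c(0.4, 0.9, 0.1),
             threshold = 0.5)      # returns 1, since the signal 0.55 exceeds 0.5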
AI has long been a domain of sci-fi movies and fiction books. ANNs within AI have been around since the 1950s, but they have become more dominant in the past 10 years due to advances in computing architecture and performance. There have been major advancements in computer processing, leading to:
Massive parallelism
Distributed representation and computation
Learning and generalization ability
Fault tolerance
Low energy consumption
In the domains of numerical computation and symbol manipulation, solving problems on top of a centralized architecture, modern-day computers have surpassed humans to a great extent. Where they actually lag behind with such an organizing structure is in the domains of pattern recognition, noise reduction, and optimization. A toddler can recognize his or her mom in a huge crowd, but a computer with a centralized architecture wouldn't be able to do the same.
This is where the biological neural network of the brain has been outperforming machines, and hence the inspiration to develop an alternative loosely held, decentralized architecture mimicking the brain.
ANNs are massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections.
One of the leading global news agencies, the Guardian, used big data to digitize its archives by uploading snapshots of everything it held. However, the limitation is that a user cannot copy the content from these image snapshots and use it elsewhere. To overcome that, one can use an ANN for text pattern recognition to convert the images to text files and then to any format required by the end users.
Similar to the biological neuron structure, ANNs define the neuron as a central processing unit that performs a mathematical operation to generate one output from a set of inputs. The output of a neuron is a function of the weighted sum of its inputs plus a bias. Each neuron performs a very simple operation, which involves activating if the total amount of signal received exceeds an activation threshold, as shown in the following figure:
The function of the entire neural network is simply the computation of the outputs of all the neurons, which is an entirely deterministic calculation. Essentially, an ANN is a set of mathematical function approximations. We will now introduce the new terminology associated with ANNs:
Input layer
Hidden layer
Output layer
Weights
Bias
Activation functions
Any neural network processing framework has the following architecture:
There is a set of inputs, a processor, and a set of outputs. This layered approach is also followed in neural networks. The inputs form the input layer, the middle layer(s), which performs the processing, is called the hidden layer(s), and the output(s) forms the output layer.
Our neural network architectures are also based on the same principle. The hidden layer has the magic to convert the input to the desired output. The understanding of the hidden layer requires knowledge of weights, bias, and activation functions, which is our next topic of discussion.
Weights in an ANN are the most important factor in converting an input into an impact on the output. This is similar to the slope in linear regression, where a weight is multiplied by the input and added up to form the output. Weights are numerical parameters which determine how strongly each of the neurons affects the others.
For a typical neuron, if the inputs are x1, x2, and x3, then the synaptic weights to be applied to them are denoted as w1, w2, and w3.
Output is the weighted sum w1*x1 + w2*x2 + w3*x3, or in general the sum of wi * xi, where i runs from 1 to the number of inputs.
Simply, this is a matrix multiplication to arrive at the weighted sum.
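This can be verified directly in R; a quick sketch using the x1..x3 and w1..w3 notation above, with made-up values:
x <- c(2, -1, 0.5)     # inputs x1, x2, x3 (hypothetical values)
w <- c(0.3, 0.8, -0.2) # synaptic weights w1, w2, w3 (hypothetical values)
sum(w * x)             # w1*x1 + w2*x2 + w3*x3 = -0.3
t(w) %*% x             # the same weighted sum as a 1x3 by 3x1 matrix multiplication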
Bias is like the intercept added in a linear equation. It is an additional parameter which is used to adjust the output along with the weighted sum of the inputs to the neuron.
The processing done by a neuron is thus denoted as: output = (sum of wi * xi) + bias.
A function, called an activation function, is then applied to this output. The input of the next layer is the output of the neurons in the previous layer, as shown in the following image:
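Putting the weighted sum, bias, and activation together, a one-neuron sketch in R (same hypothetical values as before, with the sigmoid chosen here purely for illustration) looks like this:
sigmoid <- function(z) 1 / (1 + exp(-z)) # a common activation function
x <- c(2, -1, 0.5)                       # inputs (hypothetical)
w <- c(0.3, 0.8, -0.2)                   # weights (hypothetical)
b <- 0.5                                 # bias, like an intercept
output <- sigmoid(sum(w * x) + b)        # activation(weighted sum + bias)
output                                   # about 0.55; this value feeds the next layer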
Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function.
There are two main types of training: supervised learning and unsupervised learning.
We supply the neural network with inputs and the desired outputs. The response of the network to the inputs is measured. The weights are modified to reduce the difference between the actual and desired outputs.
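As a minimal sketch of supervised training with the neuralnet package (the squares data here is invented to mirror the Squares.csv example used later in this chapter):
library(neuralnet)
squares <- data.frame(x = 1:10, y = (1:10)^2)           # inputs paired with desired outputs
model <- neuralnet(y ~ x, data = squares,
                   hidden = 10, threshold = 0.01)       # weights adjusted to reduce the error
compute(model, data.frame(x = c(3.5, 7.5)))$net.result  # predictions for unseen inputs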
We only supply inputs. The neural network adjusts its own weights, so that similar inputs cause similar outputs. The network identifies the patterns and differences in the inputs without any external assistance.
One iteration, or pass, through the process of providing the network with an input and updating the network's weights is called an epoch. It is a full run of feed-forward and backpropagation for updating the weights; in other words, one full read-through of the entire dataset.
Typically, many epochs, sometimes on the order of tens of thousands, are required to train a neural network efficiently. We will see more about epochs in the forthcoming chapters.
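The following toy R loop (not library code; the dataset, learning rate, and epoch count are all made up) shows what an epoch amounts to for a single sigmoid neuron:
sigmoid <- function(z) 1 / (1 + exp(-z))
x <- c(0.5, 1.0, 1.5); y <- c(0, 1, 1)      # tiny dataset: inputs and desired outputs
w <- 0.1; b <- 0; lr <- 0.5                 # initial weight, bias, and learning rate
for (epoch in 1:10000) {                    # each epoch is one full pass over the dataset
  for (i in seq_along(x)) {
    out  <- sigmoid(w * x[i] + b)           # feed-forward
    grad <- (out - y[i]) * out * (1 - out)  # backpropagation through the sigmoid
    w <- w - lr * grad * x[i]               # weight update
    b <- b - lr * grad                      # bias update
  }
}
round(sigmoid(w * x + b), 2)                # outputs close to the desired 0, 1, 1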
The abstraction of the processing of neural networks is mainly achieved through the activation functions. An activation function is a mathematical function which converts the input to an output, and adds the magic of neural network processing. Without activation functions, the working of neural networks would be like that of linear functions. A linear function is one where the output is directly proportional to the input, for example:
y = cx, where c is a constant
A linear function is a polynomial of degree one. Simply, it is a straight line without any curves.
However, most of the problems the neural networks try to solve are nonlinear and complex in nature. To achieve the nonlinearity, activation functions are used. Nonlinear functions are higher degree polynomial functions, for example:
y = x^2 + 2x + 1
The graph of a nonlinear function is curved and adds the complexity factor.
Activation functions give the nonlinearity property to neural networks and make them true universal function approximators.
There are many activation functions available for a neural network to use. We shall see a few of them here.
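As a preview, each of the functions discussed below can be written in a line of plain R (these are sketches, not package code):
linear   <- function(x) x                   # identity: output equals input
unitstep <- function(x) ifelse(x < 0, 0, 1) # unit step: binary 0/1 output
sigmoid  <- function(x) 1 / (1 + exp(-x))   # squashes input into (0, 1)
relu     <- function(x) pmax(0, x)          # Rectified Linear Unit: max(0, x)
# tanh() is built into base R; its range is (-1, 1)
z <- seq(-5, 5, by = 0.1)                   # plot a few to compare their shapes
plot(z, sigmoid(z), type = "l", ylim = c(-1, 1.5), ylab = "f(z)")
lines(z, tanh(z), lty = 2)
lines(z, relu(z), lty = 3)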
The simplest activation function, one that is commonly used for the output layer in neural network problems, is the linear activation function, represented by the following formula:
f(x) = x
The output is the same as the input, and the function is defined in the range (-infinity, +infinity). In the following figure, a linear activation function is shown:
A unit step activation function is a commonly used feature in neural networks. The output assumes the value 0 for a negative argument and 1 for a positive argument. The function is as follows:
f(x) = 0 for x < 0, and f(x) = 1 for x >= 0
The range is {0, 1}, and the output is binary in nature. These types of activation functions are useful for binary schemes: when we want to classify an input pattern into one of two groups, we can use a binary classifier with a unit step activation function. A unit step activation function is shown in the following figure:
The sigmoid function is a mathematical function that produces a sigmoidal curve, a curve characteristic for its S shape. This is one of the earliest and most often used activation functions. It squashes the input to a value between 0 and 1, and makes the model logistic in nature. This function is a special case of the logistic function, defined by the following formula:
f(x) = 1 / (1 + e^(-x))
The following figure shows a sigmoid curve with its S shape:
Another very popular and widely used activation function is the tanh function. If you look at the figure that follows, you can notice that it looks very similar to the sigmoid; in fact, it is a scaled sigmoid function. This is a nonlinear function, defined in the range of values (-1, 1), so you need not worry about activations blowing up. One thing to clarify is that the gradient is stronger for tanh than sigmoid