Neural Networks with R

Giuseppe Ciaburro

Description

Uncover the power of artificial neural networks by implementing them through R code.

About This Book

  • Develop a strong background in neural networks with R, to implement them in your applications
  • Build smart systems using the power of deep learning
  • Real-world case studies to illustrate the power of neural network models

Who This Book Is For

This book is intended for anyone who has a statistical background with knowledge in R and wants to work with neural networks to get better results from complex data. If you are interested in artificial intelligence and deep learning and you want to level up, then this book is what you need!

What You Will Learn

  • Set up R packages for neural networks and deep learning
  • Understand the core concepts of artificial neural networks
  • Understand neurons, perceptrons, bias, weights, and activation functions
  • Implement supervised and unsupervised machine learning in R for neural networks
  • Predict and classify data automatically using neural networks
  • Evaluate and fine-tune the models you build

In Detail

Neural networks are among the most fascinating machine learning models for solving complex computational problems efficiently. They are used to solve a wide range of problems in different areas of AI and machine learning.

This book explains the niche aspects of neural networking and provides you with the foundation to get started with advanced topics. The book begins with neural network design using the neuralnet package; then you'll build solid knowledge of how a neural network learns from data and the principles behind it. This book covers various types of neural networks, including recurrent neural networks and convolutional neural networks. You will not only learn how to train neural networks but also explore their generalization. Later, we will delve into combining different neural network models and work with real-world use cases.

By the end of this book, you will be able to implement neural network models in your applications, with the help of the practical examples in the book.

Style and approach

A step-by-step guide filled with real-world practical examples.

Page count: 261

Year of publication: 2017



Neural Networks with R


Smart models using CNN, RNN, deep learning, and artificial intelligence principles


Giuseppe Ciaburro
Balaji Venkateswaran


BIRMINGHAM - MUMBAI

Neural Networks with R

 

Copyright © 2017 Packt Publishing

 

 

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

 

 

First published: September 2017

Production reference: 1220917

 

 

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78839-787-2

www.packtpub.com

Credits

Authors

Giuseppe Ciaburro

Balaji Venkateswaran

 

 

Copy Editors

Safis Editing

Alpha Singh

Vikrant Phadkay

Reviewer

Juan Tomás Oliva Ramos

Project Coordinator

Nidhi Joshi

 

 

 

Commissioning Editor

Sunith Shetty

Proofreader

Safis Editing 

Acquisition Editor

Varsha Shetty

Indexer

Mariammal Chettiyar

Content Development Editor

Cheryl Dsa

Graphics

Tania Dutta

Technical Editor

Suwarna Patil

Production Coordinator

Arvindkumar Gupta

About the Authors

Giuseppe Ciaburro holds a master's degree in chemical engineering from Università degli Studi di Napoli Federico II, and a master's degree in acoustic and noise control from Seconda Università degli Studi di Napoli. He works at the Built Environment Control Laboratory of Università degli Studi della Campania "Luigi Vanvitelli".

He has over 15 years of work experience in programming, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in Python and R, and he has extensive experience of working with MATLAB. An expert in acoustics and noise control, Giuseppe has about 15 years of experience in teaching professional computer courses and works as an e-learning author. He has several publications to his credit: monographs, scientific journal articles, and thematic conference papers. He is currently researching machine learning applications in acoustics and noise control.

 

 

 

 

Balaji Venkateswaran is an AI expert, data scientist, machine learning practitioner, and database architect. He has 17+ years of experience in investment banking, payment processing, telecom billing, and project management. He has worked for major companies such as ADP, Goldman Sachs, MasterCard, and Wipro. Balaji is a trainer in data science, Hadoop, and Tableau. He holds a postgraduate degree in business analytics from Great Lakes Institute of Management, Chennai.

Balaji has expertise relating to statistics, classification, regression, pattern recognition, time series forecasting, and unstructured data analysis using text mining procedures. His main interests are neural networks and deep learning.

Balaji holds various certifications in IBM SPSS, IBM Watson, IBM big data architect, cloud architect, CEH, Splunk, Salesforce, Agile CSM, and AWS.

If you have any questions, don't hesitate to message him on LinkedIn (linkedin.com/in/balvenkateswaran); he will be more than glad to help fellow data scientists.

"I would like to thank my parents and the three As in my life - wife Aruna, son Aadarsh and baby Abhitha who have been very supportive in this endeavor. I would also like to thank the staff of Packt publishers who were very helpful throughout the journey."

About the Reviewer

Juan Tomás Oliva Ramos is an environmental engineer from the University of Guanajuato, Mexico, with a master's degree in administrative engineering and quality. He has more than 5 years of experience in the management and development of patents, technological innovation projects, and the development of technological solutions through the statistical control of processes. He has been a teacher of statistics, entrepreneurship, and the technological development of projects since 2011. He became an entrepreneur mentor and started a new department of technology management and entrepreneurship at Instituto Tecnologico Superior de Purisima del Rincon.

Juan is an Alfaomega reviewer and has worked on the book Wearable designs for Smart watches, Smart TVs and Android mobile devices.

He has developed prototypes through programming and automation technologies for the improvement of operations, which have been registered for patents.

 

I want to thank God for giving me the wisdom and humility to review this book. I thank Packt for giving me the opportunity to review this amazing book and to collaborate with a group of committed people. I want to thank my beautiful wife, Brenda; our two magic princesses, Regina and Renata; and our next member, Angel Tadeo; all of you give me the strength, happiness, and joy to start a new day. Thanks for being my family.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review.

If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

Neural Network and Artificial Intelligence Concepts

Introduction

Inspiration for neural networks

How do neural networks work?

Layered approach

Weights and biases

Training neural networks

Supervised learning

Unsupervised learning

Epoch

Activation functions

Different activation functions

Linear function

Unit step activation function

Sigmoid

Hyperbolic tangent

Rectified Linear Unit

Which activation functions to use?

Perceptron and multilayer architectures

Forward and backpropagation

Step-by-step illustration of a neuralnet and an activation function

Feed-forward and feedback networks

Gradient descent

Taxonomy of neural networks

Simple example using R neural net library - neuralnet()

Let us go through the code line-by-line

Implementation using nnet() library

Let us go through the code line-by-line

Deep learning

Pros and cons of neural networks

Pros

Cons

Best practices in neural network implementations

Quick note on GPU processing

Summary

Learning Process in Neural Networks

What is machine learning?

Supervised learning

Unsupervised learning

Reinforcement learning

Training and testing the model

The data cycle

Evaluation metrics

Confusion matrix

True Positive Rate

True Negative Rate

Accuracy

Precision and recall

F-score

Receiver Operating Characteristic curve

Learning in neural networks

Back to backpropagation

Neural network learning algorithm optimization

Supervised learning in neural networks

Boston dataset

Neural network regression with the Boston dataset

Unsupervised learning in neural networks 

Competitive learning

Kohonen SOM

Summary

Deep Learning Using Multilayer Neural Networks

Introduction of DNNs

R for DNNs

Multilayer neural networks with neuralnet

Training and modeling a DNN using H2O

Deep autoencoders using H2O

Summary

Perceptron Neural Network Modeling – Basic Models

Perceptrons and their applications

Simple perceptron – a linear separable classifier

Linear separation

The perceptron function in R

Multi-Layer Perceptron

MLP R implementation using RSNNS

Summary

Training and Visualizing a Neural Network in R

Data fitting with neural network

Exploratory analysis

Neural network model

Classifying breast cancer with a neural network

Exploratory analysis

Neural network model

The network training phase

Testing the network

Early stopping in neural network training

Avoiding overfitting in the model

Generalization of neural networks

Scaling of data in neural network models

Ensemble predictions using neural networks

Summary

Recurrent and Convolutional Neural Networks

Recurrent Neural Network

The rnn package in R

LSTM model

Convolutional Neural Networks

Step #1 – filtering

Step #2 – pooling

Step #3 – ReLU for normalization

Step #4 – voting and classification in the fully connected layer

Common CNN architecture - LeNet

Humidity forecast using RNN

Summary

Use Cases of Neural Networks – Advanced Topics

TensorFlow integration with R

Keras integration with R

MNIST HWR using R

LSTM using the iris dataset

Working with autoencoders

PCA using H2O

Autoencoders using H2O

Breast cancer detection using darch

Summary

Preface

Neural networks are among the most fascinating machine learning models for solving complex computational problems efficiently. They are used to solve a wide range of problems in different areas of AI and machine learning.

This book explains the niche aspects of neural networking and provides you with the foundation to get started with advanced topics. The book begins with neural network design using the neuralnet package; then you'll build solid knowledge of how a neural network learns from data and the principles behind it. This book covers various types of neural networks, including recurrent neural networks and convolutional neural networks. You will not only learn how to train neural networks but also explore their generalization. Later, we will delve into combining different neural network models and work with real-world use cases.

By the end of this book, you will be able to implement neural network models in your applications, with the help of the practical examples in the book.

What this book covers

Chapter 1, Neural Network and Artificial Intelligence Concepts, introduces the basic theoretical concepts of Artificial Neural Networks (ANN) and Artificial Intelligence (AI). It presents simple applications of ANN and AI using mathematical concepts. An introduction to R's ANN functions is also covered.

Chapter 2, Learning Processes in Neural Networks, shows how to perform exact inference in graphical models and shows applications such as expert systems. Inference algorithms are the base components for learning and using these types of models. The reader should at least understand their use and a bit about how they work.

Chapter 3, Deep Learning Using Multilayer Neural Networks, is about understanding deep learning and neural network usage in deep learning. It goes through the details of the implementation using R packages. It covers the many hidden layers set up for deep learning and uses practical datasets to help understand the implementation.

Chapter 4, Perceptron Neural Network – Basic Models, helps understand what a perceptron is and the applications that can be built using it. This chapter covers an implementation of perceptrons using R.

Chapter 5, Training and Visualizing a Neural Network in R, covers another example of training a neural network with a dataset. It also gives a better understanding of neural networks with a graphical representation of input, hidden, and output layers using the plot() function in R.

Chapter 6, Recurrent and Convolutional Neural Networks, introduces Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN) with their implementation in R. Several examples are proposed to understand the basic concepts.

Chapter 7, Use Cases of Neural Networks – Advanced Topics, presents neural network applications from different fields and how neural networks can be used in the AI world. This will help the reader understand the practical usage of neural network algorithms. The reader can enhance his or her skills further by taking different datasets and running the R code.

What you need for this book

This book is focused on neural networks in an R environment. We have used R version 3.4.1 to build various applications and the open source and enterprise-ready professional software for R, RStudio version 1.0.153. We focus on how to utilize various R libraries in the best possible way to build real-world applications. In that spirit, we have tried to keep all the code as friendly and readable as possible. We feel that this will enable our readers to easily understand the code and readily use it in different scenarios.

Who this book is for

This book is intended for anyone who has a statistics background with knowledge in R and wants to work with neural networks to get better results from complex data. If you are interested in artificial intelligence and deep learning and want to level up, then this book is what you need!

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The line in R includes the neuralnet() library in our program."

Any command-line input or output is written as follows:

mydata=read.csv('Squares.csv',sep=",",header=TRUE)
mydata
attach(mydata)
names(mydata)

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "A reference page in the Help browser."

Warnings or important notes appear in a box like this.
Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book-what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you. You can download the code files by following these steps:

1. Log in or register to our website using your email address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Neural-Networks-with-R. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books-maybe a mistake in the text or the code-we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Neural Network and Artificial Intelligence Concepts

From the scientific and philosophical studies conducted over the centuries, special mechanisms have been identified that are the basis of human intelligence. Taking inspiration from their operations, it was possible to create machines that imitate part of these mechanisms. The problem is that they have not yet succeeded in imitating and integrating all of them, so the Artificial Intelligence (AI) systems we have are largely incomplete.

A decisive step in the improvement of such machines came from the use of so-called Artificial Neural Networks (ANNs) which, starting from the mechanisms regulating natural neural networks, aim to simulate human thinking. Software can now imitate the mechanisms needed to win a chess match or to translate text into a different language in accordance with its grammatical rules.

This chapter introduces the basic theoretical concepts of ANN and AI. Fundamental understanding of the following is expected:

Basic high school mathematics, differential calculus, and functions such as sigmoid

R programming and usage of R libraries

We will go through the basics of neural networks and try out one model using R. This chapter is a foundation for neural networks and all the subsequent chapters.

We will cover the following topics in this chapter:

ANN concepts

Neurons, perceptrons, and multilayered neural networks

Bias, weights, activation functions, and hidden layers

Forward and backpropagation methods

A brief overview of Graphics Processing Units (GPUs)

At the end of the chapter, you will be able to recognize the different neural network algorithms and tools which R provides to handle them.

Introduction

The brain is the most important organ of the human body. It is the central processing unit for all the functions performed by us. Weighing only 1.5 kilos, it has around 86 billion neurons. A neuron is a cell that transmits nerve impulses or electrochemical signals. The brain is a complex network in which information is processed through a system of many interconnected neurons. It has always been challenging to understand the brain's functions; however, due to advancements in computing technology, we can now program neural networks artificially.

The discipline of ANNs arose from the idea of mimicking the functioning of the human brain as it tries to solve problems. The drawbacks of conventional approaches have been overcome through their successive application within well-defined technical environments.

AI, or machine intelligence, is a field of study that aims to give cognitive powers to computers, programming them to learn and solve problems. Its objective is to simulate human intelligence in computers. AI cannot imitate human intelligence completely; computers can only be programmed to perform some aspects of what the human brain does.

Machine learning is a branch of AI which helps computers to program themselves based on the input data. Machine learning gives AI the ability to do data-based problem solving. ANNs are an example of machine learning algorithms.

Deep learning (DL) involves complex sets of neural networks with more layers of processing, which develop high levels of abstraction. They are typically used for complex tasks, such as image recognition, image classification, and handwriting recognition.

Many people think that neural networks are difficult to learn, and use them as a black box. This book intends to open the black box and help you learn the internals through implementations in R. With this working knowledge, we can see many use cases where neural networks can be made tremendously useful, as seen in the following image:

Inspiration for neural networks

Neural networks are inspired by the way the human brain works. A human brain can process huge amounts of information using data sent by the human senses (especially vision). The processing is done by neurons, which work on electrical signals passing through them, applying flip-flop logic like the opening and closing of gates for a signal to transmit through. The following image shows the structure of a neuron:

The major components of each neuron are:

Dendrites: Entry points into each neuron, which take input from other neurons in the network in the form of electrical impulses

Cell body: Generates inferences from the dendrite inputs and decides what action to take

Axon terminals: Transmit outputs in the form of electrical impulses to the next neuron

Each neuron processes a signal only if it exceeds a certain threshold. A neuron either fires or does not fire; the output is either 0 or 1.

AI has long been a domain for sci-fi movies and fiction books. ANNs within AI have been around since the 1950s, but they have become more prominent in the past 10 years due to advances in computing architecture and performance. There have been major advancements in computer processing, leading to:

Massive parallelism

Distributed representation and computation

Learning and generalization ability

Fault tolerance

Low energy consumption

In the domains of numerical computation and symbol manipulation, solving problems on top of a centralized architecture, modern-day computers have surpassed humans to a great extent. Where they actually lag behind with such an organizing structure is in the domains of pattern recognition, noise reduction, and optimization. A toddler can recognize his/her mom in a huge crowd, but a computer with a centralized architecture wouldn't be able to do the same.

This is where the biological neural network of the brain has been outperforming machines, and hence the inspiration to develop an alternative loosely held, decentralized architecture mimicking the brain.

ANNs are massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections.

One of the leading global news agencies, The Guardian, used big data to digitize its archives by uploading snapshots of all the archives it had. However, the limitation is that a user cannot easily copy the content and use it elsewhere. To overcome that, one can use an ANN for text pattern recognition to convert the images to text files, and then to any format according to the needs of the end users.

How do neural networks work?

Similar to the biological neuron structure, ANNs define the neuron as a central processing unit, which performs a mathematical operation to generate one output from a set of inputs. The output of a neuron is a function of the weighted sum of the inputs plus the bias. Each neuron performs a very simple operation that involves activating if the total amount of signal received exceeds an activation threshold, as shown in the following figure:

The function of the entire neural network is simply the computation of the outputs of all the neurons, which is an entirely deterministic calculation. Essentially, an ANN is a set of mathematical function approximations. We will now introduce the terminology associated with ANNs:

Input layer

Hidden layer

Output layer

Weights

Bias

Activation functions

Layered approach

Any neural network processing framework has the following architecture:

There is a set of inputs, a processor, and a set of outputs. This layered approach is also followed in neural networks. The inputs form the input layer, the middle layer(s) which performs the processing is called the hidden layer(s), and the output(s) forms the output layer.

Our neural network architectures are also based on the same principle. The hidden layer has the magic to convert the input to the desired output. The understanding of the hidden layer requires knowledge of weights, bias, and activation functions, which is our next topic of discussion.

Weights and biases

Weights in an ANN are the most important factor in converting an input into an output. This is similar to the slope in linear regression, where a weight is multiplied by the input and summed to form the output. Weights are numerical parameters which determine how strongly each neuron affects the others.

For a typical neuron, if the inputs are x1, x2, and x3, then the synaptic weights to be applied to them are denoted as w1, w2, and w3.

The output is the weighted sum:

output = x1*w1 + x2*w2 + x3*w3 = Σ (xi * wi), where i is 1 to the number of inputs.

Simply, this is a matrix multiplication to arrive at the weighted sum.

Bias is like the intercept added in a linear equation. It is an additional parameter which is used to adjust the output along with the weighted sum of the inputs to the neuron.

The processing done by a neuron is thus denoted as:

output = Σ (xi * wi) + bias

A function applied on this output is called an activation function. The input of the next layer is the output of the neurons in the previous layer, as shown in the following image:
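As a minimal sketch in R (with hypothetical input values, weights, and bias), the computation performed by a single neuron looks like this:

```r
# Hypothetical inputs, synaptic weights, and bias
x <- c(0.5, 0.3, 0.2)   # x1, x2, x3
w <- c(0.4, 0.7, 0.2)   # w1, w2, w3
b <- 0.1                # bias

# Weighted sum plus bias; equivalent to the matrix product x %*% w, plus b
z <- sum(x * w) + b     # 0.55

# An activation function (here, the sigmoid) converts z into the neuron's output
sigmoid <- function(v) 1 / (1 + exp(-v))
output <- sigmoid(z)
print(output)
```

The same weighted sum can be obtained with as.numeric(x %*% w) + b, which is the matrix multiplication mentioned above.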

Training neural networks

Training is the act of presenting the network with some sample data and modifying the weights to better approximate the desired function.

There are two main types of training: supervised learning and unsupervised learning.

Supervised learning

We supply the neural network with inputs and the desired outputs. Response of the network to the inputs is measured. The weights are modified to reduce the difference between the actual and desired outputs.

Unsupervised learning

We only supply inputs. The neural network adjusts its own weights, so that similar inputs cause similar outputs. The network identifies the patterns and differences in the inputs without any external assistance.

Epoch

One iteration or pass through the process of providing the network with an input and updating its weights is called an epoch. It is a full run of feed-forward and backpropagation for the update of weights, and one full read-through of the entire dataset.

Typically, many epochs, in the order of tens of thousands at times, are required to train the neural network efficiently. We will see more about epochs in the forthcoming chapters.

Activation functions

The abstraction of the processing of neural networks is mainly achieved through the activation functions. An activation function is a mathematical function which converts the input to an output, and adds the magic of neural network processing. Without activation functions, the working of neural networks would be like that of linear functions. A linear function is one where the output is directly proportional to the input, for example:

y = c * x

A linear function is a polynomial of degree one. Simply, it is a straight line without any curves.

However, most of the problems the neural networks try to solve are nonlinear and complex in nature. To achieve the nonlinearity, activation functions are used. Nonlinear functions are higher-degree polynomial functions, for example:

y = x^2 + 2x + 1

The graph of a nonlinear function is curved and adds the complexity factor.

Activation functions give the nonlinearity property to neural networks and make them true universal function approximators.

Different activation functions

There are many activation functions available for a neural network to use. We shall see a few of them here.

Linear function

The simplest activation function, one that is commonly used for the output layer in neural network problems, is the linear activation function, represented by the following formula:

f(x) = x

The output is the same as the input, and the function is defined in the range (-infinity, +infinity). In the following figure, a linear activation function is shown:
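In R, the linear activation is simply the identity function; a quick sketch:

```r
# Linear (identity) activation: the output equals the input
linear <- function(x) x

linear(-2.5)  # -2.5
linear(3)     # 3
```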

Unit step activation function

A unit step activation function is a much-used feature in neural networks. The output assumes the value 0 for a negative argument and 1 for a positive argument. The function is as follows:

f(x) = 0 if x < 0; f(x) = 1 if x >= 0

The range is between (0, 1) and the output is binary in nature. These types of activation functions are useful for binary classification schemes. When we want to classify an input pattern into one of two groups, we can use a binary classifier with a unit step activation function. A unit step activation function is shown in the following figure:
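A unit step activation can be written in R in one line (a sketch; the convention here assigns 1 to an input of exactly zero):

```r
# Unit step activation: 0 for negative arguments, 1 otherwise
unit_step <- function(x) ifelse(x < 0, 0, 1)

unit_step(c(-1.5, -0.1, 0, 2))  # 0 0 1 1
```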

Sigmoid

The sigmoid function is a mathematical function that produces a sigmoidal curve, a curve characteristic for its S shape. This is one of the earliest and most often used activation functions. It squashes the input to a value between 0 and 1, and makes the model logistic in nature. This function is a special case of the logistic function, defined by the following formula:

f(x) = 1 / (1 + e^(-x))

The following figure shows a sigmoid curve with its S shape:
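The sigmoid is easy to define and plot in base R; a short sketch:

```r
# Sigmoid (logistic) activation: squashes any input into the range (0, 1)
sigmoid <- function(x) 1 / (1 + exp(-x))

sigmoid(0)                         # 0.5
curve(sigmoid, from = -6, to = 6)  # draws the characteristic S shape
```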

Hyperbolic tangent

Another very popular and widely used activation function is the tanh function. If you look at the figure that follows, you can notice that it looks very similar to the sigmoid; in fact, it is a scaled sigmoid function. This is a nonlinear function, defined in the range of values (-1, 1), so you need not worry about activations blowing up. One thing to clarify is that the gradient is stronger for tanh than sigmoid