Exploit TensorFlow's capabilities to build artificial intelligence applications
Artificial Intelligence (AI) is a popular area with an emphasis on creating intelligent machines that can reason, evaluate, and understand in the same way as humans. It is used extensively across many fields, such as image recognition, robotics, language processing, healthcare, finance, and more.
Hands-On Artificial Intelligence with TensorFlow gives you a rundown of essential AI concepts and their implementation with TensorFlow, also highlighting different approaches to solving AI problems using machine learning and deep learning techniques. In addition to this, the book covers advanced concepts, such as reinforcement learning, generative adversarial networks (GANs), and multimodal learning.
Once you have grasped all this, you’ll move on to exploring GPU computing and neuromorphic computing, along with the latest trends in quantum computing. You’ll work through case studies that will help you examine AI applications in the important areas of computer vision, healthcare, and FinTech, and analyze their datasets. In the concluding chapters, you’ll briefly investigate possible developments in AI that we can expect to see in the future.
By the end of this book, you will be well-versed with the essential concepts of AI and their implementation using TensorFlow.
Hands-On Artificial Intelligence with TensorFlow is for you if you are a machine learning developer, data scientist, AI researcher, or anyone who wants to build artificial intelligence applications using TensorFlow. You need to have some working knowledge of machine learning to get the most out of this book.
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editors: Veena Naik
Acquisition Editor: Divya Poojari
Content Development Editors: Rhea Henriques
Technical Editor: Dinesh Chaudhary
Copy Editor: Safis Editing
Project Coordinator: Manthan Patel
Proofreader: Safis Editing
Indexers: Rekha Nair
Graphics: Jisha Chirayil
Production Coordinator: Shantanu Zagade
First published: October 2018
Production reference: 1301018
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-78899-807-9
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Amir Ziai is a senior data scientist at Netflix, where he works on streaming security involving petabyte-scale machine learning platforms and applications. He has worked as a data scientist in AdTech, HealthTech, and FinTech companies. He holds a master's degree in data science from UC Berkeley.
Ankit Dixit is a deep learning expert at AIRA Matrix in Mumbai, India, with seven years of experience in the field of computer vision and machine learning. He is currently working on the development of full-slide medical image analysis solutions at his organization. His work involves the design and implementation of various customized deep neural networks for image segmentation and classification tasks. He has worked with different deep neural network architectures, such as VGG, ResNet, Inception, recurrent neural networks (RNNs), and FRCNN. He holds a master's degree with a specialization in computer vision. He has also authored an AI/ML book.
Luca Massaron is a data scientist and marketing research director specialized in multivariate statistical analysis, machine learning, and customer insight, with over a decade of experience of solving real-world problems and generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of web audience analysis in Italy to achieving the rank of a top-10 Kaggler, he has always been very passionate about every aspect of data and analysis, and also about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, Luca believes that a lot can be achieved in data science just by doing the essentials.
Marvin Bertin is an online course author and technical book editor focused on deep learning, computer vision, and NLP with TensorFlow. He holds a bachelor's in mechanical engineering and a master's in data science. He has worked as a machine learning engineer and a data scientist in the Bay Area, focusing on recommender systems, NLP, and biotech applications. He currently works at a start-up that develops deep learning algorithms for early cancer detection.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Hands-On Artificial Intelligence with TensorFlow
Packt Upsell
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Conventions used
Get in touch
Reviews
Overview of AI
Artificial intelligence
A brief history of AI
The importance of big data
Basic terms and concepts
Understanding ML, DL, and NN
Algorithms
Machine learning (ML)
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Neural network
Deep learning
Natural language processing (NLP)
Transfer learning
Black box
Autonomous
Applications of AI
Recent developments in AI
Deep reinforcement learning
RNNs and LSTMs
Deep learning and convolutional networks
Image recognition
Natural language processing (NLP)
Autonomous driving
Limitations of the current state of AI
Towards artificial general intelligence (AGI)
AGI through reinforcement learning
AGI through transfer learning
Recursive cortical networks (RCNs)
The fusion of AI and neuroscience
Is singularity near?
What exactly is singularity?
The Turing test
When is this supposed to happen?
Summary
TensorFlow for Artificial Intelligence
The basics of TensorFlow
Tensors
Operations
Graphs
Sessions
Placeholders
Variables
Perceptron
Learning the Boolean OR operation
Using the sign function
Introducing a classifier
The perceptron learning algorithm
Perfect separation
Linear separability
Summary
Approaches to Solving AI
Learning types
Supervised learning
Unsupervised learning
Reinforcement learning
Multilayer perceptrons
High-level APIs
Keras
Estimators
Case study – wide and deep learning
TensorBoard
The bias-variance tradeoff
Architectures
Feedforward networks
Convolutional neural networks
Recurrent neural networks
Tuning neural networks
Pipeline
Exploratory data analysis
Preprocessing
Training, test, and validation sets
Metrics
Establishing a baseline
Diagnosing
Optimizing hyperparameters
Grid search
Randomized search
AutoML
Summary
Computer Vision and Intelligence
Computer vision
A step toward self-driving cars
Image classification
Neural network
Convolutional neural networks
Local receptive fields
Shared weights and biases
Pooling layers
 Combining it all together
MNIST digit classification using CNN
 Self-driving cars
Training phase
Data pre-processing
SelfDriveNet – CNN
SelfDriveNet – Training
SelfDriveNet – Testing
Summary
Computational Understanding of Natural Language
Sequence data
Representation
Terminology
The preprocessing pipeline
Segmentation
Tokenization
Sequence of tokens
The bag-of-words model
Normalization
Bringing it all together
CountVectorizer
TfidfVectorizer
spaCy
Modeling
Traditional machine learning
Feedforward neural networks
Embeddings
Recurrent neural networks
LSTM
Bidirectional LSTMs
RNN alternatives
1D ConvNets
The best of both worlds
RNN + 1D convnets
Sequence-to-sequence models
Machine translation
Summary
Reinforcement Learning
The need for another learning paradigm
The exploration-exploitation dilemma
Shortcomings of supervised learning
Time value of rewards
Multi-armed bandits
Random strategy
The epsilon-greedy strategy
Optimistic initial values
Upper confidence bound
Markov decision processes (MDPs)
Value iteration
Q-Learning
Deep Q-Learning
Lunar lander
Random lunar lander
Deep Q-learning lunar lander
Summary
Generative Adversarial Networks
GANs
Implementation of GANs
Real data generator
Random data generator
Linear layer
Generator
Discriminator
GAN
Keep calm and train the GAN
GAN for 2D data generation
MNIST digit dataset
DCGAN
Discriminator
Generator
Transposed convolutions
Batch normalization
GAN-2D
Training the GAN model for 2D data generation
GAN Zoo
BiGAN – bidirectional generative adversarial networks
CycleGAN
GraspGAN – for deep robotic grasping
Progressive growth of GANs for improved quality
Summary
Multimodal Learning
What is multimodal learning?
Multimodal learning
Multimodal learning architecture
Information sources
Feature extractor
Aggregator network
Image caption generator
Dataset
Visual feature extractor
Textual feature extractor
Aggregator model
Caption generator architecture
Importing the libraries
Visual feature extraction
Text processing
Data preparation for training and testing
Caption generator model
Word embedding
Summary
From GPUs to Quantum computing - AI Hardware
Computers – an ordinary tale
A brief history
Central Processing Unit
CPU for machine learning
Motherboard
Processor
 Clock speed
Number of cores
Architecture
RAM
HDD and SSD
Operating system (OS)
Graphics Processing Unit (GPU)
GP-GPUs and NVIDIA CUDA
cuDNN
 ASICs, TPUs, and FPGAs
ASIC
TPU
Systolic array
Field-programmable gate arrays
Quantum computers
Can we really build a quantum computer?
How far are we from the quantum era?
Summary
TensorFlow Serving
What is TensorFlow Serving?
Understanding the basics of TensorFlow Serving
Servables
Servable versions
Models
Sources
Loaders
Aspired versions
Managers
Installing and running TensorFlow Serving
Virtual machines
Containers
Installation using Docker Toolbox
Operations for model serving
Model creation
Saving the model
Serving a model
What is gRPC?
Calling the model server
Running the model from the client side
Summary
AI Applications in Healthcare
The current status of AI in healthcare
The challenges to AI in healthcare
The applications of AI in healthcare 
Disease identification – breast cancer
Dataset
Exploratory data analysis (EDA)
Feature selection
Building a classifier
TensorFlow Estimators
DNN using TensorFlow
Human activity recognition
Dataset
Sensor signals
Feature extraction
Exploratory data analysis
Data preparation
Classifier design
Classifier training
Summary
References
AI Applications in Business
AI applications in business
Blockchain
Centralized ledger
Distributed or decentralized ledger
Miners
Bitcoin
Blockchain and AI
Blockchain and AI use cases
Open market for data
Large-scale data management mechanism
More trustworthy AI modeling and predictions
Control over the usage of data and models
Algorithmic trading
Cryptocurrency price prediction
Dataset
Simple lag model for Bitcoin price prediction
LSTM for Bitcoin price prediction
Pre-processing data 
Building the LSTM classifier
Training the LSTM classifier
Fraud detection
 Credit card fraud detection using autoencoders
AEs
Anomaly detection using AEs
Dataset
AE architecture
Training the AE
Risk management
Credit card default detection
Dataset
Building the classifier
Classifier training
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
Artificial intelligence (AI) is a popular area that focuses on creating intelligent machines that can reason, evaluate, and understand in the same way as humans. It is used extensively across many fields, such as image recognition, robotics, language processing, healthcare, finance, and more.
Hands-On Artificial Intelligence with TensorFlow gives you a rundown of essential AI concepts and their implementation with TensorFlow, also highlighting different approaches to solving AI problems using machine learning and deep learning techniques. In addition to this, the book covers advanced concepts such as reinforcement learning, generative adversarial networks (GANs), and multimodal learning.
Once you have grasped all this, you'll move on to exploring GPU computing and neuromorphic computing, along with the latest trends in quantum computing. You'll work through case studies, which will help you examine AI applications in important areas of computer vision, healthcare, and FinTech, and analyze datasets. In the concluding chapters, you'll briefly investigate possible developments in AI we can expect to see in the future.By the end of this book, you will be well-versed in the essential concepts of AI and their implementation using TensorFlow.
Hands-On Artificial Intelligence with TensorFlow is for you if you are a machine learning developer, data scientist, AI researcher, or anyone who wants to build AI applications using TensorFlow. You need to have some working knowledge of machine learning to get the most out of this book.
Chapter 1, Overview of AI, takes a look at the brief history of AI and compares it to its current self. We will learn about the core concepts of this field. Recent developments and the future of the technology will also be discussed.
Chapter 2, TensorFlow for Artificial Intelligence, introduces you to the TensorFlow library. We will learn how to use it to create a neural network. Other concepts introduced include training a neural network so that it makes predictions; TensorBoard – a visualization library that helps plot data based on the performance of our networks; and Keras – a high-level API that uses TensorFlow as its backend to simplify neural network development.
Chapter 3, Approaches to Solving AI, introduces neural networks and their various types, such as autoencoders, generative adversarial networks, Boltzmann machines, and Bayesian inference.
Chapter 4, Computer Vision and Intelligence, talks about the use of deep learning technology for computer vision tasks such as image classification. The basic concepts of designing a convolutional neural network and using it for image classification will then be applied to implement a self-driving car.
Chapter 5, Computational Understanding of Natural Language, covers natural language processing, recurrent neural networks, and long short-term memory (LSTM) networks. Understanding how to use these networks for text and speech processing is also something that will be discussed, along with sentiment analysis and machine translation.
Chapter 6, Reinforcement Learning, explores the concepts and applications of reinforcement learning, including exploration and exploitation. We will apply TensorFlow reinforcement learning algorithms to various real-world datasets to solve various problems. Applications include game theory, operations research, multi-agent systems, swarm intelligence, and genetic algorithms.
Chapter 7, Generative Adversarial Networks, talks about extending the capabilities of convolutional neural networks (CNNs), which we learned about in Chapter 4, Computer Vision and Intelligence. This will be done by using them to create synthetic images. We will learn how to use a simple CNN to generate images. We will see various types of GANs and their applications.
Chapter 8, Multimodal Learning, introduces the concepts of fusing information from two different domains and using it for interesting applications. We will fuse text and visual information to build an image caption generator that can create a textual description by analyzing an image's content.
Chapter 9, From GPUs to Quantum computing – AI Hardware, discusses the different hardware that can be used for AI application development. We will start with CPUs and extend our topic toward GPUs, ASICs, and TPUs. We will see how hardware technology has evolved along with the software technology.
Chapter 10, TensorFlow Serving, explains how to deploy our trained models on servers so that the majority of people can use our solutions. We will learn about TensorFlow Serving and we will deploy a very simple model on a local server.
Chapter 11, AI Applications in Healthcare, talks about the use of AI in healthcare. We will see various technological advancements that help to provide better healthcare solutions. We will see how we can train a neural network to detect breast cancer, and we will implement a neural network that can track a user's activity through action recognition, using data from smartphone sensors.
Chapter 12, AI Applications in Business, discusses various applications of AI in business. We will also learn about blockchain, which is a state-of-the-art decentralized ledger for recording digital transactions. Then we will learn about algorithmic trading and use it to predict the Bitcoin price. We will see how autoencoders can be used for credit card fraud detection, and we will also learn how AI can help with risk management by identifying potential risks through the analysis of credit card transactions.
You should be able to install various Python packages. The code used in this book is written in Python 3.6, which is compatible with any Python 3.X version. If you are using Python 2.X, you may face some errors during execution.
This book uses TensorFlow GPU version 1.8 to build neural networks. TensorFlow can only run with Python 3.X on Windows machines.
This book also contains code written using the Keras API, which uses TensorFlow as its backend. We are using Keras 2.2 to build the machine learning models.
I suggest you use Anaconda for Python package management: it will help you to install and uninstall packages quite easily.
For dataset-related operations, we are using pandas 0.23.0, NumPy 1.14.3, and scikit-learn 0.19.1.
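If you want to double-check that your environment roughly matches these versions, a short check such as the following can help (a minimal sketch; the package names and versions are the ones listed above, and nearby versions will usually also work):

# Quick environment check for the package versions used in this book.
import sys
import tensorflow as tf
import keras
import pandas as pd
import numpy as np
import sklearn

print("Python       :", sys.version.split()[0])   # expected 3.6.x
print("TensorFlow   :", tf.__version__)           # expected 1.8.x
print("Keras        :", keras.__version__)        # expected 2.2.x
print("pandas       :", pd.__version__)           # expected 0.23.0
print("NumPy        :", np.__version__)           # expected 1.14.3
print("scikit-learn :", sklearn.__version__)      # expected 0.19.1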
You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
1. Log in or register at www.packtpub.com.
2. Select the SUPPORT tab.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Artificial-Intelligence-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
Year 2050, Mars. NASA scientists have successfully established their 25th work station on the east of the planet. Russia, which has already occupied the west of Mars, is becoming increasingly worried. China is launching their third work station on the same land. The mounting tension is leading to another world war, which may be fought on Mars' soil. However, no man has ever landed on the surface of Mars! In this war, there will be no human casualties, there will be no bloodshed. The war will be fought by humanoid bots only, using artificial intelligence (AI).
Sounds like an interesting story? It is inspired by fantasy-based movies, where machines build the ultimate civilization. This civilization would be almost indestructible and made up of super-talented, ageless creatures. These machines can be deployed and programmed to carry out any task, but they can also make their own decisions too, which implies that they are conscious. Plenty of movies and TV series based on subjects such as these have been made.
What do you think – is it possible? Are we, as humans, capable of making such machines in the future? As more and more advancements are made in the technological domain, some of this may now be partially possible; the rest is, as yet, a mystery.
In this chapter, we will take a closer look at the following topics:
The different aspects of AI
The evolution of AI
Future expectations of the technology
Since the creation of machines, their ability to perform tasks has grown exponentially. Nowadays, computers are used in almost every field. They get quicker and perform better as the days go by, only to be accompanied by their ever-decreasing physical size.
AI is a wide field that consists of various subfields, techniques, and algorithms. Scientists are striving to create an AI where the machine is as smart as a human being, which, frankly, is also a possibility. Back in 1956, various great minds from the fields of mathematics and computer science were at Dartmouth with one goal in mind: to program computers in such a manner that they would begin to behave like humans. This was the moment AI was born.
The first computer scientist to use the term AI was John McCarthy, who was a specialist in cognitive science. According to him, AI is a way of making a computer or a piece of software think intelligently, in a manner similar to intelligent humans. AI is related to how the human brain thinks, how it learns, and how we make decisions. We can develop intelligent software systems on the basis of this information.
AI has two main goals:
Creating expert-level systems: To fabricate systems that possess behavior similar to humans and also possess their intelligence. This helps them learn, demonstrate, explain, and advise their users.
Implementing human intelligence in machines: To create systems that can understand, think, learn, and behave like humans.
We already have some computer software that is highly intelligent, such as IBM Watson. This is an AI-based computer program that learns through information from multiple sources such as vision, text, and audio. Watson is currently helping people by providing legal advice and has recently outperformed human doctors in the detection of cancer. This shows that we are moving towards both goals successfully.
As we have discussed, an AI system should be capable of learning behavior, and that learning can be based on one or more of the following areas. We can categorize AI algorithms into six main categories. It is possible to categorize them further, but the following will cover most of them:
Machine learning: This is a subfield of AI that refers to a computer's ability to grasp and learn without being programmed exactly.
Search and optimization: This deals with algorithms for the optimization of functions. An example of this is gradient descent, which iteratively searches for a local minimum (or maximum) of a function; a short sketch of this idea follows this list.
Logical reasoning: An example of logical reasoning would be any computer system that has the ability to mimic the decision-making ability of human beings.
Probabilistic reasoning: This refers to the ability to combine the power of probability theory to manage uncertainty with the power of inferential logic to utilize the structure of a formal argument. The outcome is a more productive and more expressive formalism with a wide range of potential application fields.
Control theory: This refers to controllers that have provable properties. It normally refers to a system of differential equations that describes a physical system, such as a robot or an aircraft.
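To make the search and optimization idea concrete, here is a minimal gradient descent sketch in plain Python. The quadratic function, starting point, and learning rate are illustrative choices rather than anything specific to this book:

def f(x):
    # A simple convex function with its minimum at x = 3.
    return (x - 3.0) ** 2

def grad_f(x):
    # Analytical derivative of f.
    return 2.0 * (x - 3.0)

x = 10.0             # arbitrary starting point
learning_rate = 0.1
for step in range(50):
    x = x - learning_rate * grad_f(x)   # move against the gradient

print("x after descent:", x)   # approaches 3.0
print("f(x):", f(x))           # approaches 0.0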
If you believe that AI is an emerging technology that has developed in recent years, you would be mistaken. AI actually has a long history, which dates back to 400 BC. The four foundations of AI-based thinking were philosophy, mathematics, neuroscience, and economics. After this, of course, it was included in the domain of computer science. The fact that AI is the derivative of so many fields is what makes it so interesting. Let’s take a look at a brief timeline of its evolution:
400 BC: Philosophers began to consider the human mind as a machine. They believed it was able to conceive, learn, and encode knowledge.
250 BC: Ktesibios (Ctesibius) of Alexandria created a water clock. It was used to maintain a constant flow rate with the help of a regulator. It is possible that this was the first self-controlling machine. This would turn out to be an important discovery, as before this, any change in a system needed to be performed by a person.
250 BC to the 18th century: This was quite a quiet period from the perspective of AI. During this time, da Vinci designed a calculator. At the same time, new words such as rationalism, dualism, materialism, empiricism, induction, and confirmation theory came into use. These concepts were very important, but went against traditional thought at the time. Concepts such as decision theory and game theory came far later. Decision theory deals with the study of various strategies that help you to make optimal decisions, keeping various risk and gain factors in mind. Neuroscience came onto the scene around the same time.
The 19th century: Both psychology and mathematics started making progress in the nineteenth century. This was an important time in the history of AI. The nineteenth century also saw the development of Boolean logic, the mathematical formalism for solidifying thoughts that came from the domain of philosophy. AI and computer science found much steadier ground due to this discovery. Another important development was the term cognitive psychology, which views the brain as an information-processing device.
The 20th century: In 1940, Alan Turing and his team created the first operational computer. It was designed to decrypt German messages. A year later, the first programmable computer was built in Germany. Around the same time, the first electronic computer was built. This would be a particularly important moment in the history of mankind, but was interrupted by the outbreak of the Second World War. The late 40s brought the idea of artificially intelligent machines to the public. Many scientists were also working in the field that would later be known as control theory. The focus was to design systems that could reduce errors using the difference between the current output and the target output. Eventually, control theory and AI split ways, even though their founders had quite close connections in their beginnings.
1943–1955: By this time, functions could be computed by a graph or a network made up of connected neurons. Furthermore, logical operations, such as AND, OR, and NOT, could be implemented simply with the help of these networks. In 1952, the program that would learn to play checkers came to be. This proved the simple fact that computers could go beyond doing things that they were precisely told to do.
1956–1966: The summer of 1956 saw a two-month workshop that was held by John McCarthy, a computer scientist at Dartmouth. The invitees included Trenchard More from Princeton, Ray Solomonoff and Oliver Selfridge from MIT, Arthur Samuel from IBM, Allen Newell and Herbert Simon from Carnegie Tech (which would later be known as Carnegie Mellon), Marvin Minsky, Claude Shannon, and Nathaniel Rochester. After the conference, the research group at Dartmouth released some of the ideas discussed during that workshop. Governments started to reserve funds to promote the study of AI. The early neural network experiments conducted by McCulloch and Pitts were improved and expanded, creating the building blocks of modern AI, which we now know as perceptrons (binary classifiers).
1966–1973: AI saw its first setback during this period. People came to the realization that a single perceptron could not be trained to recognize when two of its inputs were different. The answer to this problem was quite simple: they just needed to add more perceptrons, arranged in layers. This insight would come in the eighties.
1969–1980: Past attempts to resolve problems using AI had the tendency to be generalized and on a small scale. Real-world problems were difficult, and the AI computers of the time were not mature enough to deal with them. These approaches were then called weak methods.
1986 to present: During this period, backpropagation was developed. This refers to a way of computing the gradient layer by layer through a network, and it is still in use today. It helped systems to become more accurate and able to learn. People started working more on existing theories than proposing new ones. AI officially adopted scientific methods to solve problems, which allowed researchers to replicate the results of others. In general, this was seen to be a good thing, because new approaches that were repeatable would be taken more seriously. In the 90s, the world saw ever-improving versions of AI that could carry out different tasks. We began teaching cars to drive and have been improving on this ever since. Recommendation systems have become common methods for measuring customer satisfaction, and the word bot entered the common vocabulary. Speech recognition also started improving, and AI systems learned to play chess and other games.
The tricky thing about AI is that if you try to train a model with around 10,000 images, for example, the performance might not be very good. However, if you use 10,000,000 images during training, the performance of your classifier will probably be better than that of a human mind. An example of this is the MNIST digit classification dataset (http://yann.lecun.com/exdb/mnist/). This shows how important big data is to AI.
The graph was created by CBInsights Trends (https://www.cbinsights.com/research/trends/) and basically plots the trends for the usage of specific words or themes. We measured the trends for AI and machine learning. It can be observed that AI started to receive wide recognition by the end of 2012:
The vertical line indicates the real trigger of this new AI era, which was December 4, 2012. A group of researchers presented a whole new neural network architecture at the Neural Information Processing Systems (NIPS) conference. This new architecture was a convolutional neural network, which led them to win first place in the ImageNet classification competition (Krizhevsky et al., 2012). This work greatly improved the classification algorithm: the accuracy increased from 72% to 85%. From this point, the use of neural networks became fundamental to AI.
Within 2 years, the advancements in this field brought a drastic change in the accuracy of classification algorithms in the ImageNet contest, peaking at around 96% in 2014; this is close to the human level of accuracy, which is approximately 95%.
Let me ask you a simple question: what is an unsupervised machine learning algorithm? You're likely to ask what this has to do with AI. You might also ask what an algorithm is, what I mean by unsupervised, and even what machine learning is. This happens to every beginner in the AI domain. People get confused between different terms and then mix them together, changing the meaning of the technology. Let's take a look at the key concepts of AI: machine learning (ML), deep learning (DL) and neural networks (NN).
ML and DL are partially different from AI. The following figure will give you more clarity about these terms:
In the figure, ML algorithms are a subset of AI algorithms. Similarly, DL algorithms are a subset of ML algorithms, and DL is based on NNs. NNs are the heart and soul of all of these terms. They do exactly what we just discussed: learn the relationship between inputs and outputs.
For example, they may map how someone's height and weight relate to the likelihood of them playing basketball. The numbers that make up our inputs and outputs are arranged in vectors, which are rows of numbers. In conclusion, yes, AI is different from ML and DL, but only partially. Almost all AI applications use different kinds of NNs.
To say that AI is built on algorithms is no exaggeration. This leads us to ask: what exactly are algorithms?
Algorithms are mathematical formulas and/or programming commands. They help a computer figure out how to solve problems using AI. Algorithms are nothing but a set of rules that help us teach computers how to figure different things out on their own.
It may look like a lot of numbers and commands, and people are often put off due to the large amount of mathematics and decision theory algorithms involved.
Let's briefly discuss ML, DL, and NNs.
ML is the backbone of AI. In certain cases, we can use the terms AI and machine learning interchangeably. They aren't quite the same, but are deeply connected.
ML is the process where a computer uses algorithms to perform AI functions. Basically, it is the result of applying different rules to create different outcomes through an AI system.
ML consists of four main types of algorithms: supervised learning algorithms, unsupervised learning algorithms, semi-supervised learning (SSL) algorithms, and reinforcement learning algorithms. Let’s discuss each of these in turn.
Training an AI model using a supervised learning method is like asking a kid to take a test but giving them all the answers to the questions beforehand. We provide the machine with the correct answers. This implies that the machine knows the question and its corresponding answer at the same time.
This is a common method of training a system, as it defines patterns between the question and answer. When new data comes without an answer, the machine just tries to map the answer that most correlates to a similar question that it has mapped previously.
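As a toy illustration of this idea, the following sketch (using scikit-learn, with made-up height and weight numbers as the "questions" and made-up labels as the "answers") trains on labelled examples and then maps a new, unseen example to the most similar known case:

from sklearn.neighbors import KNeighborsClassifier

# Each "question" is a pair of numbers (height in cm, weight in kg),
# and each "answer" is a label we provide up front:
# 1 = plays basketball, 0 = does not.
X_train = [[160, 55], [165, 60], [190, 90], [200, 95]]
y_train = [0, 0, 1, 1]

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)           # learn from question/answer pairs

# A new question without an answer: the model maps it to the most similar known case.
print(model.predict([[195, 92]]))     # most likely prints [1]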
The frightening part of AI research is understanding that machines are capable of learning and they use layers upon layers of data and processing capability to perform any task. When it comes to unsupervised learning, we don’t give the answer to the system. We want the machine to find the answer for us. For example, if we are looking into why somebody would choose one brand over another, we simply feed a machine a bunch of data, so that it can find whichever patterns lie in the data.
SSL is another group of AI methods that have become quite popular in the past few months. Conceptually, semi-supervised learning can be considered as a halfway point between unsupervised and supervised learning models. An SSL problem starts with a series of labelled data points and some data points for which labels are not known. The goal of an SSL model is to classify the unlabeled data using the labelled information provided.
AI bears more resemblance to humans than one might think. AI learns in a manner similar to humans. One method for teaching a machine is to use reinforcement learning. This involves giving the AI a goal related to a specific metric, such as telling it to improve its efficiency or find a solution to the problem. Instead of finding one specific answer, the AI will run scenarios and report different results, which are then evaluated by humans, who judge whether they are correct or not. The AI takes the feedback and adjusts the next cycle to achieve better results.
We can create an NN when we want to improve AI. These networks have a lot of similarities with the human nervous system and the brain. They use different stages of learning to give AI the ability to solve complex problems by breaking them into multiple levels of data. The first level of the network may only be concerned with a few pixels in an image file. Once the initial stage is completed, the neural network will pass the information collected to the next level, which will try to understand a few more pixels and perhaps some underlying structures. This process continues through the different levels of the neural network.
DL is what happens when a neural network gets to work. As a set of data is processed, the AI gains a basic understanding about that data. If you are teaching your AI system to understand about cats, for example, it might divide the task into different subtasks. A certain part of the network will learn about cat paws, and another part will learn their facial features, and so on. DL is also considered a hierarchical learning method.
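A minimal sketch of this layer-upon-layer idea in Keras (an illustrative model, not one of the book's case studies) could look like the following, where each layer works on the representation produced by the one before it:

from keras.models import Sequential
from keras.layers import Dense

# A small feedforward network: each hidden layer builds on the
# representation extracted by the layer before it.
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))  # first level of features
model.add(Dense(32, activation='relu'))                      # more abstract features
model.add(Dense(10, activation='softmax'))                    # final decision over 10 classes

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()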
As the AI domain is quite large, there are some more terms that often confuse people, especially when you are dealing with more advanced concepts. Let’s cover these terms briefly.
An advanced neural network is required to understand human language. When an AI is trained to interpret human communication, this process is known as natural language processing (NLP). This is very useful for chat bots and translation services, but it’s also represented at the cutting edge by AI assistants such as Amazon's Alexa and Apple's Siri.
Machines can also learn through transfer learning. Once an AI system has successfully learned something, such as how to determine whether an image is a cat or not, it starts with basic feature understanding, which doesn't necessarily need to be about cats. These features are known as generalized features and can be used for the initialization of a different classifier, such as for a dog classifier. This can save a lot of training time because the dog classifier doesn't have to learn generalized features.
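In practice, this is often done by reusing a pretrained network as a frozen feature extractor and training only a new head. The following Keras sketch uses an ImageNet-pretrained VGG16 as the base (the weights are downloaded on first use); the dog-versus-not-dog head is an illustrative assumption:

from keras.applications import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense

# Pretrained convolutional base: its generalized features were learned on ImageNet.
base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False            # keep the generalized features fixed

# New head for the target task (for example, dog vs. not-dog).
x = Flatten()(base.output)
x = Dense(128, activation='relu')(x)
output = Dense(1, activation='sigmoid')(x)

model = Model(inputs=base.input, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# model.fit(...) would now only update the new head, saving training time.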
When the different rules are applied to an AI system, it carries out a lot of complex mathematics. This is generally beyond the understanding of humans, but the output attained is useful. When this happens, it’s called black box learning. We are more interested in the results the machine arrives at than how it got there.
In simple words, autonomy means an AI system doesn’t need help from people. This term originated primarily from the development of driverless cars. There are different levels of autonomy. Basic intelligence systems are considered to be between autonomy levels one and three.
A vehicle that does not require a steering wheel or pedals is classified as being at autonomy level four. This simply implies that the vehicle doesn’t require a human to control the vehicle at its full capacity. Autonomy level five would be when a vehicle does not require any kind of help from external sources such as servers or GPS.
Anything beyond that would be called an awake machine, with consciousness. Despite the significant progress in the field of AI, singularity, which is a term used to describe an AI system that has become self-aware, is just theoretical.
Right now, AI is dominating most technological domains. Apple’s Siri, Microsoft’s Cortana, and Google’s Google Assistant are some well-known applications of AI. There are numerous other AI applications, a few of which are listed as follows:
Gaming: AI plays a very important role when it comes to strategic games such as chess, poker, tic-tac-toe, and so on.
Natural language processing: The possibility of interacting with a computer is now a reality because it understands natural language, which is the language spoken by humans.
Expert systems: Applications can integrate a machine, some software, and some particular information to impart reasoning and advice to the user in various domains.
Vision systems: These kinds of systems help understand, interpret, and comprehend visual inputs on a computer. Examples can range from images taken by a drone to clinical expert diagnosis.
Speech recognition: There are also intelligent systems that are capable of hearing and converting language to sentences. They can understand a human's meaning and even interpret accents, slang words, any noise in the background, the change in a human’s accent due to a cold, and so on.
Handwriting recognition: Handwriting recognition software helps to read text written on paper by a pen or on screen by a stylus by recognizing the shape and the alignment of letters.
Intelligent robots: Robots are able to perform different tasks ordered by a human. Their sensors allow them to detect physical data from the real world. They are also capable of learning from their mistakes and can adapt to a new environment.
Healthcare: AI has now been successfully deployed in the healthcare domain as well. It can assist doctors in identifying diseases such as cancer or other tumors. Wearable devices also have pretty good hardware sensors and can be utilized as your personal health companion or to help you to control your calorie intake.
FinTech: Nowadays, AI is helping people by providing financial advice. Software can now be used to predict different stock prices or market trends. Another interesting area is fraud detection, where AI is used to identify fraudulent transactions from credit cards. Risk management is a further area in which AI can be useful.
Vehicles: The autonomous car STANLEY finished a 132-mile race in 2005 all by itself. Similarly, OTTO, a self-driving truck company, delivered its first order in 2016 based on AI. Tesla is a prime example of a company that has successfully harnessed AI technology, and its sister company, SpaceX, recently landed a rocket on a drone ship in the ocean after sending several satellites into space.
Planning and scheduling: NASA built a system that could control the scheduling of operations on a spacecraft in the early 2000s. Similarly, Google Assistant is an example of an AI-capable system that can help you schedule meetings with other people.
Remember, these are only some AI applications. There are many more that you might use, depending on what task you want to solve using algorithms.
As discussed earlier, almost all of the listed applications are based on different types of NNs. In the following section, we'll look at an overview of the different kinds of neural networks. We will implement most of these in future chapters for different applications.
AI gradually progressed from a concept to a successfully running system. We have seen that most progress in this field has happened since 2012, after the introduction of DL technology, specifically convolutional NNs. After the introduction of DL, most subsequent developments have been based on similar frameworks. In this section, we will focus on some important research areas where AI really has become a game changer. We will discuss the following topics:
Deep reinforcement learning (DRL)
RNN and LSTM
DL and convolutional networks
Autonomous driving
DRL is an exciting area of modern AI research and the fact that it is applicable to a number of problem areas has made it rather useful. Most scientists look at DRL as a building block for artificial general intelligence (AGI). Its ability to mirror human learning by exploring and receiving feedback from the environment has made it a well-known piece of technology. The recent success of DRL agents means that they are in tough competition with human video game players. The well-known defeat of a Go grandmaster at the hands of DeepMind’s AlphaGo has really astounded people. The demonstration of bipedal agents learning to walk in a simulation has also contributed to the general sense of enthusiasm about this field.
DRL is different from supervised machine learning. When it comes to RL, the system is trained by making an agent interact with the environment. An agent that works according to the user's expectations gets positive feedback. To put it simply, researchers reinforce the agent’s correct behavior.
The following figure shows a typical reinforcement learning system:
The key challenge when it comes to applying DRL to practical problems is the creation of a reward function. It must be designed so that we encourage the desired behaviors without causing any undesirable side effects. An incorrect reward function may result in a number of negative outcomes, including rule-breaking and cheating behaviors.
Recently, there has been a peak of interest when it comes to DRL. This has led to the development of new open source toolkits and environments for training DRL systems. Many of these frameworks are special-purpose simulation tools or interfaces. Let us take a closer look at some toolkits that can be used to develop DRL applications:
OpenAI Gym: OpenAI Gym (https://gym.openai.com/) is a popular toolkit for developing and comparing RL models. Its simulator interface supports a variety of environments, including classic Atari games as well as robotics and physics simulators such as MuJoCo and the DARPA-funded Gazebo. Like other DRL toolkits, it offers APIs to feed observations and rewards back to the agents, as sketched in the brief example after this list.
DeepMind Lab: Google's DeepMind Lab (https://github.com/deepmind/lab) is a 3-D learning environment based on the Quake III first-person shooter video game, which offers navigation and puzzle-solving tasks for learning agents. DeepMind recently added DMLab-30, which is a collection of new levels, and introduced its new distributed agent training architecture, Impala (https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/).
Psychlab: This is another DeepMind toolkit, open sourced this year. Psychlab (https://deepmind.com/blog/open-sourcing-psychlab/) extends DeepMind Lab capability to support cognitive psychology experiments, for example searching an array of items for a specific target or detecting changes in an array of items. It is useful for researchers to compare the performance of human and AI agents on these tasks.
House3D: This is the result of a collaboration between UC Berkeley and Facebook AI researchers. House3D (https://github.com/facebookresearch/House3D) offers over 45,000 simulated indoor scenes that have realistic room and furniture layouts. It is based on the philosophy of concept-driven navigation, which means training an agent to navigate to a specific room in a house with the help of a high-level descriptor such as Kitchen.
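To give a feel for how such a toolkit feeds observations and rewards back to an agent, here is a minimal random agent using the classic OpenAI Gym API (CartPole-v1 is simply a convenient built-in environment; no learning happens yet):

import gym

env = gym.make("CartPole-v1")
observation = env.reset()                 # initial observation from the environment

total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()    # a random action, no learning yet
    observation, reward, done, info = env.step(action)
    total_reward += reward                # the environment feeds rewards back to the agent

print("Episode finished with total reward:", total_reward)
env.close()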
Recurrent neural networks (RNNs) are common NNs that are often used in natural language processing because of their promising results. There are two main approaches that can be followed to use RNNs in language models. The first is making the model predict the sentence. The second is training the model for a specific genre such that it can produce its own text.
Long short-term memory (LSTM) networks are an extended type of RNN. Memory is an important part of LSTM networks, allowing them to remember sequences. This makes them quite popular for sequential data, such as time series, and especially for text-based datasets.
The following figure shows a basic RNN unit:
RNNs are not simple feedforward networks; instead, their architecture contains feedback loops. In the previous figure, you can see a single RNN unit that has three states: the previous state (the preceding word of the sentence), the present state (the current word of the sentence), and the future state (the next word in the sentence). At any given step, the unit has both previous and present knowledge, and on the basis of both of these states, it predicts the next outcome. W is a trainable parameter that is learned during the optimization process.
The following list shows some applications of RNNs:
Language modelling and prediction: The similarity of a word in a sentence is considered for language modelling and prediction. The word at the next time step is predicted using the probabilities computed at the current time step, and memory is used to store the sequence. Where language modelling is concerned, the sequence of words from the data is considered as the input, and the word predicted by the model is the output. During training, the output of the previous time step becomes the input of the present time step.
Speech recognition: For speech recognition, the sound waves are represented as a spectrogram. We need an ANN that is capable of dealing with time series data and that can remember sequential inputs. Two real-life products that use deep speech recognition systems are Google Assistant and Amazon Echo.
Machine translation: Machine translation deals with language translation. The input of the network is the source language and the output is in the target language. It is similar to language modelling, but slightly different, as the output of machine translation starts after the complete input is fed into the network.
Image recognition and characterization: RNNs work with CNNs to recognize an image and generate a description of it. This combination of NNs is known as multimodal learning and generates fascinating results. It is also possible to use the combined model of vision analysis and text generation to align the generated words with features found in the images. We will develop a system for image captioning later in this book.
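To show how an LSTM's memory is typically wired up for sequence tasks like these, here is a minimal Keras sketch for classifying short token sequences; the vocabulary size, sequence length, and sentiment-style output are illustrative assumptions:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

vocab_size = 10000     # number of distinct tokens we expect
max_length = 100       # each input is a sequence of 100 token ids

model = Sequential()
model.add(Embedding(vocab_size, 64, input_length=max_length))  # token ids -> dense vectors
model.add(LSTM(32))                                            # memory over the sequence
model.add(Dense(1, activation='sigmoid'))                      # e.g., positive vs. negative

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()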
Convolutional networks are popular because of their ability to solve complex problems. DL comes from deep structured learning and is alternatively known as hierarchical learning. People often think that deep learning and convolutional neural networks are the same thing, but there is a difference. NNs that have multiple hidden layers, normally more than two, are known as DNNs, while CNNs are a specific kind of DNN with a different neural network architecture. The following figure shows a simple neural network with one hidden layer and a deep neural network with four hidden layers:
The following figure shows the architecture of a CNN, where different kinds of layers are present, including convolutional layers, pooling layers, and dense or fully connected layers; we will be learning more about these networks later in the book:
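A minimal Keras sketch of such a stack of convolutional, pooling, and fully connected layers (with illustrative sizes chosen for small grayscale images) could look like this:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))  # convolutional layer
model.add(MaxPooling2D((2, 2)))                                            # pooling layer
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))            # fully connected (dense) layer
model.add(Dense(10, activation='softmax'))         # one output per class

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()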
Why is DL so important? The most important aspect of DL is feature engineering. Features are numerical descriptions of datasets that help a classifier to understand the difference between different classes. Each class should have different features.
For example, if we want to classify two geometrical shapes, triangles and rectangles, the features that can describe the shapes are the number of edges the object has, the number of corners the object has, and the area or perimeter of the object. This is quite simple.
Now try this: how would you describe the structural difference between the surface of a brick and the surface of grass, besides the color? In this case, it is much more difficult to extract the features that can distinguish both the structures. This is where DL can come in handy.
DL has the ability to extract different levels of features from different hidden layers. For example, the first hidden layer will extract features that show high gradient changes. The next hidden layer will work on those extracted features and try to extract more abstract features from the previous layers. This architecture can be represented as a hierarchy in which the feature extraction in the current layer will depend on the output of the previous layers. This is shown in the following figure:
As you can see, each layer is responsible for extracting different kinds of facial features. This is how a CNN works. We will implement these kinds of networks later in the book, where you will get more information regarding how these work. For now, let's look at some applications of DL.
Owing to how interesting a concept it is, image recognition has also caught the eye of many. In this case, images are represented as pixels in the form of a 2-D array. Each pixel, either with RGB channels or in grayscale, is fed directly into a CNN, which is then trained. The CNN is made up of alternating convolutional, pooling, and subsampling layers. The result is a deep, abstract representation of the image at the output layer. This makes CNNs a powerful tool when it comes to classifying the contents of images. These networks are behind the success of the following applications:
Google Photos: Google Photos is powered by a large-scale CNN that lies in the cloud. It is somewhat similar to Inception (https://cloud.google.com/tpu/docs/inception-v3-advanced). It is run on extremely powerful Google servers that have tensor processing units (TPUs). Google Photos basically scans and tags backed-up images in the cloud. This process is automated, making these images easily accessible and searchable.
Microsoft How-Old: The Microsoft How-Old (https://www.how-old.net/) application tries to determine how old someone is based on their photos. It may not be very accurate, but even humans find this difficult.
Clarifai: A cloud-based image recognition service, Clarifai (https://clarifai.com/) is also well known.
NLP involves two main challenges:
Natural language understanding (NLU): NLU differs from speech recognition in that it is not just about mapping speech into words; it leans more towards extracting meaning from spoken or written words. We want NLU systems to extract meaning from any given conversation or from literature, just as a real human would.
Natural language generation (NLG): NLG is about generating language so that the system can respond to a spoken or written input in a respectable, understandable, and contextually correct manner to another intelligent agent or human. However, to respond, the system needs to go through the knowledge it learned during its training phase in an effective and efficient way. It should then be able to formulate a response. The actual robot voice can then be generated by different generative models. Google's DeepMind project demonstrated these tasks with the help of WaveNet.
With recent developments, autonomous driving is closer to reality than ever. It is no longer a futuristic dream. So many new companies have announced their dedication and commitment to develop and launch high-tech autonomous vehicles.
Autonomous driving is not very common right now and may seem scary to some, but it’s hard to deny the benefits of a driverless vehicle. For starters, the roads would be congestion-free, there would be a reduction of emissions due to efficient driving, more efficient parking, lower costs of transportation, and reduced costs of building new roads and infrastructure. It would also be very helpful for elderly and disabled people.
Have you noticed that most of the systems discussed so far mostly depend on mathematical optimization? These systems learn from labelled datasets. To call a system intelligent, however, it must learn from the environment like humans do. Reinforcement learning is close to the true concept of AI, but most other technologies rely only on supervised learning. NNs roughly mimic the working of the human brain to make machines learn from different examples. DL has helped many technology companies, such as Google and Apple, to improve their products economically by implementing new features such as face recognition, image understanding, language understanding, and so on. Despite its name, however, DL is not real intelligence; it is still a supervised learning algorithm. The field of ML that requires huge datasets to learn either to classify objects or to make predictions is also based on supervised learning methods. These systems have the following limitations:
The thinking of a supervised ML system is always limited to a specific domain only.
The intelligence of this kind of system depends on the training dataset you have used. So, you are the controller, not the machine.
These systems cannot be used in environments that change dynamically.
This method can be used only for classification or regression tasks. It is not possible to use it for control system problems.
Obtaining the huge datasets that these methods require is a major problem.
