Description

This book introduces AI, then explores machine learning, deep learning, natural language processing (NLP), and reinforcement learning. Readers learn about classifiers like logistic regression, k-NN, decision trees, random forests, and SVMs. It delves into deep learning architectures such as CNNs, RNNs, LSTMs, and autoencoders, with Keras-based code samples supplementing the theory.
Starting with a foundational AI overview, the course progresses into machine learning, explaining classifiers and their applications. It continues with deep learning, focusing on architectures like CNNs and RNNs. Advanced topics include LSTMs and autoencoders, essential for modern AI. The book also covers NLP and reinforcement learning, emphasizing their importance.
Understanding these concepts is vital for developing advanced AI systems. This book transitions you from beginner to proficient AI practitioner, combining theoretical knowledge and practical skills. Appendices on Keras, TensorFlow 2, and Pandas enrich the learning experience. By the end, readers will understand AI principles and be ready to apply them in real-world scenarios.

The e-book can be read in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 375

Year of publication: 2024




ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, AND DEEP LEARNING

Oswald Campesato

MERCURY LEARNING AND INFORMATION

Dulles, Virginia

Boston, Massachusetts

New Delhi

 

Copyright © 2020 by MERCURY LEARNING AND INFORMATION LLC.

All rights reserved.

This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.

Publisher: David Pallai

MERCURY LEARNING AND INFORMATION

22841 Quicksilver Drive

Dulles, VA 20166

[email protected]

www.merclearning.com

1-800-232-0223

O. Campesato. Artificial Intelligence, Machine Learning and Deep Learning.

ISBN: 978-1-68392-467-8

The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.

Library of Congress Control Number: 2019957226

Printed on acid-free paper in the United States of America.

Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).

All of our titles are available in digital format at authorcloudware.com and other digital vendors. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.

I’d like to dedicate this book to my parents – may this bring joy and happiness into their lives.

CONTENTS

Preface

Chapter 1: Introduction to AI

What is Artificial Intelligence?

Strong AI versus Weak AI

The Turing Test

Definition of the Turing Test

An Interrogator Test

Heuristics

Genetic Algorithms

Knowledge Representation

Logic-based Solutions

Semantic Networks

AI and Games

The Success of AlphaZero

Expert Systems

Neural Computing

Evolutionary Computation

Natural Language Processing

Bioinformatics

Major Parts of AI

Machine Learning

Deep Learning

Reinforcement Learning

Robotics

Code Samples

Summary

Chapter 2: Introduction to Machine Learning

What is Machine Learning?

Types of Machine Learning

Types of Machine Learning Algorithms

Machine Learning Tasks

Feature Engineering, Selection, and Extraction

Dimensionality Reduction

PCA

Covariance Matrix

Working with Datasets

Training Data Versus Test Data

What Is Cross-validation?

What Is Regularization?

ML and Feature Scaling

Data Normalization vs Standardization

The Bias-Variance Tradeoff

Metrics for Measuring Models

Limitations of R-Squared

Confusion Matrix

Accuracy vs Precision vs Recall

The ROC Curve

Other Useful Statistical Terms

What Is an F1 Score?

What Is a p-value?

What Is Linear Regression?

Linear Regression vs Curve-Fitting

When Are Solutions Exact Values?

What Is Multivariate Analysis?

Other Types of Regression

Working with Lines in the Plane (optional)

Scatter Plots with NumPy and Matplotlib (1)

Why the “Perturbation Technique” Is Useful

Scatter Plots with NumPy and Matplotlib (2)

A Quadratic Scatterplot with NumPy and Matplotlib

The Mean Squared Error (MSE) Formula

A List of Error Types

Non-linear Least Squares

Calculating the MSE Manually

Approximating Linear Data with np.linspace()

Calculating MSE with np.linspace() API

Linear Regression with Keras

Summary

Chapter 3: Classifiers in Machine Learning

What Is Classification?

What Are Classifiers?

Common Classifiers

Binary vs MultiClass Classification

MultiLabel Classification

What Are Linear Classifiers?

What Is kNN?

How to Handle a Tie in kNN

What Are Decision Trees?

What Are Random Forests?

What Are SVMs?

Tradeoffs of SVMs

What Is Bayesian Inference?

Bayes Theorem

Some Bayesian Terminology

What Is MAP?

Why Use Bayes’ Theorem?

What Is a Bayesian Classifier?

Types of Naïve Bayes Classifiers

Training Classifiers

Evaluating Classifiers

What Are Activation Functions?

Why do We Need Activation Functions?

How Do Activation Functions Work?

Common Activation Functions

Activation Functions in Python

Keras Activation Functions

The ReLU and ELU Activation Functions

The Advantages and Disadvantages of ReLU

ELU

Sigmoid, Softmax, and Hardmax Similarities

Softmax

Softplus

Tanh

Sigmoid, Softmax, and HardMax Differences

What Is Logistic Regression?

Setting a Threshold Value

Logistic Regression: Important Assumptions

Linearly Separable Data

Keras, Logistic Regression, and Iris Dataset

Summary

Chapter 4: Deep Learning Introduction

Keras and the XOR Function

What Is Deep Learning?

What Are Hyper Parameters?

Deep Learning Architectures

Problems that Deep Learning Can Solve

Challenges in Deep Learning

What Are Perceptrons?

Definition of the Perceptron Function

A Detailed View of a Perceptron

The Anatomy of an Artificial Neural Network (ANN)

Initializing Hyperparameters of a Model

The Activation Hyperparameter

The Loss Function Hyperparameter

The Optimizer Hyperparameter

The Learning Rate Hyperparameter

The Dropout Rate Hyperparameter

What Is Backward Error Propagation?

What Is a Multilayer Perceptron (MLP)?

Activation Functions

How Are Datapoints Correctly Classified?

A High-Level View of CNNs

A Minimalistic CNN

The Convolutional Layer (Conv2D)

The ReLU Activation Function

The Max Pooling Layer

Displaying an Image in the MNIST Dataset

Keras and the MNIST Dataset

Keras, CNNs, and the MNIST Dataset

Analyzing Audio Signals with CNNs

Summary

Chapter 5: Deep Learning: RNNs and LSTMs

What Is an RNN?

Anatomy of an RNN

What Is BPTT?

Working with RNNs and Keras

Working with Keras, RNNs, and MNIST

Working with TensorFlow and RNNs (Optional)

What Is an LSTM?

Anatomy of an LSTM

Bidirectional LSTMs

LSTM Formulas

LSTM Hyperparameter Tuning

Working with TensorFlow and LSTMs (Optional)

What Are GRUs?

What Are Autoencoders?

Autoencoders and PCA

What Are Variational Autoencoders?

What Are GANs?

Can Adversarial Attacks Be Stopped?

Creating a GAN

A High-Level View of GANs

The VAE-GAN Model

Summary

Chapter 6: NLP and Reinforcement Learning

Working with NLP (Natural Language Processing)

NLP Techniques

The Transformer Architecture and NLP

Transformer-XL Architecture

Reformer Architecture

NLP and Deep Learning

Data Preprocessing Tasks in NLP

Popular NLP Algorithms

What Is an n-gram?

What Is a skip-gram?

What Is BoW?

What Is Term Frequency?

What Is Inverse Document Frequency (idf)?

What Is tf-idf?

What Are Word Embeddings?

ELMo, ULMFit, OpenAI, BERT, and ERNIE 2.0

What Is Translatotron?

Deep Learning and NLP

NLU versus NLG

What Is Reinforcement Learning (RL)?

Reinforcement Learning Applications

NLP and Reinforcement Learning

Values, Policies, and Models in RL

From NFAs to MDPs

What Are NFAs?

What Are Markov Chains?

Markov Decision Processes (MDPs)

The Epsilon-Greedy Algorithm

The Bellman Equation

Other Important Concepts in RL

RL Toolkits and Frameworks

TF-Agents

What Is Deep Reinforcement Learning (DRL)?

Summary

Appendix A: Introduction to Keras

What Is Keras?

Working with Keras Namespaces in TF 2

Working with the tf.keras.layers Namespace

Working with the tf.keras.activations Namespace

Working with the tf.keras.datasets Namespace

Working with the tf.keras.experimental Namespace

Working with Other tf.keras Namespaces

TF 2 Keras versus “Standalone” Keras

Creating a Keras-based Model

Keras and Linear Regression

Keras, MLPs, and MNIST

Keras, CNNs, and cifar10

Resizing Images in Keras

Keras and Early Stopping (1)

Keras and Early Stopping (2)

Keras and Metrics

Saving and Restoring Keras Models

Summary

Appendix B: Introduction to TF 2

What Is TF 2?

TF 2 Use Cases

TF 2 Architecture: The Short Version

TF 2 Installation

TF 2 and the Python REPL

Other TF 2-based Toolkits

TF 2 Eager Execution

TF 2 Tensors, Data Types, and Primitive Types

TF 2 Data Types

TF 2 Primitive Types

Constants in TF 2

Variables in TF 2

The tf.rank() API

The tf.shape() API

Variables in TF 2 (Revisited)

TF 2 Variables vs Tensors

What Is @tf.function in TF 2?

How Does @tf.function Work?

A Caveat About @tf.function in TF 2

The tf.print() Function and Standard Error

Working with @tf.function in TF 2

An Example Without @tf.function

An Example With @tf.function

Overloading Functions with @tf.function

What Is AutoGraph in TF 2?

Arithmetic Operations in TF 2

Caveats for Arithmetic Operations in TF 2

TF 2 and Built-in Functions

Calculating Trigonometric Values in TF 2

Calculating Exponential Values in TF 2

Working with Strings in TF 2

Working with Tensors and Operations in TF 2

Second-Order Tensors in TF 2 (1)

Second-Order Tensors in TF 2 (2)

Multiplying Two Second-Order Tensors in TF 2

Convert Python Arrays to TF Tensors

Conflicting Types in TF 2

Differentiation and tf.GradientTape in TF 2

Examples of tf.GradientTape

Using the watch() Method of tf.GradientTape

Using Nested Loops with tf.GradientTape

Other Tensors with tf.GradientTape

A Persistent Gradient Tape

Google Colaboratory

Other Cloud Platforms

GCP SDK

Summary

Appendix C: Introduction to Pandas

What Is Pandas?

Pandas Dataframes

Dataframes and Data Cleaning Tasks

A Labeled Pandas Dataframe

Pandas Numeric DataFrames

Pandas Boolean DataFrames

Transposing a Pandas Dataframe

Pandas Dataframes and Random Numbers

Combining Pandas DataFrames (1)

Combining Pandas DataFrames (2)

Data Manipulation with Pandas Dataframes (1)

Data Manipulation with Pandas DataFrames (2)

Data Manipulation with Pandas Dataframes (3)

Pandas DataFrames and CSV Files

Pandas DataFrames and Excel Spreadsheets (1)

Pandas DataFrames and Excel Spreadsheets (2)

Reading Data Files with Different Delimiters

Transforming Data with the sed Command (Optional)

Select, Add, and Delete Columns in DataFrames

Pandas DataFrames and Scatterplots

Pandas DataFrames and Histograms

Pandas DataFrames and Simple Statistics

Standardizing Pandas DataFrames

Pandas DataFrames, NumPy Functions, and Large Datasets

Working with Pandas Series

From ndarray

Pandas DataFrame from Series

Useful One-line Commands in Pandas

What Is Jupyter?

Jupyter Features

Launching Jupyter from the Command Line

JupyterLab

Develop JupyterLab Extensions

Summary

Index

PREFACE: THE ML AND DL LANDSCAPE

What Is the Goal?

The goal of this book is to introduce advanced beginners to basic machine learning and deep learning concepts and algorithms. It is intended to be a fast-paced introduction to various “core” features of machine learning and deep learning, with code samples suitable for a university course. The material in the chapters illustrates how to solve some tasks using Keras, after which you can do further reading to deepen your knowledge.

This book will also save you the time required to search for code samples, which can be a time-consuming process. In any case, if you’re not sure whether you can absorb the material presented here, glance through the code samples to get a feel for the level of complexity.

At the risk of stating the obvious, please keep in mind the following point: you will not become an expert in machine learning or deep learning by reading this book.

What Will I Learn from This Book?

The first chapter contains a short introduction to AI. The second chapter introduces machine learning concepts (supervised and unsupervised learning), types of tasks (regression, classification, and clustering), and linear regression (in the second half of the chapter). The third chapter is devoted to classification algorithms, such as kNN, Naïve Bayes, decision trees, random forests, and SVMs (Support Vector Machines).

The fourth chapter introduces deep learning and delves into CNNs (Convolutional Neural Networks). The fifth chapter covers deep learning architectures such as RNNs (Recurrent Neural Networks) and LSTMs (Long Short Term Memory).

The sixth chapter introduces you to aspects of NLP (Natural Language Processing), with some basic concepts and algorithms, followed by RL (Reinforcement Learning) and the Bellman equation. The first appendix covers Keras, the second appendix covers TensorFlow 2, and the third appendix covers Pandas for managing the contents of datasets.

Another point: although Jupyter is popular, all the code samples in this book are Python scripts. However, you can quickly learn the useful features of Jupyter through various online tutorials. It’s also worth looking at Google Colaboratory, which is entirely online, is based on Jupyter notebooks, and offers free GPU usage.

How Much Keras Knowledge Is Needed for this Book?

Some exposure to Keras is helpful, and you can read the appendix if Keras is new to you. If you also want to learn about Keras and logistic regression, there is an example in Chapter 3. This example requires some theoretical knowledge involving activation functions, optimizers, and cost functions, all of which are discussed in Chapter 4.

Please keep in mind that Keras is well-integrated into TensorFlow 2 (in the tf.keras namespace), and it provides a layer of abstraction over “pure” TensorFlow that will enable you to develop prototypes more quickly.
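As a minimal, purely illustrative sketch of that abstraction layer, the snippet below defines and compiles a tiny model entirely through the tf.keras namespace; the layer sizes, optimizer, and loss are arbitrary choices for this example, not settings used elsewhere in the book:

import tensorflow as tf

# A tiny fully connected model defined via the tf.keras namespace.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compiling attaches an optimizer and a loss function to the model.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()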

Do I Need to Learn the Theory Portions of this Book?

Once again, the answer depends on the extent to which you plan to become involved in machine learning. In addition to creating a model, you will use various algorithms to see which ones provide the level of accuracy (or some other metric) that you need for your project. If you fall short, the theoretical aspects of machine learning can help you perform a “forensic” analysis of your model and your data, and ideally assist in determining how to improve your model.

How Were the Code Samples Created?

The code samples in this book were created and tested using Python 3 and the Keras that’s built into TensorFlow 2, on a MacBook Pro with OS X 10.12.6 (macOS Sierra). Regarding their content: the code samples are derived primarily from material the author created for his Deep Learning and Keras graduate course. In some cases, code samples incorporate short sections of code from discussions in online forums. The key point to remember is that the code samples follow the “Four Cs”: they must be Clear, Concise, Complete, and Correct to the extent that it’s possible to do so, given the size of this book.

What Are the Technical Prerequisites for This Book?

You need some familiarity with Python, and you also need to know how to launch Python code from the command line (in a Unix-like environment for Mac users). In addition, a mixture of basic linear algebra (vectors and matrices), probability/statistics (mean, median, standard deviation), and basic concepts in calculus (such as derivatives) will help you master the material. Some knowledge of NumPy and Matplotlib is also helpful, and the assumption is that you are familiar with their basic functionality (such as NumPy arrays).

One other prerequisite is important for understanding the code samples in the second half of this book: some familiarity with neural networks, which includes the concept of hidden layers and activation functions (even if you don’t fully understand them). Knowledge of cross entropy is also helpful for some of the code samples.

What Are the Non-technical Prerequisites for This Book?

Although the answer to this question is more difficult to quantify, it’s very important to have a strong desire to learn about machine learning, along with the motivation and discipline to read and understand the code samples.

Even simple machine learning APIs can be a challenge to understand at first encounter, so be prepared to read the code samples several times.

How Do I Set up a Command Shell?

If you are a Mac user, there are three ways to do so. The first method is to use Finder to navigate to Applications > Utilities and then double click on Terminal. The second method: if you already have a command shell available, you can launch a new command shell by typing the following command:

open /Applications/Utilities/Terminal.app

A third method is to open a new command shell from a command shell that is already visible, simply by pressing command+n in that command shell; your Mac will launch another command shell.

If you are a PC user, you can install Cygwin (open source, https://cygwin.com/), which simulates bash commands, or use another toolkit such as MKS (a commercial product). Please read the online documentation that describes the download and installation process. Note that custom aliases are not automatically set if they are defined in a file other than the main start-up file (such as .bash_login).

Companion Files

All of the code samples and figures in this book may be obtained for download by writing to the publisher at [email protected].

What Are the “Next Steps” after Finishing this Book?

The answer to this question varies widely, mainly because the answer depends heavily on your objectives. The best answer is to try a new tool or technique from the book out on a problem or task you care about, professionally or personally. Precisely what that might be depends on who you are, as the needs of a data scientist, manager, student or developer are all different. In addition, keep what you learned in mind as you tackle new challenges.

O. Campesato
San Francisco, CA

CHAPTER 1

INTRODUCTION TO AI

This chapter provides a gentle introduction to AI, primarily as a broad overview of this diverse topic. Unlike the other chapters in this book, this introductory chapter is “light” in terms of technical content. However, it’s easy to read and worth skimming through its contents. Machine learning and deep learning are briefly introduced toward the end of this chapter, and both are discussed in more detail in subsequent chapters.

Keep in mind that many AI-focused books tend to discuss AI from the perspective of computer science and a discussion of traditional algorithms and data structures. By contrast, this book treats AI as an “umbrella” for machine learning and deep learning, and therefore it’s discussed in a cursory manner as a precursor to the other chapters.

The first part of this chapter starts with a discussion regarding the term artificial intelligence, various potential ways to determine the presence of intelligence, as well as the difference between Strong AI and Weak AI. You will also learn about the Turing Test, which is a well-known test for intelligence.

The second part of this chapter discusses some AI use cases and the early approaches to neural computing, evolutionary computation, NLP, and bioinformatics.

The third part of this chapter introduces you to major subfields of AI, which include natural language processing (with NLU and NLG), machine learning, deep learning, reinforcement learning, and deep reinforcement learning.

Although code-specific samples are not discussed in this chapter, the companion files for this chapter do contain a Java-based code sample for solving the Red Donkey problem, and also a Python-based code sample (that requires Python 2.x) for solving Rubik’s Cube.

What Is Artificial Intelligence?

The literal meaning of the word artificial is synthetic, which often has a negative connotation of being an inferior substitute. However, artificial objects (e.g., flowers) can closely approximate their counterparts, and sometimes they can be advantageous when they do not have any maintenance requirements (sunshine, water, and so forth).

By contrast, a definition for intelligence is more elusive than a definition of the word artificial. R. Sternberg, in a text on human consciousness, provides the following useful definition: “Intelligence is the cognitive ability of an individual to learn from experience, to reason well, to remember important information, and to cope with the demands of daily living.”

You probably remember standardized tests with questions that ask for the next number in a given sequence, such as 1, 3, 6, 10, 15, 21. The first thing to observe is that the gap between successive numbers increases by one: from 1 to 3, the increase is two, whereas from 3 to 6, it is three, and so on. Based on this pattern, the plausible response is 28. Such questions are designed to measure our proficiency at identifying salient features in patterns.

Incidentally, there can be multiple answers to a “next-in-sequence” numeric problem. For example, the sequence 2, 4, 8 might suggest 16 as the next number in this sequence, which is correct if the generating formula is 2^n. However, if the generating formula is 2^n + (n-1)*(n-2)*(n-3), then the next number in the sequence is 22 (not 16). There are many formulas that can match 2, 4, and 8 as the initial sequence of numbers, and yet the next number can be different from 16 or 22.
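To make this concrete, here is a small Python check of the two candidate formulas mentioned above; the snippet only illustrates the point that several formulas can agree on the first few terms and then diverge.

# Two candidate generating formulas for the sequence that starts 2, 4, 8.
def f1(n):
    return 2 ** n

def f2(n):
    return 2 ** n + (n - 1) * (n - 2) * (n - 3)

# Both formulas agree on the first three terms (n = 1, 2, 3) ...
print([f1(n) for n in range(1, 5)])  # [2, 4, 8, 16]
# ... but they diverge at the fourth term.
print([f2(n) for n in range(1, 5)])  # [2, 4, 8, 22]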

Let’s return to R. Sternberg’s definition for intelligence, and consider the following questions:

How do you decide if someone (something?) is intelligent?

Are animals intelligent?

If animals are intelligent, how do you measure their intelligence?

We tend to assess people’s intelligence through interaction with them: we ask questions and observe their answers. Although this approach is indirect, we often rely on it to gauge other people’s intelligence.

In the case of animal intelligence, we also observe their behavior to make an assessment. Clever Hans was a famous horse that lived in Berlin, Germany, circa 1900, and allegedly had a proficiency in arithmetic, such as adding numbers and calculating square roots.

In reality, Hans was able to identify human emotions and, in conjunction with his astute hearing, he could sense the reaction of audience members as Hans came closer to a correct answer. Interestingly, Hans performed poorly without the presence of an audience. You might be reluctant to attribute Clever Hans’s actions to intelligence; however, review Sternberg’s definition before reaching a conclusion.

As another example, some creatures exhibit intelligence only in groups. Although ants are simple insects, and their isolated behavior would hardly warrant inclusion in a text on AI, ant colonies exhibit extraordinary solutions to complex problems. In fact, ants can figure out the optimal route from a nest to a food source, how to carry heavy objects, and how to form bridges. Thus, a collective intelligence arises from effective communication among individual insects.

Brain mass and the brain-to-body mass ratio are indicators of intelligence, and dolphins compare favorably with humans in both metrics. Breathing in dolphins is under voluntary control, which could account for excess brain mass, as well as the fact that alternate halves of a dolphin’s brain take turns sleeping. Dolphins score well on animal self-awareness tests such as the mirror test, in which they recognize that the image in the mirror is actually their own image. They can also perform complex tricks, as visitors to Sea World can testify. This illustrates the ability of dolphins to remember and perform complex sequences of physical motions.

The use of tools is another litmus test for intelligence and is often used to separate Homo erectus from earlier ancestors of human beings. Dolphins also share this trait with humans: dolphins use deep-sea sponges to protect their snouts while foraging for food. Thus, intelligence is not an attribute possessed by humans alone. Many living forms possess some degree of intelligence.

Now consider the following question: can inanimate objects, such as computers, possess intelligence? The declared goal of artificial intelligence is to create computer software and/or hardware systems that exhibit thinking comparable to that of humans, in other words, to display characteristics usually associated with human intelligence.

What about the capacity to think, and can machines think? Keep in mind the distinction between thinking and intelligence. Thinking is the facility to reason, analyze, evaluate, and formulate ideas and concepts. Therefore, not every being capable of thinking is intelligent. Intelligence is perhaps akin to efficient and effective thinking.

Many people approach this issue with biases, saying that computers are made of silicon and power supplies and therefore are not capable of thinking. At the other extreme, some say that computers perform much faster than humans and therefore must be more intelligent than humans. The truth is most likely somewhere between these two extremes. As we have discussed, different animal species possess intelligence to varying degrees. However, we are more interested in a test to ascertain the existence of machine intelligence than in developing standardized IQ tests for animals. Perhaps Raphael put it best: artificial intelligence is the science of making machines do things that would require intelligence if done by man.

Strong AI versus Weak AI

Currently there are two main camps regarding AI. The weak AI approach is associated with the Massachusetts Institute of Technology: it views any system that exhibits intelligent behavior as an example of AI. This camp focuses on whether a program performs correctly, regardless of whether the artifact performs its task in the same way that humans do. The results of AI projects in electrical engineering, robotics, and related fields are primarily concerned with satisfactory performance.

The other approach to AI, known as strong AI or biological plausibility, is associated with Carnegie-Mellon University. According to this approach, when an artifact exhibits intelligent behavior, its performance should be based upon the same methodologies used by humans. For instance, consider a system capable of hearing: proponents of strong AI might aim to achieve success by simulating the human hearing system, including equivalents of the cochlea, hearing canal, eardrum, and other parts of the ear, each performing its required task in the system, whereas weak AI proponents would be concerned merely with the system’s performance.

Hence, proponents of weak AI measure the success of the systems that they build based on their performance alone. They maintain that the raison d’etre of AI research is to solve difficult problems regardless of how they are actually solved.

On the other hand, proponents of strong AI are concerned with the structure of the systems they build. They maintain that, by sheer dint of the heuristics, algorithms, and knowledge embodied in AI programs, computers can attain a sense of consciousness and intelligence. As you know, Hollywood has produced various movies (e.g., I, Robot and Blade Runner) that belong to the strong AI camp.

The Turing Test

The previous section posed three questions, and the first two questions have already been addressed: how do you determine intelligence, and are animals intelligent? The answer to the second question is not necessarily yes or no. Some people are smarter than others and some animals are smarter than others. The question of machine intelligence is equally problematic.

Alan Turing sought to answer the question of intelligence in operational terms. He wanted to separate functionality (what something does) from implementation (how something is built). He devised something that’s called the Turing Test, which is discussed in the next section.

Definition of the Turing Test

Alan Turing proposed two imitation games, in which one person or entity behaves as if he were another. In the first game, a person (called an interrogator) is in a room with a curtain that runs across the center of the room. On the other side of the curtain is a person, and the interrogator must determine whether it is a man or a woman. The interrogator (whose gender is irrelevant) accomplishes this task by asking a series of questions.

This game assumes that the man will perhaps lie in his responses, but the woman is always truthful. In order that the interrogator cannot determine gender from voice, communication is via computer rather than through spoken words. If it is a man on the other side of the curtain, and he is successful in deceiving the interrogator, then he wins the imitation game.

In Turing’s original format for this test, both a man and a woman were seated behind a curtain and the interrogator had to identify both correctly. Turing might have based this test on a game that was popular during this period, which may even have been the impetus behind his machine intelligence test.

Additional interesting updates regarding the Turing test are discussed in these two links:

https://futurism.com/the-byte/scientists-invented-new-turing-test

https://theconversation.com/our-turing-test-for-androids-will-judge-how-lifelike-humanoid-robots-can-be-120696

In case you didn’t already know, Erich Fromm was a well-known sociologist and psychoanalyst in the twentieth century who believed that men and women are equal but not necessarily the same. For instance, the genders might differ in their knowledge of colors, flowers, or the amount of time spent shopping. What does distinguishing a man from a woman have to do with the question of intelligence? Turing understood that there might be different types of thinking, and it is important to both understand these differences and to be tolerant of them.

An Interrogator Test

This second game is more appropriate to the study of AI. Once again, an interrogator is in a room with a curtain. This time, a computer or a person is behind the curtain, and the machine plays the role of the male and could also find it convenient on occasion to lie.

The person, on the other hand, is consistently truthful. The interrogator asks questions and then evaluates the responses to determine whether she is communicating with a person or a machine. If the computer is successful in deceiving the interrogator, it passes the Turing Test and is thereby considered intelligent.

Heuristics

Heuristics can be very useful, and AI applications often rely on the application of heuristics. A heuristic is essentially a “rule of thumb” for solving a problem. In other words, a heuristic is a set of guidelines that often works to solve a problem. Contrast a heuristic with an algorithm, which is a prescribed set of rules to solve a problem and whose output is entirely predictable.

A heuristic is a technique for finding an approximate solution that can be used when other methods are too time-consuming or too complex (or both). With a heuristic, a favorable outcome is likely but not guaranteed, and heuristic methods were especially popular in the early days of AI.

Various heuristics appear in daily life. For example, many people prefer using heuristics instead of asking for driving directions. For instance, when exiting a highway at night, it’s sometimes difficult to find the route back to the main thoroughfare. One heuristic that could prove helpful is to proceed in the direction with more streetlights whenever you come to a fork in the road. You might have a favorite ploy for recovering a dropped contact lens or for finding a parking space in a crowded shopping mall. Both are examples of heuristics.
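For a more computational example, consider visiting a set of locations and returning quickly with a “good enough” route: the nearest-neighbor rule sketched below (with made-up coordinates) is a classic heuristic that usually produces a reasonable tour but, unlike an exact algorithm, does not guarantee the optimal one.

import math

# Made-up city coordinates for illustration.
cities = {"A": (0, 0), "B": (1, 5), "C": (4, 1), "D": (6, 4)}

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nearest_neighbor_tour(start):
    """Greedy heuristic: always visit the closest unvisited city next."""
    tour, unvisited = [start], set(cities) - {start}
    while unvisited:
        current = cities[tour[-1]]
        nearest = min(unvisited, key=lambda name: distance(current, cities[name]))
        tour.append(nearest)
        unvisited.remove(nearest)
    return tour

print(nearest_neighbor_tour("A"))  # a good tour, though not necessarily the optimal one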

AI problems tend to be large and computationally complex, and frequently they cannot be solved via straightforward algorithms. AI problems and their domains tend to embody a large amount of human expertise, especially if tackled by strong AI methods. Some types of problems are better solved using AI, whereas others are more suitable for traditional computer science approaches involving simple decision-making or exact computations to produce solutions. Let us consider a few examples:

Medical diagnosis

Shopping using a cash register with barcode scanning

ATMs

Two-person games such as chess and checkers

Medical diagnosis is a field of science that has benefited for many years from AI-based contributions, particularly through the development of expert systems. Expert systems are typically built in domains where there is considerable human expertise and where there exist many rules that are often of the form: if-condition-then-action. As a trivial example: if you have a headache, then take two aspirins and call me in the morning.

In particular, expert systems became very popular (and very useful) because they can store far more rules than humans can hold in their head. Expert systems are among the most successful AI techniques for producing results that are comprehensive and effective. In fact, expert systems can help humans make more accurate decisions (and even “challenge” incorrect choices).
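To make the if-condition-then-action idea concrete, here is a toy rule base in Python; the rules and facts are invented for illustration, and a real expert system would use a dedicated shell and inference engine rather than a simple loop:

# A toy rule base: each rule is a (condition, action) pair.
rules = [
    (lambda facts: "headache" in facts,
     "take two aspirins and call me in the morning"),
    (lambda facts: "fever" in facts and "rash" in facts,
     "refer the patient to a specialist"),
]

def fire_rules(facts):
    """Return the actions of every rule whose condition holds for the given facts."""
    return [action for condition, action in rules if condition(facts)]

print(fire_rules({"headache"}))       # ['take two aspirins and call me in the morning']
print(fire_rules({"fever", "rash"}))  # ['refer the patient to a specialist']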

Genetic Algorithms

One promising paradigm is Darwin’s theory of evolution, which involves natural selection that occurs in nature over a timescale of thousands or millions of years. By contrast, evolution inside a computer proceeds much faster than natural selection.

A genetic algorithm is a heuristic that “mimics” the process of natural selection, which involves selecting the fittest individuals for reproduction to sire the offspring of the subsequent generation.

Let’s compare and contrast the use of AI with the process of evolution in the plant and animal world, in which species adapt to their environments through the genetic operators of natural selection, reproduction, mutation, and recombination.

Genetic algorithms (GA) are a specific methodology from the general field known as evolutionary computation, which is that branch of AI wherein proposed solutions to a problem adapt much as animal creatures adapt to their environments in the real world.

In case you’re interested, the following link contains some interesting details regarding genetic algorithms:

https://en.wikipedia.org/wiki/Genetic_algorithm
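The following sketch shows the main loop of a genetic algorithm for a deliberately simple, hypothetical task: evolving a bit string of all 1s, where fitness is just the number of 1 bits. The population size, mutation rate, and other parameters are arbitrary choices for illustration.

import random

GENES, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 50, 0.02

def fitness(individual):
    # Toy fitness: the number of 1 bits in the bit string.
    return sum(individual)

def crossover(a, b):
    # Single-point recombination of two parents.
    point = random.randint(1, GENES - 1)
    return a[:point] + b[point:]

def mutate(individual):
    # Flip each bit with a small probability.
    return [1 - g if random.random() < MUTATION_RATE else g for g in individual]

population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Selection: keep the fitter half of the population as parents.
    parents = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    # Reproduction: fill the next generation with mutated offspring of random parents.
    population = [mutate(crossover(*random.sample(parents, 2))) for _ in range(POP_SIZE)]

print("best fitness:", fitness(max(population, key=fitness)))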

Knowledge Representation

The issue of representation becomes important when we consider AI-related problems. AI systems that acquire and store knowledge in order to process it and produce intelligent results also need the ability to identify and represent that knowledge. The choice of a representation is intrinsic to the nature of problem solving and understanding.

As George Polya (a famous mathematician) remarked, a good representation choice is almost as important as the algorithm or solution plan devised for a particular problem. Good and natural representations facilitate fast and comprehensible solutions.

As an example of a representation choice, consider the well-known Missionaries and Cannibals Problem, where the goal is to transfer three missionaries and three cannibals from the west bank to the east bank of a river with a boat. At any point during the transitions from west to east, you can see the solution path by selecting an appropriate representation. There are two constraints in this problem: the boat can hold no more than two people at any time, and the cannibals on either bank can never outnumber the missionaries on that bank.

A solution for this problem (as well as the related “jealous husbands” problem) is here:

https://en.wikipedia.org/wiki/Missionaries_and_cannibals_problem#targetText=The%20missionaries%20and%20cannibals%20problem,an%20example%20of%20problem%20representation
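To illustrate how the choice of representation shapes the problem, each state of the puzzle can be encoded as a tuple (missionaries on the west bank, cannibals on the west bank, side of the boat). The sketch below is an illustration only; it checks whether a state is legal, which is the core test in any solver for this puzzle.

# A state is (missionaries_west, cannibals_west, boat_side), with boat_side in {"west", "east"}.
TOTAL_M, TOTAL_C = 3, 3

def is_legal(state):
    m_west, c_west, _ = state
    m_east, c_east = TOTAL_M - m_west, TOTAL_C - c_west
    # Counts must be in range.
    if not (0 <= m_west <= TOTAL_M and 0 <= c_west <= TOTAL_C):
        return False
    # Missionaries may never be outnumbered on a bank that has any missionaries.
    if m_west > 0 and c_west > m_west:
        return False
    if m_east > 0 and c_east > m_east:
        return False
    return True

print(is_legal((3, 3, "west")))  # True: the start state
print(is_legal((1, 2, "west")))  # False: missionaries outnumbered on the west bank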

Logic-based Solutions

AI researchers have used a logic-based approach as a knowledge representation and problem-solving technique. A seminal example of using logic for this purpose is Terry Winograd’s Blocks World (1972), in which a robot arm interacts with blocks on a tabletop. This program encompassed issues of language understanding and scene analysis as well as other aspects of AI.

In addition, production rules and production systems are used to construct many successful expert systems. The appeal of production rules and expert systems is based on the feasibility of representing heuristics clearly and concisely. Thousands of expert systems have been built incorporating this methodology.

Semantic Networks

Semantic networks are another graphical, though complex, representation of knowledge. Semantic networks precede object-oriented languages, which use inheritance (wherein an object from a particular class inherits many of the properties of a superclass).

Much of the work employing semantic networks has focused on representing the knowledge and structure of language. Examples include Stuart Shapiro’s SNePS (Semantic Net Processing System) and the work of Roger Schank in natural language processing.
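A minimal way to experiment with the inheritance idea behind semantic networks is to encode “is-a” links and property tables directly in Python; the nodes and properties below are invented for illustration.

# A tiny semantic network: "is-a" links plus local properties.
is_a = {"canary": "bird", "penguin": "bird", "bird": "animal"}
properties = {
    "animal": {"breathes": True},
    "bird": {"has_wings": True, "can_fly": True},
    "penguin": {"can_fly": False},   # a local property overrides the inherited one
}

def lookup(node, prop):
    """Walk up the is-a chain until the property is found (or the chain ends)."""
    while node is not None:
        if prop in properties.get(node, {}):
            return properties[node][prop]
        node = is_a.get(node)
    return None

print(lookup("canary", "can_fly"))   # True, inherited from "bird"
print(lookup("penguin", "can_fly"))  # False, overridden locally
print(lookup("canary", "breathes"))  # True, inherited from "animal"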

Additional alternatives exist for knowledge representation: graphical approaches offer greater appeal to the senses, such as vision, space, and motion. Possibly the earliest graphical approaches were state-space representations, which display all the possible states of a system.

AI and Games

Since the middle of the twentieth century and the advent of computers, significant progress in computer science and programming techniques has been made through the challenge of training computers to play and master complex board games. Games whose computer play has benefited from the application of AI insights and methodologies include chess, checkers, Go, and Othello.

Games have spurred the development of, and interest in, AI. Early work was highlighted by the efforts of Arthur Samuel in 1959 on the game of checkers. His program was based on tables of fifty heuristics and was used to play against different versions of itself. The losing program in a series of matches would adopt the heuristics of the winning program. It played strong checkers, but never mastered the game.

People have been trying to train machines to play strong chess for several centuries. The infatuation with chess machines probably stems from the generally accepted view that it requires intelligence to play chess well.

In 1959, Newell, Simon, and Shaw developed the first real chess program, which followed the Shannon-Turing Paradigm. Richard Greenblatt’s program was the first to play club-level chess. Computer chess programs improved steadily in the 1970s until, by the end of that decade, they reached the Expert level (equivalent to the top 1% of chess tournament players).

In 1983, Ken Thompson’s Belle was the first program to officially achieve the Master level. This was followed by the success of Hitech, from Carnegie-Mellon University, which successfully accomplished a major milestone as the first Senior Master (over 2400-rated) program. Shortly thereafter the program Deep Thought (also from Carnegie-Mellon) was developed and became the first program capable of beating Grandmasters on a regular basis.

Deep Thought evolved into Deep Blue when IBM took over the project in the 1990s, and Deep Blue played a six-game match with World Champion Garry Kasparov, who saved mankind by winning a match in Philadelphia in 1996. In 1997, however, against Deeper Blue, the successor of Deep Blue, Kasparov lost, and the chess world was shaken.

In subsequent six-game matches against Kasparov, Kramnik, and other World Championship-level players, programs have fared well, but these were not World Championship Matches. Although it is generally agreed that these programs might still be slightly inferior to the best human players, most would be willing to concede that top programs play chess indistinguishably from the most accomplished humans (if one is thinking of the Turing Test).

In 1989, Jonathan Schaeffer, at the University of Alberta in Edmonton, began his long-term goal of conquering the game of checkers with his program Chinook. In a forty-game match in 1992 against longtime Checkers World Champion Marion Tinsley, Chinook lost four, with thirty-four draws. In 1994 their match was tied after six games, when Tinsley had to forfeit because of health reasons. Since that time, Schaeffer and his team have been working to solve checkers from both the end of the game (all endings with eight pieces or fewer) as well as from the beginning.

Other games that use AI techniques include backgammon, poker, bridge, Othello, and Go (often called the new drosophila).

The Success of AlphaZero

Google created AlphaZero, an AI-based software program that used self-play to learn how to play games. AlphaZero is the successor to AlphaGo, which defeated the world’s best human Go player in 2016. AlphaZero easily defeated AlphaGo in the game of Go.

Moreover, after learning the rules of chess, AlphaZero trained itself (again using self-play) and within a single day became the top chess player in the world. AlphaZero can defeat any human chess player as well as any chess-playing computer program.

The really interesting point is that AlphaZero developed its own strategy for playing chess, which not only differs from humans, but also involves chess moves that are considered counterintuitive.

Unfortunately, AlphaZero is unable to tell us how it developed a strategy that is superior to any previously developed approach for playing chess. Since AlphaZero is 100% self-taught and is the top-ranked chess player in the world, does AlphaZero qualify as intelligent?

Expert Systems

Expert systems are one of the areas that have been investigated for almost as long as AI itself has existed, and they are one discipline that AI can claim as a great success. Expert systems have many characteristics that make them desirable for AI research and development. These include the separation of the knowledge base from the inference engine, being more than the sum of any or all of their experts, the relationship of knowledge to search techniques, reasoning, and uncertainty.

One of the earliest and most often referenced systems was Heuristic DENDRAL. Its purpose was to identify unknown chemical compounds on the basis of their mass spectrographs. DENDRAL was developed at Stanford University with the goal of performing a chemical analysis of Martian soil. It was one of the first systems to illustrate the feasibility of encoding domain-expert knowledge in a particular discipline.

Perhaps the most famous expert system is MYCIN, also from Stanford University (1984). MYCIN was developed to facilitate the investigation of infectious blood diseases. Even more important than its domain, however, was the example that MYCIN established for the design of all subsequent knowledge-based systems. It had over 400 rules, which were eventually used to provide a training dialogue for residents at the Stanford hospital.

In the 1970s, PROSPECTOR (also at Stanford University) was developed for mineral exploration. PROSPECTOR was also an early and valuable example of the use of inference networks.

Other famous and successful systems that followed in the 1970s were XCON (with some 10,000 rules), which was developed to help configure electrical circuit boards on VAX computers; GUIDON, a tutoring system that was an offshoot of Mycin; TEIRESIAS, a knowledge acquisition tool for Mycin; and HEARSAY I and II, the premier examples of speech understanding using the Blackboard Architecture.

The AM (Automated Mathematician) system of Doug Lenat was another important result of research and development efforts in the 1970s, as was the Dempster-Shafer theory for reasoning under uncertainty, together with Zadeh’s work in fuzzy logic.

Since the 1980s, thousands of expert systems have been developed in such areas as configuration, diagnosis, instruction, monitoring, planning, prognosis, remedy, and control. Today, in addition to stand-alone expert systems, many expert systems have been embedded into other software systems for control purposes, including those in medical equipment and automobiles (for example, when should traction control engage in an automobile?).

In addition, many expert system shells, such as Emycin, OPS, EXSYS, and CLIPS, have become industry standards. Many knowledge representation languages have also been developed. Today, numerous expert systems work behind the scenes to enhance day-to-day experiences, such as the online shopping cart.

Neural Computing

McCulloch and Pitts conducted early research in neural computing because they were trying to understand the behavior of animal nervous systems. Their model of artificial neural networks (ANN) had one serious drawback: it did not include a mechanism for learning.

Frank Rosenblatt developed an iterative algorithm known as the Perceptron Learning Rule for finding the appropriate weights in a single-layer network (a network in which all neurons are directly connected to the inputs). Research in this burgeoning discipline was severely hindered by the pronouncement by Minsky and Papert that certain problems, such as the exclusive OR (XOR) function, could not be solved by single-layer perceptrons. Federal funding for neural network research was severely curtailed immediately after this proclamation.
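The Perceptron Learning Rule is short enough to sketch directly. The code below (with an arbitrary learning rate and epoch count) learns the OR function, but, as Minsky and Papert observed, no single-layer perceptron of this form can learn XOR.

def train_perceptron(samples, epochs=20, lr=0.1):
    # samples: list of ((x1, x2), target) pairs; w[0] is the bias weight.
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0
            error = target - output
            # Perceptron learning rule: adjust weights in proportion to the error.
            w[0] += lr * error
            w[1] += lr * error * x1
            w[2] += lr * error * x2
    return w

def predict(w, x1, x2):
    return 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0

OR_DATA  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_or = train_perceptron(OR_DATA)
print([predict(w_or, x1, x2) for (x1, x2), _ in OR_DATA])    # matches OR: [0, 1, 1, 1]
w_xor = train_perceptron(XOR_DATA)
print([predict(w_xor, x1, x2) for (x1, x2), _ in XOR_DATA])  # never matches XOR, regardless of epochs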

The field witnessed a second flurry of activity in the early 1980s with the work of Hopfield. His asynchronous network model (Hopfield networks) used an energy function to approximate solutions to NP-complete problems.

The mid-1980s also witnessed the discovery of back propagation (usually called backprop), a learning algorithm appropriate for multilayered networks. Back propagation-based networks are routinely employed to predict Dow Jones averages and to read printed material in optical character recognition systems.

Neural networks are also used in control systems. ALVINN was a project at Carnegie Mellon University in which a back propagation network senses the highway and assists in the steering of a Navlab vehicle. One immediate application of this work was to warn a driver impaired by lack of sleep, excess of alcohol, or other conditions whenever the vehicle strayed from its highway lane. Looking toward the future, it is hoped that, someday, similar systems will drive vehicles so that we are free to read newspapers and talk on our cell phones to take advantage of the extra free time.

Evolutionary Computation

Genetic algorithms are more generally classified as evolutionary computation. Genetic algorithms use probability and parallelism to solve combinatorial problems (also called optimization problems), an approach developed by John Holland.

However, evolutionary computation is not solely concerned with optimization problems. Rodney Brooks was formerly the director of the MIT Computer Science and AI Laboratory. His approach to the successful creation of a human-level Artificial Intelligence, which he aptly cites as the holy grail of AI research, renounces reliance on the symbol-based approach. This latter approach relies upon the use of heuristics and representational paradigms.

In his view, intelligent systems can be designed in multiple layers in which higher leveled layers rely upon those layers beneath them. For example, if you wanted to build a robot capable of avoiding obstacles, the obstacle avoidance routine would be built upon a lower layer, which would merely be responsible for robotic locomotion.

Brooks maintains that intelligence emerges through the interaction of an agent with its environment. He is perhaps most well known for the insectlike robots built in his lab that embody this philosophy of intelligence, wherein a community of autonomous robots interact with their environment and with each other.

Natural Language Processing

If we wish to build intelligent systems, it seems natural to ask that our systems possess a language-understanding facility. This is an axiom that was well understood by many early practitioners. Eliza is one well-known early application program, which was developed by Joseph Weizenbaum, an MIT computer scientist who worked with Kenneth Colby (a Stanford University psychiatrist).

Eliza was intended to imitate the role played by a psychiatrist of the Carl Rogers school. For instance, if the user typed in “I feel tired,” Eliza might respond with, “You say you feel tired. Tell me more.” The “conversation” would continue in this manner, with the machine contributing little or nothing in terms of originality to the dialogue. A live psychoanalyst might behave in this fashion in the hope that the patient would discover their true (perhaps hidden) feelings and frustrations. Meanwhile, Eliza is merely using pattern matching to feign humanlike interaction.

Another early effort was a back propagation application that learned the correct pronunciation of English text. It was claimed to pronounce English sounds with 95% accuracy. Obviously, problems arose because of inconsistencies inherent in the pronunciation of English words, such as rough and through, and the pronunciation of words derived from other languages, such as pizza and fizzy.

Terry Winograd wrote another well-known program, SHRDLU, which was named after the second set of letters of the pair ETAOIN SHRDLU, the most frequently used letters in the English language as arranged on linotype machines.
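Returning to Eliza: this style of pattern matching can be approximated in a few lines of Python. The patterns below are invented for illustration and are far simpler than Weizenbaum’s original script.

import re

# A few invented Eliza-style patterns: (regular expression, response template).
PATTERNS = [
    (re.compile(r"i feel (.*)", re.IGNORECASE), "You say you feel {0}. Tell me more."),
    (re.compile(r"i am (.*)", re.IGNORECASE), "How long have you been {0}?"),
]

def respond(sentence):
    for pattern, template in PATTERNS:
        match = pattern.match(sentence.strip().rstrip("."))
        if match:
            return template.format(match.group(1))
    return "Please go on."   # the default, content-free reply

print(respond("I feel tired"))         # You say you feel tired. Tell me more.
print(respond("I am worried"))         # How long have you been worried?
print(respond("The weather is nice"))  # Please go on.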