A practical guide to AI applications using simple Python and Matlab scripts
Machine Learning and AI with Simple Python and Matlab Scripts: Courseware for Non-computing Majors introduces basic concepts and principles of machine learning and artificial intelligence to help readers develop skills applicable to many popular topics in engineering and science. Step-by-step instructions for simple Python and Matlab scripts mimicking real-life applications usher readers into the magical world of AI without requiring advanced math or computational skills. The book is supported by instructor-only lecture slides and sample exams with multiple-choice questions.
Machine Learning and AI with Simple Python and Matlab Scripts includes information on artificial neural networks, convolutional and recurrent neural networks, genetic algorithms and evolutionary methods, with applications ranging from stock trading and Alzheimer's disease prognosis to speech recognition, chatbots, machine translation and drone flight control.
Machine Learning and AI with Simple Python and Matlab Scripts is an accessible, thorough, and practical learning resource for undergraduate and graduate students in engineering and science programs along with professionals in related industries seeking to expand their skill sets.
Page count: 445
Publication year: 2025
Cover
Table of Contents
Title Page
Copyright
Dedication
About the Author
Preface
Acknowledgments
About the Companion Website
1 Introduction
1.1 Artificial Intelligence
1.2 A Historical Perspective
1.3 Principles of AI
1.4 Applications That Are Impossible Without AI
1.5 Organization of This Book
2 Artificial Neural Networks
2.1 Introduction
2.2 Applications of ANNs
2.3 Components of ANNs
2.4 Training an ANN
2.5 Forward Propagation
2.6 Back Propagation
2.7 Updating Weights
2.8 ANN with Input Bias
2.9 A Simple Algorithm for ANN Training
2.10 Computational Complexity of ANN Training
2.11 Normalization of ANN Inputs and Outputs
2.12 Concluding Remarks
2.13 Exercises for Chapter 2
3 ANNs for Optimized Prediction
3.1 Introduction
3.2 Selection of ANN Inputs
3.3 Selection of ANN Outputs
3.4 Construction of Hidden Layers
3.5 Case Study 1: Sleep‐Study Example
3.6 Case Study 2: Prediction of Bike Rentals
3.7 Concluding Remarks
3.8 Exercises for Chapter 3
4 ANNs for Financial Stock Trading
4.1 Introduction
4.2 Programs that Buy and Sell Stocks
4.3 Technical Indicators
4.4 A Simple Algorithmic Trading Policy
4.5 A Simple ANN for Algorithmic Stock Trading
4.6 Python Script for Stock Trading Using an ANN
4.7 Matlab Script for Stock Trading Using an ANN
4.8 Concluding Remarks
4.9 Exercises for Chapter 4
5 ANNs for Alzheimer's Disease Prognosis
5.1 Introduction
5.2 Alzheimer's Disease
5.3 A Simple ANN for AD Prognosis
5.4 Python Script for AD Prognosis Using an ANN
5.5 Matlab Script for AD Prognosis Using an ANN
5.6 Concluding Remarks
5.7 Exercises for Chapter 5
6 ANNs for Natural Language Processing
6.1 Introduction
6.2 Impact of Text Messages on Stock Markets
6.3 A Simple ANN for NLP
6.4 Python Script for NLP Using an ANN
6.5 Matlab Script for NLP Using an ANN
6.6 Concluding Remarks
6.7 Exercises for Chapter 6
7 Convolutional Neural Networks
7.1 Introduction
7.2 Variations of CNNs
7.3 Applications of CNNs
7.4 CNN Components
7.5 A Numerical Example of a CNN
7.6 Computational Cost of CNN Training
7.7 Concluding Remarks
7.8 Exercises for Chapter 7
8 CNNs for Optical Character Recognition
8.1 Introduction
8.2 A Simple CNN for OCR
8.3 Organization of Training and Reference Files
8.4 Python Script for OCR Using a CNN
8.5 Matlab Script for OCR Using a CNN
8.6 Concluding Remarks
8.7 Exercises for Chapter 8
9 CNNs for Speech Recognition
9.1 Introduction
9.2 A Simple CNN for Speech Recognition
9.3 Organization of Training and Reference Files
9.4 Python Script for Speech Recognition Using a CNN
9.5 Matlab Script for Speech Recognition Using a CNN
9.6 Concluding Remarks
9.7 Exercises for Chapter 9
10 Recurrent Neural Networks
10.1 Introduction
10.2 One‐to‐One Single RNN Cell
10.3 A Numerical Example
10.4 Multiple Hidden Layers
10.5 Embedding Layer
10.6 Concluding Remarks
10.7 Exercises for Chapter 10
11 RNNs for Chatbot Implementation
11.1 Introduction
11.2 Many‐to‐Many RNN Architecture
11.3 A Simple Chatbot
11.4 Python Script for a Chatbot Using an RNN
11.5 Matlab Script for a Chatbot Using an RNN
11.6 Concluding Remarks
11.7 Exercises for Chapter 11
12 RNNs with Attention
12.1 Introduction
12.2 One‐to‐One RNN Cell with Attention
12.3 Forward and Back Propagation
12.4 A Numerical Example
12.5 Embedding Layer
12.6 A Numerical Example with Embedding
12.7 Concluding Remarks
12.8 Exercises for Chapter 12
13 RNNs with Attention for Machine Translation
13.1 Introduction
13.2 Many‐to‐Many Architecture
13.3 Python Script for Machine Translation by an RNN‐Att
13.4 Matlab Script for Machine Translation by an RNN‐Att
13.5 Concluding Remarks
13.6 Exercises for Chapter 13
14 Genetic Algorithms
14.1 Introduction
14.2 Genetic Algorithm Elements
14.3 A Simple Algorithm for a GA
14.4 An Example of a GA
14.5 Convergence in GAs
14.6 Concluding Remarks
14.7 Exercises for Chapter 14
15 GAs for Dietary Menu Selection
15.1 Introduction
15.2 Definition of the KP
15.3 A Simple Algorithm for the KP
15.4 Variations of the KP
15.5 GAs for KP Solution
15.6 Python Script for Dietary Menu Selection Using a GA
15.7 Matlab Script for Dietary Menu Selection Using a GA
15.8 Concluding Remarks
15.9 Exercises for Chapter 15
Note
16 GAs for Drone Flight Control
16.1 Introduction
16.2 UAV Swarms
16.3 UAV Flight Control
16.4 A Simple GA for UAV Flight Control
16.5 Python Script for UAV Flight Control Using a GA
16.6 Matlab Script for UAV Flight Control Using a GA
16.7 Concluding Remarks
16.8 Exercises for Chapter 16
17 GAs for Route Optimization
17.1 Introduction
17.2 Definition of the TSP
17.3 A Simple Algorithm for the TSP
17.4 Variations of the TSP
17.5 GA Solution for the TSP
17.6 Python Script for Route Optimization Using a GA
17.7 Matlab Script for Route Optimization Using a GA
17.8 Concluding Remarks
17.9 Exercises for Chapter 17
18 Evolutionary Methods
18.1 Introduction
18.2 Particle Swarm Optimization
18.3 Differential Evolution
18.4 Grammatical Evolution
Appendix A: ANNs with Bias
A.1 Introduction
A.2 Training with Bias Input
A.3 Forward Propagation
Appendix B: Sleep Study ANN with Bias
B.1 Inclusion of Bias Term in ANN
Appendix C: Back Propagation in a CNN
Appendix D: Back Propagation Through Time in an RNN
D.1 Back Propagation in an RNN
D.2 Embedding Layer
Appendix E: Back Propagation Through Time in an RNN with Attention
E.1 Back Propagation in an RNN‐Att
E.2 Embedding Layer
Bibliography
Index
End User License Agreement
Chapter 10
Table 10.1 Previous and current inputs and expected outputs for the simplis...
Chapter 12
Table 12.1 Previous and current inputs and expected outputs with their one‐...
Chapter 2
Figure 2.1 An ANN with two hidden layers.
Figure 2.2 An ANN neuron employing summation and activation functions.
Figure 2.3 (a) Sigmoid activation function, and (b) its derivative.
Figure 2.4 (a) Rectilinear activation function, and (b) its derivative.
Figure 2.5 An ANN with two inputs, one hidden layer and one output, whose ag...
Figure 2.6 Example ANN with two inputs, one hidden layer and one output (Ada...
Figure 2.7 A neuron with two inputs...
Figure 2.8 Example ANN with two inputs with bias, one hidden layer and one o...
Figure 2.9 A simple algorithm describing the ANN training process.
Chapter 3
Figure 3.1 Data samples for the sleep‐study example: for each sample, the in...
Figure 3.2 A fully‐connected ANN for the sleep‐study example using one hidde...
Figure 3.3 (a) Sample data used for sleep‐study example, and (b) screenshot ...
Figure 3.4 Sample screenshot generated by the Matlab script for the sleep‐st...
Figure 3.5 ANN training data with five inputs and one output to be used for ...
Figure 3.6 A fully connected ANN for bike rental example using two hidden la...
Figure 3.7 Field data used for training the ANN shown in Figure 3.6.
Figure 3.8 Python script for bike rentals example predicts 822 bike rentals ...
Figure 3.9 Matlab script for bike rentals given above predicts 928 rentals f...
Chapter 4
Figure 4.1 NYSE trading floor: (a) when trading was done by stock brokers (C...
Figure 4.2 SMA smooths out fluctuations in daily stock movements.
Figure 4.3 Momentum of price movements of S&P...
Figure 4.4 EMA is more sensitive to price changes than SMA, as shown here fo...
Figure 4.5 Bollinger bands for a fictitious stock computed using 20‐day SMA a...
Figure 4.6 50‐day and 200‐day SMA price curves for Bitcoin (taken from [50] ...
Figure 4.7 Simplistic ANN with two hidden layers for stock price prediction....
Figure 4.8 Sample tick data for Microsoft stock.
Figure 4.9 Sample output of the Python script.
Figure 4.10 Sample output generated by the Matlab script presented above, sh...
Chapter 5
Figure 5.1 Gene expressions are selected as ANN inputs; a few of them are sh...
Figure 5.2 Slope of MMSE scores are selected as the ANN outputs; a sample su...
Figure 5.3 ANN architecture 120‐60‐15‐5‐1, where the inputs are the expressi...
Figure 5.4 Sample error rates generated by the Python script for four patien...
Figure 5.5 Results generated by the Python script implementing an ANN with 1...
Figure 5.6 Sample results generated by the Matlab script implementing an ANN...
Chapter 6
Figure 6.1 NLP algorithms can explore possible associations between various ...
Figure 6.2 Keywords selected for the NLP example.
Figure 6.3 Subset of tweets used in forming the ANN inputs for the NLP examp...
Figure 6.4 Sample tweets given in Figure 6.3 are processed so that the numbe...
Figure 6.5 Sample historical tick data used in forming ANN inputs and output...
Figure 6.6 Two input examples for an ANN implementing the NLP example: the f...
Figure 6.7 Fully‐connected ANN architecture to be used for NLP implementatio...
Figure 6.8 Output generated by a sample run of the Python script: out of 100...
Figure 6.9 Output generated by a sample run of the Matlab script: out of 100...
Chapter 7
Figure 7.1 Feature recognition by the visual hierarchy of brain cortices (ad...
Figure 7.2 Main components of a CNN for colored images: one convolution, poo...
Figure 7.3 A simple algorithm for training a CNN.
Figure 7.4 Layers of a simple CNN.
Figure 7.5 Two classes of images that the example CNN is expected to recogni...
Figure 7.6 Two filters defined for the example application of a CNN.
Figure 7.7 Example input image to be identified as either an X or an O by th...
Figure 7.8 Convolutions with Filter 1.
Figure 7.9 Convolutions with Filter 1 (continued).
Figure 7.10 Convolutions with Filter 2.
Figure 7.11 Convolutions with Filter 2 (continued).
Figure 7.12 RELU operations: (a) form matrices, (b) apply RELU to ...
Figure 7.13 PL operations, where a pooling size of ...
Figure 7.14 Unrolling layer operations: (a) ...
Figure 7.15 Fully‐connected ANN used in CNN for this example with ...
Figure 7.16 CNN convolutional layer – dimensions of input and output matrice...
Figure 7.17 Zero padding prevents loss of information during convolution ope...
Figure 7.18 CNN pooling layer input and output matrix dimensions.
Figure 7.19 CNN pooling layer output and unrolling layer input matrices.
Figure 7.20 CNN unrolling layer outputs as ANN inputs.
Figure 7.21 A numerical example for the computational cost of CNN training f...
Chapter 8
Figure 8.1 Essential information for OCR applications in MNIST website: (a) ...
Figure 8.2 CNN example for the simple OCR application presented in this chap...
Figure 8.3 Image dimensions as it is processed through CNN layers: An input ...
Figure 8.4 In this sample implementation, images used for training the CNN a...
Figure 8.5 In this sample implementation, when an input image is identified,...
Figure 8.6 Sample run of the Python script given in Section 8.4, where a tra...
Figure 8.7 Creation of reference and sample training images using
create_new
...
Figure 8.8 A sample run of the Matlab scripts given in Section 8.5: (a) a CN...
Chapter 9
Figure 9.1 A speech spectrogram; each black pixel represents the frequencies...
Figure 9.2 A typical CNN architecture example for speech recognition applica...
Figure 9.3 Spectrogram dimensions as it is processed through CNN layers: The...
Figure 9.4 Audio files stored in the TrainingSet folder are used to train th...
Figure 9.5 The ReferenceSet folder has sub‐folders corresponding to the diff...
Chapter 10
Figure 10.1 Input, hidden and output layers of an RNN and a recurrent connec...
Figure 10.2 A one‐to‐one single RNN cell: (a) compressed, and (b) unfolded v...
Figure 10.3 Input, weight and output vector dimensions in a one‐to‐one singl...
Figure 10.4 Basic parts of a one‐to‐one RNN with multiple hidden layers (com...
Figure 10.5 A one‐to‐one RNN cell with two hidden layers.
Figure 10.6 Vector dimensions in the one‐input to one‐output two‐cell RNN sh...
Figure 10.7 Vector dimensions in a one‐to‐one RNN with one embedding layer a...
Chapter 11
Figure 11.1 A many‐to‐many RNN with multiple hidden layers used in a chatbot...
Figure 11.2 A subset of samples used for training in Python and Matlab imple...
Figure 11.3 Weight matrix dimensions in a many‐to‐many RNN with 256 hidden l...
Figure 11.4 Weight matrix dimensions in a many‐to‐many RNN with 256 hidden l...
Figure 11.5 Sample outputs generated by the Python script presented in Secti...
Figure 11.6 Sample outputs generated by the Matlab script presented in Secti...
Chapter 12
Figure 12.1 A one‐to‐one single RNN cell with attention.
Figure 12.2 Four steps of a one‐to‐one single RNN‐Att cell, where input ...
Figure 12.3 A one‐to‐one single RNN‐Att cell with an embedding layer.
Figure 12.4 Four steps of a one‐to‐one single RNN‐Att cell using an embeddin...
Chapter 13
Figure 13.1 English to Polish translation example, where the Polish sentence...
Figure 13.2 A many‐to‐many RNN‐Att with multiple hidden layers used for a ma...
Figure 13.3 A many‐to‐many RNN‐Att with multiple hidden and embedding layers...
Figure 13.4 A subset of training data for our RNN‐Att implementing a simplis...
Figure 13.5 Sample responses generated by the Python script presented in Sec...
Figure 13.6 Sample responses generated by Matlab script presented in Section...
Chapter 14
Figure 14.1 Examples of fitness functions and individuals with good and bad ...
Figure 14.2 Basic operations of a GA.
Figure 14.3 A simple GA implementation.
Figure 14.4 An example of a single‐point crossover operation in a GA.
Figure 14.5 An example of a two‐point crossover operation in a GA.
Figure 14.6 Initial population and selection of parents in a GA.
Figure 14.7 Reproduction of offspring in a GA.
Figure 14.8 Selection in a GA.
Chapter 15
Figure 15.1 Example menu items for a fictitious movie theatre menu [110]; An...
Figure 15.2 A simple example of the KP – given a knapsack of 15 kg capacity,...
Figure 15.3 Possible item selections to be placed into the knapsack shown in...
Figure 15.4 Solutions for KP shown in Figure 15.2.
Figure 15.5 A simple algorithm to find the best combination of items for KP....
Figure 15.6 Possible selections of items to be placed into the knapsack show...
Figure 15.7 Example list of food items and their calorie and protein values....
Figure 15.8 Sample runs of Python script presented in Section 15.6: (a) 277 ...
Figure 15.9 Parameters for GA implementation in Matlab.
Figure 15.10 Sample runs of the Matlab script presented in Section 15.7 gene...
Figure 15.11 KP exercises: (a) information for shopping planning, (b) inform...
Chapter 16
Figure 16.1 Examples of swarms by different‐sized drones: (a) swarm of 103 s...
Figure 16.2 A simple algorithm for GA‐based UAV flight control.
Figure 16.3 Node i and its two near neighbours, namely nodes j and k; node m...
Figure 16.4 FGA operation for node ...
Figure 16.5 Chromosome for UAV movement: (a) the most significant two bits f...
Figure 16.6 Enumeration of two‐dimensional movement directions by a chromoso...
Figure 16.7 Displacement vectors ‐ examples in 2D movement.
Figure 16.8 Node displacement for each possible chromosome value: computatio...
Figure 16.9 Final positions of a swarm of 10 UAVs guided by the Python scrip...
Figure 16.10 Final positions of a swarm of 10 UAVs guided by FGA‐based fligh...
Chapter 17
Figure 17.1 Example of the TSP: (a) finding the shortest possible loop that ...
Figure 17.2 Example of the TSP: (a) the goal is to find a loop that connects...
Figure 17.3 Covering cities and towns in Sweden and Germany with near‐optima...
Figure 17.4 A near‐optimal tour of 3100 county seats in the USA is 93 466 mi...
Figure 17.5 TSP solution attempts to find the shortest cycle starting and fi...
Figure 17.6 A simple algorithm to find the lowest‐cost Hamiltonian cycles in...
Figure 17.7 Cycles starting and finishing in Detroit: (a) TSP cycle (all cit...
Figure 17.8 An example graph, where each undirected edge is associated with ...
Figure 17.9 Two candidate solutions for the TSP presented in Figure 17.8.
Figure 17.10 Covering a given set of cities as a TSP example: (a) set of cit...
Figure 17.11 Connectivity matrix for the cities shown in Figure 17.10.
Figure 17.12 (a) Crossover applied to two parents generates invalid offsprin...
Figure 17.13 (a) Mutation operator generates an invalid offspring with dupli...
Figure 17.14 Sample runs of the Python script presented in Section 17.6: (a)...
Figure 17.15 Sample runs of the Matlab script presented in Section 17.7: the...
Figure 17.16 (a) Caribbean nations, (b) former British Commonwealth countrie...
Chapter 18
Figure 18.1 A simple implementation of PSO.
Figure 18.2 Rosenbrock function as a benchmark for PSO.
Figure 18.3 Numerical example for PSO using Rosenbrock function: (a) PSO par...
Figure 18.4 The fitness values and personal and global best positions for al...
Figure 18.5 New positions of the five particles (note that, for brevity, the...
Figure 18.6 The fitness values and personal and global best positions for al...
Figure 18.7 New positions of the five particles.
Figure 18.8 A simple algorithm implementing DE (here smaller fitness values ...
Figure 18.9 Examples of generating a trial solution – all possible cases.
Figure 18.10 (a) Benchmark for DE: finding maximum of sinc function and (b) ...
Figure 18.11 (a) Initial population of DE and (b) the vectors depicting the ...
Figure 18.12 Candidate positions at generation 1...
Figure 18.13 Positions of candidates for maximizing the sinc function: (a) i...
Figure 18.14 Candidate positions at generation 2 ...
Figure 18.15 Positions of candidates for maximizing the sinc function: (a) a...
Figure 18.16 Rosenbrock function as a benchmark for PSO.
Figure 18.17 Santa Fe trail: there are 89 pieces of food on a 32x32 toroidal...
Figure 18.18 An example BNF for the Santa Fe trail [107].
Figure 18.19 Best‐performing output program generated using the BNF grammar ...
Figure 18.20 A simple BNF for an artificial ant.
Figure 18.21 The genotype and the corresponding expression tree generated us...
Figure 18.22 Steps (a)–(h): Generating a phenotype from a given genotype usi...
Figure 18.23 Steps (i)–(m): Phenotype generation continued from Figure 18.22...
Figure 18.24 An example chromosome with arbitrarily large codon values which...
Figure 18.25 Computing the fitness of the program given in the top‐left corn...
Appendix A
Figure A.1 A neuron with bias input at the output layer of an ANN.
Figure A.2 Summation and activation in hidden‐layer neurons.
Figure A.3 Example ANN with two inputs with bias, one hidden layer and one o...
Figure A.4 A neuron with two inputs ...
Appendix B
Figure B.1 A simple ANN with bias input for sleep‐study example.
Figure B.2 A neuron with bias during back propagation for sleep‐study exampl...
Appendix C
Figure C.1 Fully‐connected ANN used in CNN for this example with ...
Figure C.2 Mapping maximum values into corresponding regions in input image ...
IEEE Press
445 Hoes Lane, Piscataway, NJ 08854
IEEE Press Editorial Board
Sarah Spurgeon, Editor in Chief
Moeness Amin
Ekram Hossain
Desineni Subbaram Naidu
Jón Atli Benediktsson
Brian Johnson
Tony Q. S. Quek
Adam Drobot
Hai Li
Behzad Razavi
James Duncan
James Lyke
Thomas Robertazzi
Joydeep Mitra
Patrick Chik Yue
M. Ümit Uyar
The City College of New York, New York, USA
Copyright © 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per‐copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750‐8400, fax (978) 750‐4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748‐6011, fax (201) 748‐6008, or online at http://www.wiley.com/go/permission.
The manufacturer's authorized representative according to the EU General Product Safety Regulation is Wiley-VCH GmbH, Boschstr. 12, 69469 Weinheim, Germany, e-mail: [email protected].
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762‐2974, outside the United States at (317) 572‐3993 or fax (317) 572‐4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging‐in‐Publication Data Applied for
Hardback ISBN: 9781394294954
Cover Design: Wiley. Cover image is a painting titled “Yeni Arkadaş” (New Friend) by Mümtaz Yener (private collection of his daughter Prof. Dr. Göksun Nina Say-Yener). Author Photo: Courtesy of M. Ümit Uyar.
To my beloved daughters, Aylin Emine and Melisa Ayşe
Dr. M. Ümit Uyar is a professor at the City College of the City University of New York. He is the co‐founder and current director of the Computer Engineering program at the City College. He was the lead and co‐principal investigator for large grants to conduct research on AI and game theory‐based autonomous software agents, knowledge sharing mobile agents using bio‐inspired algorithms for topology control in mobile networks, tactical interoperability of combat networks and efficient reliable end‐to‐end communications.
Based on his research experience in civilian and military telecommunication protocols and mobile ad hoc computer networks, Dr. Uyar developed AI and game theory algorithms for large swarms of autonomous drones and intelligent techniques for personalized prognosis of degenerative diseases such as Alzheimer's disease and cancer. While in industry, he developed widely used software tools that improved the software development cycle by orders of magnitude. He has published extensively in AI, game theory and formal description techniques used for complex telecommunication protocols, edited books and co‐chaired international conferences.
Prior to joining academia, he was a Distinguished Member of Technical Staff at AT&T Bell Labs. He is an IEEE Fellow and holds seven US patents. Dr. Uyar has a BS degree from İstanbul Teknik Üniversitesi, and MS and PhD degrees from Cornell University, all in electrical engineering.
This book is an introduction to artificial intelligence (AI)‐based problem solving techniques, together with the basic concepts they originated from and their computational principles. Readers will see how several real‐life problems can be modelled so that AI methods can be applied to solve them.
Each chapter focuses on a realistic challenge but with reduced dimensions, presenting a simple, easy‐to‐follow solution as both Python and Matlab scripts. Projects include making financial stock market predictions based on text messages from famous people, algorithmic trading of financial assets, bike rental predictions for the City of London, personalized prediction of Alzheimer's disease based on genetic information, speech recognition, chatbot implementation and translation of written text from one language to another.
The emphasis in these projects is not on writing programs to implement the AI algorithms in computer languages, but rather on demonstrating the relative simplicity of the AI algorithms used to solve these problems.
After finishing this book, it will become apparent to readers that although much has been accomplished in the world of AI, there is still much more to be learned. This book will prepare them for tackling a wide range of complex problems by applying cutting‐edge AI techniques and advancing on new frontiers.
About the cover: Yeni Arkadaş (New Friend) by Mümtaz Yener (private collection of his daughter Prof. Dr. Nina Göksun Say‐Yener).
Mümtaz Yener (1918–2007) worked and lived in Istanbul as a painter with his best friend and wife Şadan Yener. His extensive artwork has been critically acclaimed around the world and featured in renowned museum exhibitions and private art collections in Turkey, Europe, the United States, Japan and Brazil. He believed that “freedom depends on the amount of domination that humans have over machines.”
I am grateful for the comments that I received about the materials presented in this book from the undergraduate and graduate students who have taken my classes at the City University of New York over the years. Their relentless pursuit of knowledge has always been refreshing. Specifically, I acknowledge the efforts of the following past and present students of the City College of New York:
Olga Chsherbakova for speedy implementation of Python and Matlab scripts
Hasan Şuca Kayman for his skill in deriving complex gradients and implementing scripts for recurrent neural network and attention models
Dr. Samrat S. Batth for contributions to the chapter on financial applications
Michal Kropiewnicki for critical reading and constructive comments
Kelvin Ma for his contributions to the manuscript and projects
Clement McClean for preparing student assignments with admirable ease
Grace McGrath, Joe Malubay, Jian Wen Choong, Ishmam Fardin and Sultana Begum for contributions to the Alzheimer's disease prognosis scripts
Ricardo Valdez for his contributions to the manuscript and projects and for his derivation of ANN examples with an impressive show of patience
Alexander Verzun and Roberto Behar for corrections to the text and scripts
I thank my friend and colleague Prof. Janusz Kusyk for his invaluable contributions on all concepts in this book throughout the chapters.
I thank Prof. Dr. Nina Göksun Say‐Yener for generously sharing Mümtaz Yener's masterpiece Yeni Arkadaş (New Friend) as the cover of this book.
I thank the Wiley team for their support and understanding: Mary Hatcher, Brett Kurzman, Ryan Coach, Dimple Philip, Vijayalakshmi Saminathan and Akhil Ajikumar. Special thanks go to Chris Cartwright for his meticulous editing.
Lastly, I would like to thank Aylin E. Uyar for simplifying complex concepts covered throughout and making them more accessible to a wider audience, including readers without a software engineering background.
August 2024
M. Ümit Uyar, PhD
The City College of New York, New York, USA
This book is accompanied by a companion website:
www.wiley.com/go/UyarSimplePythonandMatlabScripts1e
This website includes an instructor companion site with Python and Matlab code and scripts.
Artificial intelligence (AI) has become one of the most exciting and active fields of research in recent years. AI attempts to develop software and computing devices that are able to perform tasks which, for centuries, have been associated only with thinking beings. Mimicking intelligence, creativity and deductive capability, along with the ability to learn from experience, are some of the directions that AI pursues. AI's recent ubiquitous presence in everyone's life makes this technology an integral part of daily activities that is often taken for granted. Most people treat AI tools as black boxes (i.e. devices whose outputs are observable but whose internal details are unknown to users) which spit out solutions to make problems go away. A reader of this book will quickly appreciate that there is no magic behind the way AI operates. Once the mechanisms employed in many popular AI techniques are understood, correct solutions to complicated tasks can readily be obtained. Through hands‐on programming projects, the reader will be able to grasp how easily many otherwise unsolvable problems can be handled using AI.
The question of whether a human‐created device can be intelligent goes back to the middle of the twentieth century, when the so‐called Turing test was proposed to examine whether or not a machine is capable of thinking [1, 2]. The Turing test assumes that a human interrogator poses questions to both another human and a computer through a barrier that prevents identification of the responders. The proposition of the Turing test is that if the interrogator cannot distinguish which answers come from whom, then the interrogated human and the machine have the same intelligence. The Turing test has been used ever since to judge whether a computer can think. The term artificial intelligence was coined in 1955, six years after the Turing test was introduced, as the science and engineering of making intelligent machines; this milestone, together with the Turing test, is considered the beginning of modern AI [3].
The path from initial theoretical AI research to its realistic applications has progressed through many successful breakthroughs. One of the prominent milestones for engineers was the introduction of the mobile robot Shakey in 1966 [4]. Shakey was controlled by so‐called intelligent algorithms that analysed its surroundings to carry out a plan for an intended goal. On another path, it was long believed that the ability to play chess effectively is a good indicator of the capability of AI. IBM developed a chess‐playing computer called Deep Blue that defeated grandmaster Garry Kasparov, the reigning world chess champion, in a historic match in 1997 [5]. Recent developments in self‐driving cars would not have been possible without AI‐based algorithms, which are essential for replicating the complex decision processes of a human driver [6]. In 2015, car maker Tesla marked one of the greatest milestones by announcing fully automated self‐driving vehicles. By equipping its cars with AI software for interpreting and understanding the visual world, Tesla showed that robust path planning and real‐time decision making were within reach [7]. These are only a few of the highlights marking recent developments in AI. It is widely believed that the importance of AI will only grow in years to come.
An AI can be classified by the areas for which it is designed to find solutions. Currently, these areas include natural language processing, computer vision, automatic programming, robotics and intelligent data retrieval systems. However, grouping AI by the computational concepts employed for solving problems is a more accurate and better‐suited classification for the readers of this book.
Searching is one of the computational methods that can be used for various AI tasks such as reasoning (e.g. finding inference rules), planning (e.g. searching through goals and sub‐goals) and moving (e.g. robots exploring a surrounding space). Common AI tools for providing intelligent decisions in situations with incomplete or uncertain information include Bayesian models, probabilistic algorithms and approaches derived from decision theory. Evolutionary algorithms are problem solving methods that attempt to replicate the evolutionary operations of biological organisms while searching through a solution space. They are often used in AI to learn from previous experiences and to find solutions in large and complex settings. Learning principles of AI can be implemented using artificial neural networks and their variants, including convolutional neural networks and recurrent neural networks (RNNs), which allow a system to be trained on a known set of data and subsequently to predict outcomes for new, unknown events. Generative AI refers to a class of artificial intelligence systems that are designed to create new content, such as text, images, audio or code, by learning patterns from existing data. Unlike traditional AI, which often focuses on recognizing patterns or making predictions, generative AI models use those patterns to produce novel outputs that resemble the input data but are not mere copies. These systems are typically built using techniques like deep learning and neural networks, making them capable of generating realistic and creative content in various domains. Examples of generative AI methods include RNNs, long short‐term memory (LSTM) [8] and transformers [9].
Starting from the early days of the twenty‐first century, AI has been used in countless applications, ranging from the trivial task of suggesting emojis in text messages [10] to the design of flight‐control software for commercial aeroplanes [11]. Some of the most prominent uses of AI are in the areas of medical diagnosis, image processing, control of autonomous vehicles and prediction of real‐life events. In many similar tasks, an abundance of data (e.g. camera input, radar readings, proximity sensors) has to be processed in real time, something at which AI is especially good.
Applications of AI can save lives, literally. In the healthcare field, high‐risk patients can be successfully monitored remotely with AI‐based voice assistants. Automated physician assistance systems guided by AI methods are capable of generating questions to best diagnose a patient [12] and of finding an optimal, personalized treatment [13, 14]. During surgery, an AI‐based system can guide a surgeon's scalpel so that soft tissue damage is minimized, and the data collected during the procedure are quickly interpreted in clinical context [15].
AI can control an autonomous vehicle in situations where a human operator cannot or does not want to operate. Self‐driving cars [16], unmanned ground vehicles, drones and underwater vehicles [17–19] are only a few examples where the advantages of AI are directly visible.
Predicting the future may be one of the most powerful desires for many of us. Although AI cannot find answers to important existential questions, it can be helpful in guessing the probabilities of many future events [20]. For example, in an attempt to prepare for earthquakes, a combination of evolutionary algorithms, swarm intelligence and artificial neural networks is being used to forecast possible future seismic events. The natural language processing methods of AI can be employed in economics to estimate upcoming changes in trading markets [21] and in judicial applications to predict the outcome of a court ruling based on court proceedings [22].
This book is a beginner's guide to exciting and modern applications of AI techniques. The reader will be introduced to the basic concepts and principles of AI in order to develop skills readily applicable to real engineering projects. Through step‐by‐step instructions, the reader will learn how to implement typical AI tasks including control of autonomous drones, speech and character recognition, natural language processing, dietary menu planning, optimal selections for project management tasks and maximizing profits in algorithmic stock trading. We believe that all readers, regardless of their area of expertise, will enjoy the sense of empowerment that comes from seeing relatively simple AI procedures tackle otherwise formidable problems.
This book is written with a diverse spectrum of readers in mind: engineers, scientists, economists and all backgrounds of students ranging from engineering to liberal arts. It welcomes all readers who would like to enter the realm of AI‐based problem solving as it has become an essential skill for professionals in the twenty‐first century.
In Chapter 2, the general concepts of an artificial neural network (ANN) architecture and its training process are introduced. As some of the most popular AI tools for handling tasks that mimic events recorded in large amounts of data, ANN‐based tools dominate applications that impact both our daily lives and scientific discoveries from astronomy to the social sciences. When we focus on their operation, however, we find the process rather simple and straightforward: a given set of sample data inputs is modified by a set of weights and activation functions so that the outputs match the real data points measured in an experiment or event. We explore this process and present a step‐by‐step analysis of ANN training.
In Chapter 3, we present two simplified but illustrative examples for optimized prediction using ANNs. The first is the simple but popular ANN example for predicting an exam grade based on the number of hours that a student studied and slept for this exam. The second example, inspired by the data from bike rental companies operating in London, UK, involves constructing and training an ANN to predict the number of bike rentals based on sample data such as the day of the week, time of day, temperature and wind speed. We present sample Python and Matlab scripts implementing these ANNs.
Chapter 4 explores the use of AI in the financial field. We introduce a simplistic ANN model using popular financial technical indicators computed using daily stock market data as inputs to predict the price change for the following day as its output. Using existing libraries in Python and Matlab, simplistic scripts are introduced for such an ANN.
Chapter 5 introduces the reader to AI applications in the life sciences. A simple case study is presented, where medical data collected from Alzheimer's disease (AD) patients are used to train an ANN which exploits a possible relationship between expressions of AD‐related genes and cognitive exam scores. This ANN then can be deployed as a tool to predict personalized disease progression for new patients.
Chapter 6 presents a natural language processing (NLP) application, where an ANN is set up to explore a possible relationship between the tweets from a former president of the USA and financial assets listed in public markets. ANN inputs include keywords extracted from tweets and daily financial stock market data, and the output is the predicted price change for the following day.
In Chapter 7, we introduce the main concepts governing the operations and capabilities of convolutional neural networks (CNNs). These are ideal for identification (or classification) of multi‐dimensional inputs such as images and speech. We study the steps of training a CNN and the modification of weights for more accurate predictions. Chapters 8 and 9 present two very popular applications of CNNs, namely optical character recognition (OCR) and speech recognition, respectively. For each chapter, simple but illustrative CNNs are implemented in Python and Matlab scripts.
Recurrent neural networks (RNNs), which are the building blocks of language‐related applications, are introduced in Chapter 10. RNNs are fundamentally different from ANNs and CNNs because RNNs can handle a sequence of inputs, where prior inputs have an impact on future outputs. This capability of retaining a memory of previous inputs makes RNNs a vital part of applications such as chatbots and machine translation. The training process for RNNs, including forward and back propagation operations, is described in detail. Chapter 11 presents a simplistic chatbot architecture, which is then implemented in Python and Matlab scripts. This chatbot is trained with a few hundred text message exchanges taken from the internet and demonstrates limited attempts at generating responses to original user queries within the same dictionary of words.
In Chapter 12, the RNN with attention mechanism (RNN‐Att) is introduced, which makes use of historical inputs in an even more pronounced manner. For example, in natural language translation, one has to wait until a sentence is complete before translating it into another language, since even the last words in a sentence may affect the translation significantly. We study the training process, weight updating and the reduction of error through forward and back propagation operations in an RNN‐Att. In Chapter 13, we present a simplistic RNN‐Att architecture for machine translation and its implementation in Python and Matlab scripts. The example translator is trained using a few thousand German sentences and their English translations, and then demonstrates a remarkable capacity to attempt translating new German sentences into English (using only the words in the training dictionary).
In Chapters 14 to 18, we present prominent bio‐inspired computation and evolutionary methods in AI, which are typically applied to problems that are intractable or computationally too intensive to be solved using classical methods. In Chapter 14, one of the most popular bio‐inspired techniques, the genetic algorithm (GA), is introduced. With their relatively simple design, low computational cost and independence from software development platforms, GAs are perfect candidates for many real‐life optimization problems in a wide variety of settings. In the following chapters, we cover applications of GAs, namely finding dietary menu selections under nutritional restrictions (i.e. solving the 'knapsack' class of problems) in Chapter 15, flight control of autonomous drones in Chapter 16 and finding optimum routes for travel (i.e. solving the travelling salesman class of problems) in Chapter 17. In Chapter 18, we outline evolutionary computation methods including particle swarm optimization, differential evolution and grammatical evolution.
By introducing readers to such a wide spectrum of AI methods, we hope that they will find exciting and powerful new directions in their careers, as many of our students have done over the years.
Biological nervous systems process information using nerve cells called neurons that transmit messages to other cells using connections referred to as synapses. Information in the brain is stored by strengthening synapses among a group of neurons through a reinforcement process that repeatedly interchanges messages among these cells. Neurons operate in groups and react to inputs applied to them by further propagating these inputs to other neurons only over the strengthened synapses. No single neuron stores any particular piece of information, but a group of neurons together with established synapses represent information in the brain [23].
Brain‐inspired systems, called artificial neural networks (ANNs), aim to replicate the way that humans learn. ANNs are built of nodes (i.e. artificial neurons) which are interconnected by directed links (i.e. synapses) representing a connection from the output of one artificial neuron to the input of another. Each link in this system is associated with a weight, which alters the information as it travels from one neuron to another. In an ANN, nodes are organized in layers responsible for processing information in sequence. Signals are processed starting from the input layer, passing through intermediate hidden layers until reaching the output layer, which generates observable outputs. Figure 2.1 shows an example of an ANN consisting of an input layer, two hidden layers and an output layer. Note that an engineer designing an ANN selects the number of nodes and layers depending on the requirements of the problem to be solved. This design process requires experience and specific domain knowledge for each application at hand.
Figure 2.1 An ANN with two hidden layers.
ANNs may be employed in solving many engineering problems ranging from cybersecurity [24] to image processing [25]. In general, applications of ANNs fall into one of three categories: classification, prediction or optimization problems. In classification problems, a real‐life observation is recognized as a particular category. One example is setting up an ANN to classify activities of a credit card as either legitimate or fraudulent actions [26]. In this example, an ANN can be built using legitimate actions on a given credit card based on the historic usage data of this card (e.g. types of items bought, places visited, times of shopping, etc.). When an interaction does not fit into the class of regular actions, it will be flagged as suspicious activity. Another example of a classification problem involves recognizing handwritten characters as particular letters or digits [27].
Prediction problems aim to estimate unknown outcomes based on historical data. For example, an ANN can be built using past atmospheric data to predict future precipitation patterns and temperature levels [28]. Similarly, in stock market predictions, using historical data (e.g. recent market behaviour, geopolitical situations and commodity prices), an ANN can be set up to predict whether a given stock is expected to gain or lose value [29].
In optimization problems, valid outcomes, which often have to satisfy multiple and possibly conflicting constraints, must be found in typically very large solution spaces. For example, finding an optimal dose of a chemotherapy drug to be administered to a cancer patient based on personal tumour growth data and overall health characteristics (e.g. heart condition, blood‐sugar measurements, physiological indicators and others) may be addressed by an optimization‐type ANN.
Figure 2.2 An ANN neuron employing summation and activation functions.
In an ANN, each node (i.e. neuron) performs a calculation to determine if its input signals should be forwarded to the next layer. This task is performed by so‐called summation and activation functions. First, a summation function adds the incoming signal strengths, and then an activation function decides whether or not to propagate them further to other layers of the ANN. An example of summation and activation functions embedded in an artificial neuron is shown in Figure 2.2, where inputs $x_1$ and $x_2$ with weights $w_1$ and $w_2$, respectively, are first summed as a combined signal $S = w_1 x_1 + w_2 x_2$. An activation function then determines if the resulting signal should be forwarded to the next layer. The most common activation functions are sigmoid or rectilinear functions, as discussed below. A sigmoid activation function transforms its input into an output ranging between 0.0 and 1.0, which is useful when an output is probabilistic. A rectilinear activation function, on the other hand, generates an output signal equal to the input signal strength for all positive signal values, and outputs zero for any input with non‐positive values.
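As a minimal illustration of this two‐step computation, the short Python sketch below sums two weighted inputs and applies a sigmoid activation. The numeric values are invented for illustration and are not taken from Figure 2.2.

```python
import math

# Hypothetical values chosen only to illustrate the summation/activation steps.
x1, x2 = 0.5, 0.8              # input signals
w1, w2 = 0.4, -0.6             # weights on the incoming links

S = w1 * x1 + w2 * x2          # summation function: combined signal
output = 1.0 / (1.0 + math.exp(-S))  # sigmoid activation gates the signal

print(f"S = {S:.3f}, activated output = {output:.3f}")
```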
The sigmoid is convenient for efficiently calculating the gradients used in ANN training (Figure 2.3). Sigmoid activation limits the neuron output to the interval $(0, 1)$:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

The derivative of the sigmoid is given by

$$\frac{d\sigma(x)}{dx} = \sigma(x)\,\big(1 - \sigma(x)\big)$$
Figure 2.3 (a) Sigmoid activation function, and (b) its derivative.
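A minimal Python sketch of the sigmoid and its derivative follows; the sample points are arbitrary and serve only to show the behaviour plotted in Figure 2.3.

```python
import math

def sigmoid(x):
    """Sigmoid activation: maps any real x into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """Derivative of the sigmoid, expressed through the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)

for x in (-4.0, 0.0, 4.0):
    print(f"x = {x:+.1f}  sigmoid = {sigmoid(x):.4f}  derivative = {sigmoid_derivative(x):.4f}")
```

Note that the derivative peaks at $x = 0$ with value 0.25 and approaches zero for large $|x|$, matching the shape shown in Figure 2.3b.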
Figure 2.4 (a) Rectilinear activation function, and (b) its derivative.
The rectilinear activation function (ReLU) and its derivative are shown in Figure 2.4. Gradients used in ANN training are cheaper to calculate with rectilinear activation:

$$\mathrm{ReLU}(x) = \max(0, x)$$

The derivative of the rectilinear function is 1 for positive values of $x$ and 0 for negative values:

$$\frac{d\,\mathrm{ReLU}(x)}{dx} = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$$
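The corresponding sketch for rectilinear activation is even simpler (again an illustrative sketch, not code from the book's scripts):

```python
def relu(x):
    """Rectilinear (ReLU) activation: passes positive signals unchanged, blocks the rest."""
    return x if x > 0 else 0.0

def relu_derivative(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0
```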
The rectilinear function prevents negative signals from propagating further in an ANN. However, in the output layer of an ANN, eliminating negative values via a rectilinear function may be problematic if the outputs of the ANN are allowed to be negative. In these cases, rectilinear activation functions should not be employed in the output layer.
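Putting the pieces together, the sketch below forward‐propagates two inputs through a small fully‐connected network in the spirit of Figure 2.1. The 2‐3‐3‐1 layer sizes and the random weights are placeholder assumptions chosen for illustration; training, covered in the following sections, is what would adjust these weights to fit real data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder weight matrices for an assumed 2-3-3-1 network:
# two inputs, two hidden layers of three neurons each, one output.
W1 = rng.normal(size=(3, 2))   # input layer    -> hidden layer 1
W2 = rng.normal(size=(3, 3))   # hidden layer 1 -> hidden layer 2
W3 = rng.normal(size=(1, 3))   # hidden layer 2 -> output layer

def forward(x):
    """Each layer applies a weighted sum followed by a sigmoid activation."""
    h1 = sigmoid(W1 @ x)
    h2 = sigmoid(W2 @ h1)
    return sigmoid(W3 @ h2)

print(forward(np.array([0.5, 0.8])))   # output of the untrained network
```

With sigmoid activations throughout, the single output always lies between 0 and 1; how real data are mapped into this range is the subject of normalization, discussed in Section 2.11.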