SYSTEMS ENGINEERING NEURAL NETWORKS
A complete and authoritative discussion of systems engineering and neural networks
In Systems Engineering Neural Networks, a team of distinguished researchers deliver a thorough exploration of the fundamental concepts underpinning the creation and improvement of neural networks with a systems engineering mindset. In the book, you’ll find a general theoretical discussion of both systems engineering and neural networks accompanied by coverage of relevant and specific topics, from deep learning fundamentals to sport business applications.
Readers will discover in-depth examples derived from many years of engineering experience, a comprehensive glossary with links to further reading, and supplementary online content. The authors have also included a variety of applications programmed in both Python 3 and Microsoft Excel.
Perfect for students and professionals eager to incorporate machine learning techniques into their products and processes, Systems Engineering Neural Networks will also earn a place in the libraries of managers and researchers working in areas involving neural networks.
Page count: 307
Year of publication: 2023
Cover
Title Page
Copyright
Dedication
About the Authors
Acknowledgements
How to Read this Book
Part I: Setting the Scene
1 A Brief Introduction
1.1 The Systems Engineering Approach to Artificial Intelligence (AI)
1.2 Chapter Summary
Questions
2 Defining a Neural Network
2.1 Biological Networks
2.2 From Biology to Mathematics
2.3 We Came a Full Circle
2.4 The Model of McCulloch-Pitts
2.5 The Artificial Neuron of Rosenblatt
2.6 Final Remarks
2.7 Chapter Summary
Questions
Sources
3 Engineering Neural Networks
3.1 A Brief Recap on Systems Engineering
3.2 The Keystone: SE4AI and AI4SE
3.3 Engineering Complexity
3.4 The Sport System
3.5 Engineering a Sports Club
3.6 Optimization
3.7 An Example of Decision Making
3.8 Futurism and Foresight
3.9 Qualitative to Quantitative
3.10 Fuzzy Thinking
3.11 It Is All in the Tools
3.12 Chapter Summary
Questions
Sources
Part II: Neural Networks in Action
4 Systems Thinking for Software Development
4.1 Programming Languages
4.2 One More Thing: Software Engineering
4.3 Chapter Summary
Questions
Source
5 Practice Makes Perfect
5.1 Example 1: Cosine Function
5.2 Example 2: Corrosion on a Metal Structure
5.3 Example 3: Defining Roles of Athletes
5.4 Example 4: Athlete's Performance
5.5 Example 5: Team Performance
5.6 Example 6: Trend Prediction
5.7 Example 7: Simplex and Game Theory
5.8 Example 8: Sorting Machine for LEGO® Bricks
Part III: Down to the Basics
6 Input/Output, Hidden Layer and Bias
6.1 Input/Output
6.2 Hidden Layer
6.3 Bias
6.4 Final Remarks
6.5 Chapter Summary
Questions
Source
7 Activation Function
7.1 Types of Activation Functions
7.2 Activation Function Derivatives
7.3 Activation Functions Response to W and b Variables
7.4 Final Remarks
7.5 Chapter Summary
Questions
Source
8 Cost Function, Back-Propagation and Other Iterative Methods
8.1 What Is the Difference between Loss and Cost?
8.2 Training the Neural Network
8.3 Back-Propagation (BP)
8.4 One More Thing: Gradient Method and Conjugate Gradient Method
8.5 One More Thing: Newton's Method
8.6 Chapter Summary
Questions
Sources
9 Conclusions and Future Developments
Glossary and Insights
Index
End User License Agreement
Chapter 2
Table 2.1 Comparison between main characteristics of Biological and Artifici...
Chapter 4
Table 4.1 Python libraries for coding a neural network.
Chapter 5
Table 5.1 Cosine function.
Table 5.2 Cosine function.
Table 5.3 Corrosion classification.
Table 5.4 Corrosion classification.
Table 5.5 Defining roles of athletes.
Table 5.6 Defining roles of athletes.
Table 5.7 Athlete's performance.
Table 5.8 Athlete's performance.
Table 5.9 Team performance.
Table 5.10 Extract of the data with additional columns created.
Table 5.11 Extract of the data (part 1) for EDA process.
Table 5.12 Extract of the data (part 2) for EDA process.
Acknowledgements
Figure 1 www.ai-shed.com.
How to Read this Book
Figure 1 Life cycle model with some of the possible progressions.
Chapter 1
Figure 1.1 The human being becomes the standard for all things.
Figure 1.2 Number, Idea, Human, Uncertainty…what else?
Figure 1.3 Cheetah or Leopard? Could you tell them apart?
Figure 1.4 How many scenarios of a GO game can you imagine?
Figure 1.5 Integration of AI/ANN into an existing system.
Figure 1.6 A new system is met when an AI is introduced.
Figure 1.7 Development and deployment of a new system in the work environmen...
Chapter 2
Figure 2.1 How many branches of knowledge are linked to AI? Here are some ex...
Figure 2.2 Is it or is it not a cat? The Manx cat dilemma.
Figure 2.3 The mystery of a nerve cell.
Figure 2.4 How is a cat seen by a neural network? Using neural networks to c...
Figure 2.5 Components of a perceptron: inputs and weights.
Figure 2.6 A perceptron: two inputs and two weights.
Figure 2.7 What is a hyperplane? Here it is in a two-dimension representatio...
Figure 2.8 Simplified computer vision approach.
Figure 2.9 Three perceptrons: two inputs and six weights.
Figure 2.10 Are three hyperplanes enough to define one or more hyperspaces? ...
Figure 2.11 MLP – Multilayer Perceptron. Conventional sketch.
Figure 2.12 Neurons connection – Conventional sketch.
Figure 2.13 Weights represented in a conventional sketch of Neural Network....
Chapter 3
Figure 3.1 Sports business as a field of application of Systems Engineering ...
Figure 3.2 Where systems engineering and AI meet.
Figure 3.3 Example of System Architecture for an autopilot system.
Figure 3.4 A system seen as a mountain – we have to climb this mountain movi...
Figure 3.5 Architecture of a sports club as a framework.
Figure 3.6 The basics of the System Engineering approach.
Figure 3.7 A typical sports club sub-system.
Figure 3.8 Can this sketch also be applied to another business?
Figure 3.9 Time and Cost in an athlete's performance evaluation.
Figure 3.10 V-diagram for NN in the maintenance support.
Figure 3.11 What would be a set of generic requirements for a corrosion cont...
Figure 3.12 How can we use neural networks to improve inspection reliability...
Figure 3.13 Futurism and Foresight. A summary chart. For more information on...
Figure 3.14 A membership function. Main definitions.
Figure 3.15 Union between two sets.
Figure 3.16 Intersection between two sets.
Figure 3.17 Complement of a set.
Figure 3.18 Difference between two sets.
Figure 3.19 De Morgan's law: union between two complement sets.
Figure 3.20 De Morgan's law: intersection between two complement sets.
Figure 3.21 Fuzzy Logic typical rules.
Figure 3.22 Fuzzification – Rule 1 application.
Figure 3.23 Fuzzification – Rule 2 application.
Figure 3.24 Fuzzification – Rule 3 application.
Figure 3.25 Fuzzification – Rule 4 application.
Figure 3.26 De-fuzzification – Center of Gravity (CoG) Method.
Figure 3.27 From sub-system architecture to software development.
Chapter 4
Figure 4.1 The challenge of handling big data.
Figure 4.2 NN structure for reference.
Figure 4.3 Model for calculation – Input Layer.
Figure 4.4 Neural networks nitty and gritty.
Figure 4.5 MMULT function (or the equivalent in your language) – Matrix Prod...
Figure 4.6 SUMPRODUCT – Scalar Product.
Figure 4.7 Model for calculation – Hidden Layer.
Figure 4.8 Fill the spreadsheet.
Figure 4.9 Weight, Bias and Output NN in the spreadsheet.
Figure 4.10 MMULT – Matrix Product.
Figure 4.11 Set Objective to Maximize, Minimize or exact Value.
Figure 4.12 Select the variable cells.
Figure 4.13 Define the constraints.
Figure 4.14 Choose the computational method to find the optimal solution.
Figure 4.15 Options of a computational method.
Chapter 5
Figure 5.1 Cosine function.
Figure 5.2 Cosine function – normalizing process.
Figure 5.3 Associated NN for cosine function model.
Figure 5.4 Results – comparison between real and predicted function.
Figure 5.5 An efficient design, but how many maintenance ships do we need to...
Figure 5.6 ANN in the SE maintenance process.
Figure 5.7 A rusty iron bridge image by Jerzy Górecki from Pixabay.
Figure 5.8 Scheme of the training and validation phase in a maintenance proc...
Figure 5.9 Associated NN for corrosion classification.
Figure 5.10 Scheme of the estimation phase in a maintenance process activity...
Figure 5.11 Zone 1 – From surface to inter-granular corrosion.
Figure 5.12 Zone 2 – From surface to inter-granular corrosion.
Figure 5.13 Zone 3 – From surface to inter-granular corrosion.
Figure 5.14 Zone 1 – From surface to inter-granular corrosion.
Figure 5.15 Zone 2 – From surface to inter-granular corrosion.
Figure 5.16 Zone 3 – From surface to inter-granular corrosion.
Figure 5.17 Scatter plot – interceptions.
Figure 5.18 Scatter plot – passes.
Figure 5.19 Scatter plot – distance covered.
Figure 5.20 NN associated with roles classification.
Figure 5.21 Scatter plot – results – comparison between real and classified ...
Figure 5.22 Using neural networks big data to predict match outcomes? We can...
Figure 5.23 Target function – real scores (cumulative).
Figure 5.24 Normalized scores – normalized psyc-physic-tact status.
Figure 5.25 NN associated with athlete's performance prediction.
Figure 5.26 Results – comparison between real and predicted scores.
Figure 5.27 Linear function – real score.
Figure 5.28 Linear function – predicted score.
Figure 5.29 Count total amount of match results.
Figure 5.30 Learning curve: Cross Entropy variation.
Figure 5.31 Learning curves: loss and precision.
Figure 5.32 Confusion matrix: rows (true labels) and columns (predicted labe...
Figure 5.33 Input data set.
Figure 5.34 Daily cumulative affected people and daily increase rate.
Figure 5.35 Affection rate against prediction (dashed line). The pseudo line...
Figure 5.36 Delete the rows and columns associated with the minimum p.
Figure 5.37 Delete the row associated with the minimum p.
Figure 5.38 Best strategy between two competitors.
Figure 5.39 LEGO Sorting Machine (for a link to the video or the real machin...
Figure 5.40 Image by Francis Ray from Pixabay.
Figure 5.41 What an incredible variety of parts and colors for a simple toy....
Figure 5.42 Architecture of machine domain.
Figure 5.43 Image by OpenClipart-Vectors from Pixabay.
Figure 5.44 Real images.
Figure 5.45 Synthetic images.
Chapter 6
Figure 6.1 Simplified representation of the INPUT matrix of a neural network...
Figure 6.2 Simplified representation of the OUTPUT matrix of a neural networ...
Figure 6.3 Simplified representation of the association between output/input...
Figure 6.4 Under fitting, over fitting or balanced? (Creative Commons -Attri...
Figure 6.5 Representation of one or more hidden layers.
Figure 6.6 What influences our choices?
Figure 6.7 Wx + b: how does a (scalar) bias influence the network's neurons?...
Figure 6.8 Representation of a single Bias associated with the whole hidden ...
Figure 6.9 Bias connection. Conventional sketch.
Figure 6.10 Bias represented in a conventional sketch of Neural Network.
Chapter 7
Figure 7.1 Activation functions: why are they important?
Figure 7.2 Activation function: sigmoid.
Figure 7.3 Activation function: hyperbolic tangent.
Figure 7.4 Activation function: ReLU (Rectified Linear Unit).
Figure 7.5 The importance of knowing the derivative of each activation funct...
Figure 7.6 The importance of knowing the derivative of each activation funct...
Figure 7.7 The importance of knowing the derivative of each activation funct...
Figure 7.8 Sigmoid function under varying neuron temperature (t = temperatur...
Figure 7.9 Hyperbolic tangent function under varying neuron temperature (t =...
Figure 7.10 ReLU function under varying neuron temperature (t = temperature)...
Figure 7.11 Output variation of a neuron based on a varying bias value.
Figure 7.12 Output variation of a neuron based on a varying weight value.
Figure 7.13 Weight and bias values depend on each other.
Figure 7.14 Introducing an activation function. Conventional sketch.
Figure 7.15 NN representation. Conventional sketch.
Chapter 8
Figure 8.1 Example of a function of two variables.
Figure 8.2 Job interview as error optimization algorithm.
Figure 8.3 The network is in auto correct, back-propagation (simplified grap...
Figure 8.4 Neural Network flow against Back Propagation flow.
Figure 8.5 Find the solution by the gradient method.
Figure 8.6 Find the solution by Newton's method.
Chapter 9
Figure 9.1 Can we understand reality purely through mathematical laws? The c...
Figure 9.2 Reasoning Triangle- Continuous Process to reach a conclusion afte...
Figure 9.3 The contents of this book can be brought to its proper conclusion...
Figure 9.4 Can we develop increasingly complex and advanced machines, and an...
Glossary
Figure G.1 Differentiable function.
Alessandro Migliaccio
Giovanni Iannone
This edition first published 2023
© 2023 John Wiley & Sons, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Alessandro Migliaccio and Giovanni Iannone to be identified as the authors of this work has been asserted in accordance with law.
Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty
In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
Library of Congress Cataloging-in-Publication Data Applied for:
ISBN: 9781119901990 (hardback)
Cover Design: Wiley
Cover Image: © Peera_stockfoto/Shutterstock
… look at the world through your own eyes and always be self-aware.
ALESSANDRO MIGLIACCIO, CEng, ASEP, graduated in Space Systems Engineering at Delft University of Technology and currently works as a Systems Engineering Development Leader at Airbus. An expert in mixed reality technology and an aspiring futurist, with more than 10 years of work experience in aeronautical companies in several countries, he has led data analytics projects aimed at optimizing and tailoring airplane maintenance programs. A Chartered Engineer of the Royal Aeronautical Society and a certified systems engineering practitioner, he is passionate about refining his skills by finding new ways to improve teamwork through paradigms based on professional fulfillment, the nurturing of talent, and the democratization of niche technologies. To this aim he founded AIShed, an association of volunteers dedicated to outreach and research. A LEGO builder and drone pilot, he enjoys volunteering as a STEM ambassador and practicing mindfulness.
GIOVANNI IANNONE graduated in Mechanical Engineering for Design and Manufacturing at the Università degli Studi di Napoli and was awarded a Master's in Systems Engineering at MS&T (US). An expert in aeronautical structures and continuing airworthiness of large airplanes, with more than 10 years of work experience in Subject Matter Expert (SME) departments, he has taken a critical interest in approaching the decision-making process through different mathematical methods. A member of INCOSE for several years, he presented a paper on decision-making algorithms in the sports business at ASEC2014.
This written work is the result of a year's effort and the product of a decade-long professional partnership based on a continuous exchange of ideas. As is often the case, very few pieces of work come from individual minds; they usually grow in a (neural) network of friends and supporters, through late-night conversations and amiable chats by the coffee machine. Eventually the activity became so large as to require structure, so we launched AiShed (Figure 1), a cultural association headquartered in France, devoted to outreach and to fueling a community of like-minded people with an interest in AI.
Firstly, we must thank our families, friends and colleagues - their names are too many for a space this small. A special thank you goes to Federica Migliaccio and Giuliano Iannone for their relentless support of the AiShed association's activities, and to Angela Masella for her excellent translation and editing work. A note of gratitude goes to Eberhard Gill, professor of Space Systems Engineering and Director of the TU Delft Space Institute, for introducing Systems Engineering to bright young minds every year, including my own. Furthermore, a dear thank you to Klein Kim for her work on the athletic performance code and to Daniel West for allowing us to share his curious and interesting work on the LEGO® sorting machines.
Thank you to all of you.
Figure 1 www.ai-shed.com.
This book is not meant to compete with all the research that made it possible for more and more innovative machines to be developed. Our aim is simply to help the reader put into practice the theories presented in the next pages through the useful applications found in the last part of the text. The content of the book revolves around the lifecycle of a generic system (Figure 1) and addresses how Neural Networks can be called into action in different phases of a system's development.
The text is structured in three parts: the first focuses on systems engineering, while the second presents a number of exercises using two programming languages, Python and Visual Basic, allowing readers of different academic backgrounds to interact with neural networks. Part 3 covers the theory of Neural Networks and their key components.
Chapter 3 is particularly interesting, as we will delve further into the link between the theory of Systems Engineering and Analytic Foresight. An innovative approach will be used to show how to apply neural networks to the sports business. A quirky example is the one related to LEGO® sorting machines - as unusual as it may sound, sorting machines are at the basis of industrial engineering, from automotive applications to food technology.
The journey to understanding neural networks is a fascinating one, though it can be perceived as arduous by the inexperienced reader. The topic requires an academic knowledge of basic calculus. Let us reach an agreement: in this book we will examine the basics of the topic, assuming that the reader will be proactive in using the resources available on the Internet or in the literature to close gaps in understanding.
Figure 1 Life cycle model with some of the possible progressions.
Source: INCOSE Systems Engineering Handbook: A Guide for System Life Cycle Processes and Activities, Fourth edition – Wiley.
You can read the book chapter by chapter or by picking a specific chapter of interest. The book is enriched with examples from classic literature and from industry to help you grasp the most difficult notions. A glossary in "Glossary and Insights" provides further clarifications and references for in-depth study.
We have also created a website, which we invite you to visit, hoping to pique your curiosity: http://www.ai-shed.com.
We hope you find this an entertaining and useful read!
I see it all perfectly; there are two possible situations – one can either do this or that. My honest opinion and my friendly advice is this: do it or do not do it – you will regret both.
Søren Kierkegaard
From the Ancient Greeks through the Renaissance, and until our present day, human beings have always tried to give meaning to the reality surrounding them. This effort was not based on tradition or myth, but on the human rational ability to describe reality through the laws of mathematics.
When writing a scientific text and trying to give reality a meaning by applying a mathematical model, we cannot ignore certain philosophical concepts. On the contrary, we must find inspiration in the opinions of the great thinkers of the past. We will only examine a few postulates, but you can rest assured that many more are available and extensively explained in the literature. We take advantage of the knowledge that was made available to us by such human talents.
Our calculus teachers would never stop saying that numbers have to be interpreted, understood, and explained organically. Thanks to numbers we can define an object, an event, or a physical phenomenon. Why is there a need to interpret numbers?
According to Pythagoras, numbers are the primordial elements from which reality is derived. The latter can be inferred through a strict mathematical and geometric sequence. The qualitative and contemplative elements coexist with the quantitative one. Each number is associated with a shape containing elements which allow them to stay together in a harmonized and neat manner. Therefore, if we base our interpretation of the world on its numerical and harmonious nature, we can come to understand it starting from its measurements. Are numbers all we need to understand the world? What is your idea of the reality surrounding us?
It is not easy to have an idea and expand upon it – we could find it difficult, for example, to distinguish true from false and zero from one. Once its traits are defined – zero or one, true or false – an idea is absolute and unalterable; therefore we can associate it with reality.
To paraphrase the words of Plato, inferring his theories from his dialogs – we apologize in advance to our fellow philosophers and teachers of philosophy – the Idea exists outside of our mind. It is detectable only by our intellect and not by our perception, the latter not being sufficient to understand reality. We can simplify by saying that the idea is similar to a standard of judgment. We have access to the knowledge of things only if we have ideas (see Figure 1.1).
Ideas act as measurements to evaluate the tangible reality, and they do not reside in our imagination. The idea, as an objective entity, is not to be confused with opinion, which is instead subjective. As we know in physics and mathematics, we have to measure the phenomena we observe daily with certainty and precision. The idea is a model (or archetype) correlated to our empirical world – we should only try to imitate or duplicate this model. We will see later on how this model acts as an absolute reference for our implementation.
Figure 1.1 The human being becomes the standard for all things.
If we wanted to make a measurement of a particular event and then attribute to that event a meaning, we could start with the concept of an idea, in an absolute sense, as an essential reference. On the other hand, if the measurement is associated with a judgment, then we cannot know whether there is an absolute rule to discriminate that event. Surely we can trace back through experience to the general rule governing an event.
Therefore, as we have mentioned earlier in this chapter, human beings have always felt the need to explain the worldly reality they live in and, might we add, it could not be any other way. Man's senses can be accepted as an important source of knowledge.
However, would we as human beings be able to understand the world based on rules established by our rationality? The Vitruvian Man, a famous work by Leonardo da Vinci, conveys a model of a human body that is analyzed and measured through mathematical and geometric tools. The human being becomes the standard for all things, and therefore humanity as a whole gains full awareness. Man, put at the center of the world, becomes the symbol of a better future.
We now have all the elements to start writing about mathematical models, which can also be referred to as rational and variable structures, integrated in logical processes. These structures are based on the concepts of Number, Idea and Human, which we have introduced earlier.
The ultimate meaning of an event is seen as a reliable reference, and we aim toward it. A systematic and methodical approach to the analysis of an event – as we will see in Chapter 3 – will help us interpret it. Its modeling can take us to more reliable, though not absolute, conclusions.
When applied to varied events in our lives, mathematical models can help the reader better interpret certain dynamics that are part of our daily life. The examples in this book are relevant to those aspects that are often difficult to decipher due to their complex nature. Decisional processes, for instance, can be understood via the computational models used in the examples provided.
The German physicist W.K. Heisenberg affirmed that concepts of probability apply to all cognitive processes. Human beings would not be able to reach a perfect understanding of a physical phenomenon because the observer is not able to determine how they are interfering with the observed object.
To build a mathematical model we need to define a prerequisite that establishes its efficacy: uncertainty (Figure 1.2). Uncertainty obviously plays a vital role in the development of complex models, as they require a reliable mathematical form to describe a real problem. We would be ignoring our reality, and our understanding of it, if we ignored the element of uncertainty. Understandably, closed-form expressions offer a more accurate description of simple, less uncertain realities.
But what if reality is instead complex, uncertain, and difficult to interpret?
Figure 1.2 Number, Idea, Human, Uncertainty…what else?
Now, this is our objective: let us develop a tool that operates through estimates and aims at minimizing the possibility of error so that we can rely on a result as close to objectivity as possible.
How can we achieve that?
Below are two images, at first glance remarkably similar, depicting two different animals: a cheetah and a leopard (Figure 1.3). Would you be able to tell them apart and say which is which? How can we define the two animals if we do not have an extensive knowledge of zoology? The obvious solution, not to be dismissed from the start, would be to ask an expert and have him explain to us how to tell them apart. We can obtain an accurate solution to the problem if we make use of specific knowledge to define a series of characteristics. The solution, as a result of data processing, will be more accurate if we can count on all the applicable variables of the problem and if we can relate these to the characteristics of the animal depicted below.
Figure 1.3 Cheetah or Leopard? Could you tell them apart?
Source: Image by Jonathan Reichel from Pixabay.
The approach we take in this book is a common application of linear and non-linear algebraic combinations – these somehow describe the various interactions happening in our brain when it is prompted to find a solution to a problem. The individual outcome of these efforts can be defined as the union of elements and variables working together to perform a specific function.
The reader will find a brief description of neural networks as a calculation methodology, along with some case studies, which also give us the opportunity to exemplify the "system approach" used throughout the book. It is important to clarify that the learning phase is essential for the neural network to work as expected. A network can detect the trends in the data, regardless of how they fluctuate, based on the exact behavior of all the involved variables. The scope of machine learning as a discipline is to find a correlation between historic data and present (and future) data; when the correlation is found, the network detects it and uses it as the basis of its prediction.
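The "union of elements and variables working together to perform a specific function" can be made concrete in a few lines of Python 3, one of the two languages used in this book. This is only an illustrative sketch, not code from the book's examples; the input values, the weights, the bias and the choice of a sigmoid activation are all assumptions made for demonstration:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a linear combination of the
    inputs (dot product plus bias) passed through a non-linear
    sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values only: two inputs, two weights, one bias.
output = neuron([0.5, 0.8], [0.4, -0.2], 0.1)
print(round(output, 3))  # a value between 0 and 1
```

The weights and bias are exactly the quantities a network adjusts during its learning phase; Part III of the book discusses their roles, and that of the activation function, in detail.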
Even a distracted reader can see that predictions are valid only if future data trends align with past ones. In other words, if we think of an unchangeable, static and frozen reality, we could set in stone all the algorithms that work successfully. Our reality is constantly changing, but we have to start somewhere – won't you agree? Therefore, let us build the basis of our models and then create an algorithm that can guarantee a certain degree of accuracy in analyzing specific dynamics, and also reduce uncertainty as much as possible. In fact, the continuous changes of events can be managed by adapting and improving the algorithms.
Let us take the human brain as a reference to explain this better – after all, we have always been told that our brain is the biggest computer ever created.
Pieces of information move inside a neural circuit of two or more neurons, thanks to a communication process based on chemicals and electrical impulses that is repeated billions of times. Any piece of information in our brain travels at extremely high speed until it reaches the cerebral cortex where information is analyzed and ultimately “understood.”
Before reaching the cortex, the information will go through a high number of synapses. As the same process is repeated time and time again, information will follow the same well-known route. Let us recap: as pieces of information go through the same process a number of times, synapses become so used to it that even different, but relevant, signals ignite the same sequence of impulses. The entire process of memorization is part of a “simplified” process that, starting always from the same point, travels easily through the same steps again and again.
The neural networks' learning process, as presented in this book, follows the general principles described above, though the process happening in our brain is way more complex than the one we will use for our artificial neural networks. Our brain is constantly reshaping, and old synapses are replaced by new ones when our brain goes through learning or memorization phases. On the contrary, the networks used in this book will remain quasi-static once they are set up and will be used to process the same type of information.
We hope that the notion of "learning" is now established as the basis of the implementation of all neural network algorithms. Once all scenarios and conditions are memorized, we can estimate ways to learn the difference between right and wrong. Moreover, it is possible to learn only from pieces of information that are not decoded, or to limit our search to the correlations present in the available data. It is also possible to learn from known data and make predictions or decisions by comparing all the available options once the data is processed. Given that reality is far more complex, the choice of learning and processing data with linear models might be an oversimplification. Our aim is to uncover the correlations that exist among the known data, or rather the physical and mathematical models at the basis of all the possible scenarios that the network needs to crunch.
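To make the notion of "learning" tangible, here is a minimal training loop in Python 3 for a single neuron that learns the logical AND of two inputs. This is an illustrative sketch only: the AND data set, the learning rate and the number of repetitions are assumptions chosen for demonstration, not material from the book:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Assumed toy data set: the logical AND of two inputs.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w1, w2, b = 0.0, 0.0, 0.0  # the network starts with no knowledge
rate = 1.0                 # learning rate (an illustrative choice)

# Repetition strengthens the route, as with synapses: each pass
# nudges the weights slightly to reduce the prediction error.
for _ in range(5000):
    for (x1, x2), target in data:
        y = sigmoid(w1 * x1 + w2 * x2 + b)
        grad = (y - target) * y * (1 - y)  # error times sigmoid slope
        w1 -= rate * grad * x1
        w2 -= rate * grad * x2
        b -= rate * grad

# After training, the frozen weights reproduce the AND behavior.
preds = [round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in data]
print(preds)
```

Once the loop ends, the weights stay fixed and are reused unchanged on new inputs, which is precisely the "quasi-static" behavior of the networks described above.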
We have noticed how cognitive science and neuroscience have come closer to each other in the last millennium. Technological advancement and the research carried out by physicists, psychologists, and neurophysiologists studying the brain enable us to develop more complex neural networks (Figure 1.4).
We would like the reader to focus on the concept of learning once more, and to try to optimize it so as to distinguish it from other concepts or processes. How can we improve the learning phase? Can we just increase the amount of data or improve the quality of the data itself? By adding structure to the network, we would increase the complexity of the code and the processing time.
Figure 1.4 How many scenarios of a GO game can you imagine?
Source: Image by Jonathan Reichel from Pixabay.
It is known that when we focus too much on the details of a simple problem, we lose sight of the context, and we can easily find ourselves with a foggy solution. Equally, if we had access to high processing capability but insufficient training data, we would need to steer the learning process toward a specific solution, rather than a generic one.