Artificial Intelligence (AI) is an interdisciplinary science with multiple approaches to solve a problem. Advancements in machine learning (ML) and deep learning are creating a paradigm shift in virtually every tech industry sector.
This handbook provides a quick introduction to concepts in AI and ML. The sequence of the book contents has been set in a way to make it easy for students and teachers to understand relevant concepts with a practical orientation. This book starts with an introduction to AI/ML and its applications. Subsequent chapters cover predictions using ML, and focused information about AI/ML algorithms for different industries (health care, agriculture, autonomous driving, image classification and segmentation, SEO, smart gadgets and security). Each industry use-case demonstrates a specific aspect of AI/ML techniques that can be used to create pipelines for technical solutions such as data processing, object detection, classification and more.
Additional features of the book include a summary and references in every chapter, and several full-color images to visualize concepts for easy understanding. It is an ideal handbook for both students and instructors in undergraduate level courses in artificial intelligence, data science, engineering and computer science who are required to understand AI/ML in a practical context.
Audience
Students and instructors in artificial intelligence, data science, engineering and computer science courses
This is an agreement between you and Bentham Science Publishers Ltd. Please read this License Agreement carefully before using the book/echapter/ejournal (“Work”). Your use of the Work constitutes your agreement to the terms and conditions set forth in this License Agreement. If you do not agree to these terms and conditions then you should not use the Work.
Bentham Science Publishers agrees to grant you a non-exclusive, non-transferable limited license to use the Work subject to and in accordance with the following terms and conditions. This License Agreement is for non-library, personal use only. For a library / institutional / multi user license in respect of the Work, please contact: [email protected].
Bentham Science Publishers does not guarantee that the information in the Work is error-free, or warrant that it will meet your requirements or that access to the Work will be uninterrupted or error-free. The Work is provided "as is" without warranty of any kind, either express or implied or statutory, including, without limitation, implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the results and performance of the Work is assumed by you. No responsibility is assumed by Bentham Science Publishers, its staff, editors and/or authors for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, advertisements or ideas contained in the Work.
In no event will Bentham Science Publishers, its staff, editors and/or authors, be liable for any damages, including, without limitation, special, incidental and/or consequential damages and/or damages for lost data and/or profits arising out of (whether directly or indirectly) the use or inability to use the Work. The entire liability of Bentham Science Publishers shall be limited to the amount actually paid by you for the Work.
Bentham Science Publishers Pte. Ltd. 80 Robinson Road #02-00 Singapore 068898 Singapore Email: [email protected]
Artificial intelligence (AI) is a wide-ranging branch of computer science concerned with building smart machines capable of performing tasks that typically require human intelligence. AI is an interdisciplinary science with multiple approaches, but advancements in machine learning and deep learning are creating a paradigm shift in virtually every tech industry sector. Machine learning, a branch of AI, has changed the shape of the emerging world; it is no exaggeration to say that ML has taken living standards a step forward. Machine learning is an application of artificial intelligence that allows systems to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves. The process of learning begins with observations or data, such as examples, direct experience, or instruction, in order to look for patterns in the data and make better decisions in the future based on the examples we provide. The primary aim is to allow computers to learn automatically, without human intervention or assistance, and adjust their actions accordingly.
This book covers the fundamental concepts and techniques of Artificial Intelligence, Machine Learning and Soft Computing in detail. The main focus is on real-time applications of AI, ML and soft computing. When AICTE introduced emerging-technology courses under its UG programs, the authors realized that there were no suitable textbooks to teach students with a practical orientation. The sequence of the book has been set in an appropriate manner to make it easy for students and teachers to understand the concepts. The book provides insights into AI, ML and soft computing and their applications in core areas like agriculture, smart cities, the environment, etc., which impact society and play a vital role in the development of the nation.
Machine learning has become part of our daily life, directly or indirectly. Machine learning techniques are being used in many areas to increase the effective use of computers by human beings. Over the past few decades, the concepts of AI and ML have rapidly grown in importance across various application areas, and a lot of research has taken place and is still in progress. Even so, there is much more to explore regarding the applications of AI and ML in our day-to-day lives, both to improve the existing processes of work in various areas and to offer valuable suggestions and scope for further research. In this context, we discuss here the basic concepts of machine learning and give a brief overview of its various application areas.
Artificial Intelligence (AI), one of the most prominent research areas of computer science, is basically concerned with the development of machines that can think and work similarly to human beings. AI is an interdisciplinary research area with many sub-branches like Artificial Neural Networks (ANN), Machine Learning (ML), etc. ML, a sub-division of AI, is based on the concept of developing machines that can learn from data and formulate patterns from it with minimal human intervention. Basically, ML allows machines to learn from data and improve themselves to solve tasks.
In the year 1950, Alan Turing developed the Turing test. The objective of the Turing test was to check whether a machine can think like a human or not. Even though this was a basic test, it later had a huge impact on AI [1, 2].
The Turing test [3, 4] can be described as a simple question-and-answer game played between a human and a machine. A human interrogator poses questions to both a machine and another human without seeing either of them; the interrogator's task is to analyze the answers given and decide which respondent is the machine. Even though Alan Turing foresaw that machines would be able to think and act like humans by the 21st century, this has not yet fully happened. For example, with any kind of Interactive Voice Response System (IVRS) used in mobile phones or ATMs, we can easily identify that we are dealing with a machine and not a human. That is how we can still clearly differentiate between a machine and a human.
In the year 1952 [5, 6], Arthur Samuel developed the first computer program that was able to learn. The program played the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.
In 1957 [7], the first neural network for computers, the perceptron, was designed by Frank Rosenblatt to simulate the thought process of the human brain.
In the mid-1960s [8, 13], a Natural Language Processing program called ELIZA was developed at MIT to act as a therapist. Although it still relied on scripted rules to do its work, it was a successful experiment and a key milestone in the development of Natural Language Processing (NLP), a subset of machine learning that is widely used even today.
In the year 1967 [9, 11, 12], the “nearest neighbor” algorithm was written, which allowed computers to perform very basic pattern recognition. It was later used to plan routes for a traveling salesman who starts in a random city and must visit every city on the tour. K-Nearest Neighbor (KNN) remains one of the most important and famous machine learning algorithms.
In the year 1970 [10, 13], the concept of backpropagation was developed. Backpropagation is an algorithm used in training neural networks: it propagates the prediction error backwards through the network so that the weights can be adjusted, allowing the network to correct itself.
In the year 1979 [12], students at Stanford University developed a “Stanford Cart” that could navigate the obstacles in a room on its own.
In the year 1980 [13], the neocognitron, a multilayered artificial neural network, was developed by Kunihiko Fukushima.
In the year 1981 [12], the concept of Explanation-Based Learning (EBL) was introduced by Gerald DeJong. EBL enables a computer system to analyze the given data and formulate a general rule it can follow, discarding the data that is not important. This was a very early stage in the development of machine learning algorithms.
In the year 1985 [12], NETtalk was invented by Terry Sejnowski and Charles Rosenberg. NETtalk is an automated procedure that learns to pronounce written English text: it is shown text as input, together with matching phonetic transcriptions for comparison. The purpose of creating NETtalk was to explore how a machine could learn to pronounce English text correctly.
In the year 1989 [13], the concept of reinforcement learning was developed. In reinforcement learning, an artificial intelligence faces a situation similar to a game, in which it learns to achieve a goal in a potentially complex and uncertain environment. To make the machine act according to the needs of the programmer, the artificial intelligence receives either a reward or a penalty for the actions it performs, and it aims to maximize the total reward. The programmer specifies the reward rules but gives the machine no clues about how to solve the game; it is up to the machine to work out how to win by trial and error, thereby increasing its own rewards.
In the early 1990s [12], work on machine learning shifted from a “knowledge-driven approach” to a “data-driven approach”. From this point, experts started developing computer programs that analyze large amounts of data and learn, or draw conclusions, from it.
In 1995 [13, 14], Random Forest Algorithm and Support Vector Machines algorithm were developed. These are considered to be some of the most important Machine Learning algorithms.
In 1997/98 [13, 15], Sepp Hochreiter and Jürgen Schmidhuber introduced LSTM. Long Short-Term Memory (LSTM) [16] networks are a type of recurrent neural network capable of learning order dependence in sequence prediction problems. This behavior is required in domains like machine translation, speech recognition, etc. A recurrent neural network (RNN) [17] is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. A feed-forward neural network [18], by contrast, is an ANN where the connections between the nodes are acyclic: information moves in one direction only, forward from the input nodes, through the hidden nodes, and then to the output nodes.
In the year 2006 [12], Geoffrey Hinton applied the deep learning concept, which allows computers to see and differentiate objects in videos and images.
In the year 2011 [12], Google Brain was developed; its deep neural network proved capable of discovering and classifying objects such as cats.
In the year 2012 [12], Google’s X Lab developed an algorithm that could browse videos on YouTube and identify those that contain cats.
In the year 2014 [12], Facebook developed an algorithm called “DeepFace” that is capable of identifying human beings in photos uploaded to Facebook just as humans do.
In the year 2015 [12], Amazon launched its own platform for machine learning. In the same year, Stephen Hawking, Elon Musk, and Steve Wozniak signed an open letter warning about the danger posed by autonomous weapons, which, they noted, can select and hit a target without any order or command from human beings.
In the year 2016 [12], Google’s artificial intelligence program beat a professional player at the Chinese board game Go, which is considered the world’s most complex board game and is many times harder than chess. AlphaGo, created by Google’s DeepMind, won all five games of the Go match.
Even after all these developments in AI and ML over the past few decades, there are still experts around the globe who believe that a computer can never think and act like a human. Whether or not their opinion is true, the computer’s capability to see, identify, verify and analyze is increasing at a rapid rate. Globalization has also become one of the main reasons for the generation of large amounts of data every day, so there is plenty of scope for computers to read, analyze and learn from the data.
Some of the real-time application areas of ML are:
Machine learning can translate speech into text. Some software applications can convert live or recorded speech into a text file. Example: Alexa.
Machine learning can classify available data into groups, which are defined by rules set by analysts. Once the classification is complete, analysts can calculate the probability of a fault occurring. Predictive analytics is one of the most promising application areas of machine learning. It is applicable everywhere and for everything, from the estimation of real estate prices to the development of products.
While shopping on e-commerce sites like Amazon, Flipkart, Myntra etc., you see recommended items or options like ‘users who bought this product also bought’ or ‘users also buy this along with this product’. Similarly, while browsing YouTube, we see suggestions related to the videos we watch. The ML methods learn patterns from the purchase and browsing history of users and then make suggestions accordingly.
Image recognition is a well-known and widely mentioned example of machine learning in the real world. It is the concept of identifying an object in a digital image based on the intensity of its pixels. Examples: X-ray scans, tagging people on social media, unlocking a phone with your face, etc.
This is one of the most advanced applications of machine learning and AI. Videos give us an efficient opportunity to extract useful information from automatic surveillance devices. This is possible because machines can keep watch over objects more consistently than human observers can. Surveillance footage also makes a good machine learning dataset because of its accuracy.
Machine learning can extract structured information from unstructured data. Organizations gather large volumes of data from customers, and ML techniques can automate the annotation of these datasets for predictive analytics tools.
How does Google Maps know that you are on the fastest route even when traffic is heavy? When we use Google Maps to find the route to a particular place, we give it permission to use data such as our location and the day and time at which we are using the app; this information is gathered and saved by the application. Machine learning algorithms then use this kind of information to guide us properly from our current location to the place we want to reach.
Machine learning can help with the diagnosis of diseases. Many physicians use chatbots with speech recognition capabilities to recognize patterns in symptoms.
Based on sentiment analysis, machines are able to analyze the sentiment of words. The words used in posts are classified as negative, neutral or positive using ML algorithms. Companies use this model when interacting with customers to increase efficiency.
Google’s GNMT (Google Neural Machine Translation) works with many languages and dictionaries to provide the most accurate translation of any sentence or word, based on Natural Language Processing. This feature is particularly useful when we visit a new place where we don’t know anyone.
Siri, Alexa, Google Home etc. are some popular examples of virtual personal assistants. As the name indicates, they help users find information as soon as they ask for it by voice. Machine learning helps these personal assistants organize information on the basis of past data; this data is later used to respond to the instructions given by the users [21, 22].
As noted above, ML has become a part of our day-to-day lives and has shown its capability to help human beings.
In traditional programming, we properly understand the problem and write a correct program that, given the proper input, produces the desired output.
ML can automate many routine tasks for an individual or an organization. It also helps in quickly creating models for data analysis.
AI makes computers think and solve tasks like human beings. ML depends on the input data given by the user and makes the machine learn from its external environment, such as sensors, storage devices etc.
1. Data Gathering
2. Data Preparation
3. Selecting the Model
4. Training the Model
5. Evaluating the Model
6. Hyperparameter Tuning
7. Predicting
The accuracy and efficiency of the model depend on the quantity and quality of the data we collect, which will be used for training the model.
This step prepares the data for training. Preparing means cleaning the data, i.e., discarding duplicate data, identifying and filling missing values, handling errors in the data, normalizing the data, and performing data (type) conversions wherever necessary. Finally, the entire dataset is split into a training set and an evaluation set.
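As a concrete illustration, the following is a minimal data-preparation sketch in pandas and scikit-learn. The file name and column names are hypothetical stand-ins, not taken from this book.

```python
# A minimal data-preparation sketch; "sensor_readings.csv" and the
# column names are hypothetical examples.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("sensor_readings.csv")

df = df.drop_duplicates()                                               # discard duplicate rows
df["temperature"] = df["temperature"].fillna(df["temperature"].mean())  # fill missing values
df["humidity"] = pd.to_numeric(df["humidity"], errors="coerce")         # fix type errors
df = df.dropna()                                                        # drop rows that could not be repaired

# Normalize numeric features to the [0, 1] range
for col in ["temperature", "humidity"]:
    df[col] = (df[col] - df[col].min()) / (df[col].max() - df[col].min())

# Finally, split into training and evaluation sets
train_df, eval_df = train_test_split(df, test_size=0.2, random_state=42)
```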
Generally different algorithms are used to solve different tasks. So, depending on the type of task the relevant algorithm has to be chosen.
Training analyzes the data to formulate the appropriate patterns, or to give the proper answer to the question posed. Every iteration of this process is considered a training step.
The performance of the model is checked using ‘unseen data’, meaning a subset of the dataset that is held back to estimate the probable future performance of the model. A dataset is a collection of data, such as a table in a database.
The same machine learning model may require different constraints, weights or learning rates to generalize to different data patterns. These settings are called hyperparameters, and they are tuned so that the algorithm solves the given problem in the best possible way [25].
Based on the test data, we can analyze the possible future performance of a model in the real world.
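The seven steps can be tied together in a small, hedged sketch using scikit-learn. It uses the library's built-in Iris dataset so that it runs without external files; the choice of a KNN model and the grid of k values are illustrative assumptions, not the only option.

```python
# A compact sketch of the seven-step workflow with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# 1-2. Gather and prepare the data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# 3. Select a model (here KNN, mentioned in the history above)
model = KNeighborsClassifier()

# 4 & 6. Train while tuning the hyperparameter k with a grid search
search = GridSearchCV(model, param_grid={"n_neighbors": [1, 3, 5, 7]}, cv=5)
search.fit(X_train, y_train)

# 5. Evaluate on unseen data
print("accuracy:", accuracy_score(y_test, search.predict(X_test)))

# 7. Predict on a new, hypothetical measurement
print("prediction:", search.predict([[5.1, 3.5, 1.4, 0.2]]))
```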
Different types of Machine Learning are:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning

Supervised learning [26] means the ML algorithm is given a labeled dataset and is expected to learn from that data by identifying the relationships between the variables present in it. A supervised machine learning algorithm improves continuously, finding new patterns and relationships as it trains on new datasets.
Unsupervised learning [26] means allowing the ML algorithm to learn from data that is not labeled, so no human effort is required to make the dataset machine-readable. Because these algorithms must learn from data without labels, they are generally versatile: they can adapt to the data by dynamically changing their behavior based on the dataset provided.
Reinforcement learning [26] is based on the human strategy of learning by trial and error: the algorithm applies trial and error to every task. After performing the given task, the algorithm has to interpret whether the outcome is correct or not. If the output is not as expected, the algorithm must iterate until it gets a correct result.
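As a toy illustration of this trial-and-error loop (not an algorithm from this chapter), here is a sketch of a two-armed bandit learner in plain Python. The reward probabilities are invented for the example.

```python
# A toy reinforcement-learning sketch: a two-armed bandit.
import random

reward_prob = {"A": 0.3, "B": 0.7}   # hidden from the learner
value = {"A": 0.0, "B": 0.0}         # the learner's reward estimates
counts = {"A": 0, "B": 0}

for step in range(1000):
    # Explore occasionally; otherwise exploit the best-known action
    if random.random() < 0.1:
        action = random.choice(["A", "B"])
    else:
        action = max(value, key=value.get)

    # The environment returns a reward (benefit) or nothing (penalty)
    reward = 1 if random.random() < reward_prob[action] else 0

    # Update the running average estimate for that action
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]

print(value)  # the estimate for "B" should approach 0.7
```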
A machine learning algorithm can also be referred to as a model: a mathematical expression that represents the data from a business point of view. The goal is to move from facts to insights by figuring out how likely things are to happen in the future. For example, an online store might predict quarterly sales with the help of a machine learning algorithm trained on past data.
Some of the important ML methods are:
Regression is a supervised learning technique that helps to estimate or analyse a numerical value based on previous data. For example, the cost of air conditioners for next summer can be estimated from their cost this summer and the previous summer, as in the sketch below.
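A minimal sketch of that estimate with scikit-learn's LinearRegression follows; the prices are invented illustration values.

```python
# A minimal regression sketch: predicting next summer's price.
import numpy as np
from sklearn.linear_model import LinearRegression

# Feature: year index; target: air-conditioner price in that year
years = np.array([[0], [1], [2]])   # two summers ago, last summer, this summer
prices = np.array([300.0, 315.0, 331.0])

model = LinearRegression().fit(years, prices)
print("next summer's estimated price:", model.predict([[3]])[0])
```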
Classification is another supervised learning technique that explains or forecasts the value of a class. For example, it can help predict whether an online customer will purchase a particular product or not; in this case, the answer is either yes or no. Classification methods can also generate more than two possible outputs. For example, an algorithm checking whether a given image contains a cat or a mouse has three possible outputs: the image can contain (1) a cat, (2) a mouse, or (3) neither of them. One of the classification algorithms is logistic regression. Logistic regression predicts the probability of the occurrence of an event based on the given inputs. For example, based on a student's entrance exam rank, the algorithm can predict whether that student will get admission to a particular college or not. Since the predicted probability lies between 0 and 1, a threshold of 0.5 is commonly used: if the value is below 0.5, the student is predicted not to get admission; if it is 0.5 or above, the student is predicted to get admission.
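The admission example can be sketched with scikit-learn's LogisticRegression; the ranks and outcomes below are invented for illustration.

```python
# A minimal logistic regression sketch for the admission example.
import numpy as np
from sklearn.linear_model import LogisticRegression

ranks = np.array([[50], [200], [800], [1500], [3000], [5000]])
admitted = np.array([1, 1, 1, 0, 0, 0])   # 1 = got admission, 0 = did not

model = LogisticRegression().fit(ranks, admitted)

p = model.predict_proba([[1000]])[0, 1]   # probability of admission at rank 1000
print("probability:", p, "-> admitted" if p >= 0.5 else "-> not admitted")
```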
Clustering is an unsupervised learning technique in which groups, or clusters, are formed from elements with similar characteristics. Here we can use visualisations to check the quality of the clusters. K-means clustering is one clustering algorithm. Briefly, K-means works like this: K is the number of clusters the user chooses to create, and K centres are initially chosen at random from the data. Each data point is assigned to the closest centre, and then the centre of each cluster is recalculated. If the centres change very little or not at all, the process is complete; otherwise, the points are reassigned to the closest of the updated centres and the process repeats.
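A minimal K-means sketch with K = 2 follows; the 2-D points are made up so that two natural groups are visible.

```python
# A minimal K-means sketch on invented 2-D points.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1, 1], [1.5, 2], [1, 1.5],    # one natural group
                   [8, 8], [8.5, 9], [9, 8]])     # another natural group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("labels:", kmeans.labels_)            # cluster assignment per point
print("centres:", kmeans.cluster_centers_)  # recomputed cluster centres
```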
Dimensionality Reduction is used to remove the less important information from a given dataset. Principal Component Analysis (PCA) is a dimensionality reduction algorithm capable of decreasing the dimension of the data without losing too much information.
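For instance, a minimal PCA sketch can reduce the four Iris features to two components while reporting how much information each component retains.

```python
# A minimal PCA sketch: reducing the 4-D Iris features to 2-D.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                      # (150, 2)
print(pca.explained_variance_ratio_)   # information retained per component
```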
Ensemble Method is a supervised learning technique that combines several predictive models to get higher-quality predictions. If we want to build a computer, we search for the best manufacturer of each computer peripheral; once we assemble the parts, the result will likely perform better than other computers. Ensemble methods work on the same idea. The random forest algorithm is an ensemble method: it combines many decision trees, each trained on a different sample of the data.
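A minimal random forest sketch on the Iris dataset follows; the number of trees is an illustrative choice.

```python
# A minimal random-forest (ensemble) sketch on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 decision trees, each trained on a different bootstrap sample
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))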
Neural Networks and Deep Learning. The aim of neural networks is to capture non-linear patterns in data by adding layers of parameters to a model. A neural network is said to be a deep neural network if it has many hidden layers, and a deep neural network needs a lot of data. Deep learning techniques are used for the classification of images, audio, video, etc. PyTorch and TensorFlow are software packages for deep learning.
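As a minimal PyTorch sketch, the network below has a single hidden layer; real deep networks stack many more such layers. The layer sizes and random batch are purely illustrative.

```python
# A minimal feed-forward network sketch in PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 16),   # input layer: 4 features -> 16 hidden units
    nn.ReLU(),          # non-linearity between layers
    nn.Linear(16, 3),   # output layer: 3 class scores
)

x = torch.randn(8, 4)            # a batch of 8 random 4-feature examples
loss = nn.CrossEntropyLoss()(model(x), torch.randint(0, 3, (8,)))
loss.backward()                  # backpropagation computes the gradients
print(loss.item())
```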
Transfer Learning means re-using part of a previously trained neural network and adapting it to a new but similar task. Far less data is then needed to train the network.
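A hedged transfer-learning sketch with torchvision follows: a pretrained ResNet-18 is reused and only a new final layer is trained for a hypothetical 2-class task (the weights string assumes a recent torchvision version).

```python
# A transfer-learning sketch: reuse a pretrained ResNet-18 and
# retrain only a new final layer for a hypothetical 2-class task.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")  # previously trained network

for param in model.parameters():
    param.requires_grad = False       # freeze the reused layers

model.fc = nn.Linear(model.fc.in_features, 2)  # new head for the new task
# Only model.fc's parameters are now trainable, so little data is needed.
```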
Natural Language Processing is not an ML technique itself; it is a set of techniques used to prepare text data for machine learning. NLTK, the Natural Language Toolkit, was created by researchers at the University of Pennsylvania. NLTK is used to filter text data, which may come in different formats like Word, PDF, PPT, etc. To map text to a numerical representation, a TFM can be used. TFM stands for Term Frequency Matrix: each row represents a text document, each column represents a word, and each entry gives the frequency of that word within that document. From the TFM, TFIDF can be derived. TFIDF is short for Term Frequency-Inverse Document Frequency, and it captures how important each word is across a set of documents.
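Both representations can be sketched with scikit-learn on three toy documents (the documents are invented for illustration).

```python
# A minimal TFM / TFIDF sketch with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat", "the dog sat", "the cat ran"]

tfm = CountVectorizer().fit_transform(docs)      # term frequency matrix
tfidf = TfidfVectorizer().fit_transform(docs)    # TFIDF weighting

print(tfm.toarray())    # rows = documents, columns = word counts
print(tfidf.toarray())  # rarer words ("dog", "ran") get higher weight
```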
Word Embeddings capture the context of a word in a document. Word context quantifies the similarity between words, thereby allowing us to do arithmetic with words. Computing the embeddings is itself done with machine learning techniques, but it is usually only an initial step before applying another machine learning algorithm. For example, suppose we have access to the tweets of several Twitter users, and we know which of these users purchased a new house. To predict the probability of a new Twitter user purchasing a house, we can use a combination of Word2Vec and logistic regression. Word2Vec is a neural-network-based mechanism that maps the words in a corpus to numerical vectors. We can then use these vectors to find similar words, perform mathematical operations on those words, or even use them to represent text documents.
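A minimal gensim Word2Vec sketch on a toy corpus follows; a real model would be trained on millions of sentences (e.g., tweets), and the sentences here are invented.

```python
# A minimal Word2Vec sketch with gensim (4.x API).
from gensim.models import Word2Vec

sentences = [["i", "bought", "a", "new", "house"],
             ["we", "purchased", "a", "new", "home"],
             ["the", "cat", "sat", "on", "the", "mat"]]

model = Word2Vec(sentences, vector_size=20, window=3, min_count=1, epochs=50)

vector = model.wv["house"]                     # numerical vector for a word
print(model.wv.most_similar("house", topn=2))  # nearby words in vector space
```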
Konstantinos G. Liakos [4] analyzed ML techniques that can be useful for agriculture. The author reviewed around 40 articles published in different journals, most of them related to how ML can be applied in crop management. After analyzing these articles, the author found that a total of 8 ML models had been implemented, 5 of them in crop management. These models were applied in various agricultural domains like animal welfare, livestock production, crop quality, water management, soil management etc. A diagrammatic representation of ML techniques that could be applied to agriculture in the future is given for each subcategory. The study found that ML models were applied in several applications for crop management (61%), disease detection (22%), and yield prediction (23%). It also notes that, by applying ML to sensor data, farm management gradually evolves into AI systems and moves into an era of knowledge-based agriculture.
Jaspreet Singh [6] applied 4 ML classifiers to sentiment analysis, using 3 manually annotated datasets. A performance evaluation of the 4 classifiers, i.e., Naïve Bayes, J-48, BF Tree, and OneR, is given in tabular form, along with another table showing the test accuracies of these 4 algorithms on the 3 datasets. The freeware WEKA software tool was used for the classification of the text.
Vatsal H. Shah [8] employed a few ML techniques for stock prediction and explained how stock prices rise and fall during trading. The paper focuses on applying SVM, linear regression, expert weighting, online learning and prediction using decision stumps, along with the merits and demerits of each method. It introduces the variables and parameters useful for pattern recognition in stock prices for future stock prediction, and also discusses how boosting can be combined with other ML algorithms to improve the accuracy of such prediction systems. The paper briefly surveys the financial prediction system domain and discusses evaluations of the results generated after applying the ML algorithms.
Shahadat Uddin [7] made a comparative study of various supervised ML algorithms that can be used for disease prediction. Since disease prediction can vary between research datasets and clinical data, the author applied the ML techniques to the same data in order to compare them. One limitation of this work is that no sub-classifications of supervised ML algorithms were compared; only the broad categories were considered. The study includes a table showing the advantages and drawbacks of supervised ML algorithms. A particularly useful element of this paper is a table giving the names and references of diseases together with the corresponding supervised ML algorithms used for their prediction; for each disease, the better-performing algorithm is also named. This table summarizes 48 published articles related to disease prediction using supervised ML techniques.
Seema Sharma [1] explained ML techniques used in the field of data mining. The author discussed the applications of techniques like nearest neighbor, SVM, Bayesian networks, decision trees etc. in data mining, after first discussing the basic concepts behind them. According to the author, it is usually more useful to apply more than one technique than to rely on any single one. The paper differentiates the four algorithms listed above in terms of average accuracy, noise tolerance, speed, and whether they are generative or discriminative. The author also used the MATLAB classification toolbox to show the classification accuracy of the algorithms on several datasets. In the related work section, the author gives several references on applications of various ML techniques in data mining.
Yazeed Zoabi [9] discussed the development of prediction models that combine various features to assess the risk of infection. The aim of this work is to help medical staff worldwide decide the order of treatment, particularly when healthcare resources are limited. An ML model was trained on data from 51,831 tested individuals, of whom 4,769 tested positive for COVID-19; the data also included 47,401 individuals tested in a subsequent week, of whom 3,624 were positive. The model described in the article gave accurate results using only 8 binary features: gender, age 60 years or above, known contact with a confirmed case, and the presence of 5 initial clinical symptoms. The work was based on data publicly reported by the Israeli Ministry of Health. The resulting model detects COVID-19 by asking a few basic questions and, according to the author, its framework is especially useful where resources for testing people are limited.
Kamran Shaukat [5] made a detailed review showing how quickly interest in ML and cyber security is growing in academia as well as in industry. The paper tries to bridge the gap between ML techniques and threats to computer networks by surveying the crossover of these two areas. The survey depicts the ML techniques that can be used for intrusion detection, malware detection, and spam detection in mobile devices and computer networks, and covers the applications of ML in cyber security, especially over the last decade. A graphical summary of cyber security attacks and the existing ML techniques to combat them is presented, along with an overview of the present challenges in using ML for cyber security. The paper also gives a summary and comparison of: i) ML models for detecting malware over the last ten years, ii) ML models for detecting spam over the last ten years, and iii) ML models for intrusion detection over the last ten years. The time complexity, description and limitations of frequently used ML techniques are given in tabular form, as are the similarities and differences between deep learning and traditional ML. Overall, the paper covers: the evolution of cyber security and ML over the last ten years, basic cyber security attacks and threats, the most commonly used security datasets, the criteria and metrics used to evaluate those datasets, basic concepts of deep learning and ML, the current state of spam/intrusion/malware detection in both computer networks and mobile devices, and the challenges ML faces in handling cyber security.
In the study by Umer Ahmed Butt [3], an analysis was made of the threats, attacks, and challenges faced by cloud computing, and of how various ML algorithms like SVM, Naïve Bayes, K-means etc. can be explored as solutions for cloud security issues. Some ML algorithms used for cloud security were reviewed, their advantages and disadvantages were highlighted, and the scope for research in this area was noted. Joe Fiala [10] looked at different possible ways to use ML to improve cloud computing: ML techniques were examined as a way to manage resources dynamically, with a specific focus on predicting VM sizes, and as a way to use resources efficiently, focusing on turning off idle machines and consolidating workloads. The basic conclusion was that, even though multiple ML techniques are available for the required task, the choice of a specific technique is entirely system-dependent. The paper concludes with a brief description of cloud security. Kyriakos N. Agavanakis [11] explored the capabilities and practical aspects of ML. A practical cloud-based solution was built that provides a scalable infrastructure for sharing knowledge without disclosing the underlying data or risking the IPR of researchers; it is also noted that the integrity of the data may be safeguarded by digitally signing it. The cloud-based approach supports the direct use of third-party models through a reusable set of services that can be integrated with end-user applications, e.g., websites, mobile apps, social media apps etc.
Kumar Rahul [2] briefly discussed the ML concepts applicable to various big data applications. The paper mainly focuses on how ML is applied in big data analytics (BDA), particularly in healthcare, production and the materials segment, and also on the open issues in applying ML to industrial applications, which can be taken up for research. According to the author, retail, travel, healthcare, finance and media are some of the industries where ML can be used extensively; identifying diseases, aiding drug discovery, detecting fraud and recommending suitable products are some of the ML applications in these industries. Martín Noguerol explained the strengths, weaknesses, and opportunities of ML applications, introducing various ML-related terms like NLP, speech recognition, supervised learning, unsupervised learning etc. The strengths of ML were described in the context of image-based diagnosis, automatic lesion detection etc., while the requirement of large training datasets was identified as a weakness. Athmaja S. surveyed ML concepts compatible with BDA, and that paper also discusses the prominence of ML applications based on BDA.
Being one of the applications of ML techniques, the aim of recommender systems is to provide the user with relevant suggestions. The suggestions for watching videos on YouTube, the news/search suggestions given by Google, the friend suggestions given by Facebook, the follow suggestions given by Instagram, and the product suggestions we get while purchasing on e-commerce websites are all examples of recommender systems.
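As a minimal sketch of the idea behind such systems (user-based collaborative filtering, one of several possible approaches), the code below recommends products to a user based on the most similar other user. The ratings matrix is invented for illustration.

```python
# A minimal collaborative-filtering sketch with an invented matrix.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Rows = users, columns = products (1 = purchased, 0 = not)
ratings = np.array([[1, 1, 0, 0],
                    [1, 1, 1, 0],
                    [0, 0, 1, 1]])

sim = cosine_similarity(ratings)          # user-to-user similarity
target = 0                                # recommend for the first user
neighbour = np.argsort(sim[target])[-2]   # most similar other user

# Suggest items the neighbour bought that the target user has not
suggestions = np.where((ratings[neighbour] == 1) & (ratings[target] == 0))[0]
print("recommend product indices:", suggestions)  # -> [2]
```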