44,39 €
Java is one of the main languages used by practicing data scientists; much of the Hadoop ecosystem is Java-based, and it is certainly the language that most production systems in Data Science are written in. If you know Java, Mastering Machine Learning with Java is your next step on the path to becoming an advanced practitioner in Data Science.
This book aims to introduce you to an array of advanced techniques in machine learning, including classification, clustering, anomaly detection, stream learning, active learning, semi-supervised learning, probabilistic graph modeling, text mining, deep learning, and big data batch and stream machine learning. Accompanying each chapter are illustrative examples and real-world case studies that show how to apply the newly learned techniques using sound methodologies and the best Java-based tools available today.
On completing this book, you will have an understanding of the tools and techniques for building powerful machine learning models to solve data science problems in just about any domain.
Das E-Book können Sie in Legimi-Apps oder einer beliebigen App lesen, die das folgende Format unterstützen:
Seitenzahl: 656
Veröffentlichungsjahr: 2017
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: June 2017
Production reference: 1290617
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-051-3
www.packtpub.com
Authors
Uday Kamath
Krishna Choppella
Reviewer
Samir Sahli
Prashant Verma
Commissioning Editor
Veena Pagare
Acquisition Editor
Divya Poojari
Content Development Editor
Mayur Pawanikar
Technical Editor
Vivek Arora
Copy Editors
Vikrant Phadkay
Safis Editing
Project Coordinator
Nidhi Joshi
Proofreaders
Safis Editing
Indexer
Francy Puthiry
Graphics
Tania Dutta
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta
Dr. Uday Kamath is a volcano of ideas. Every time he walked into my office, we had fruitful and animated discussions. I have been a professor of computer science at George Mason University (GMU) for 15 years, specializing in machine learning and data mining. I have known Uday for five years, first as a student in my data mining class, then as a colleague and co-author of papers and projects on large-scale machine learning. While a chief data scientist at BAE Systems Applied Intelligence, Uday earned his PhD in evolutionary computation and machine learning. As if having two high-demand jobs was not enough, Uday was unusually prolific, publishing extensively with four different people in the computer science faculty during his tenure at GMU, something you don't see very often. Given this pedigree, I am not surprised that less than four years since Uday's graduation with a PhD, I am writing the foreword for his book on mastering advanced machine learning techniques with Java. Uday's thirst for new stimulating challenges has struck again, resulting in this terrific book you now have in your hands.
This book is the product of his deep interest and knowledge in sound and well-grounded theory, and at the same time his keen grasp of the practical feasibility of proposed methodologies. Several books on machine learning and data analytics exist, but Uday's book closes a substantial gap—the one between theory and practice. It offers a comprehensive and systematic analysis of classic and advanced learning techniques, with a focus on their advantages and limitations, practical use and implementations. This book is a precious resource for practitioners of data science and analytics, as well as for undergraduate and graduate students keen to master practical and efficient implementations of machine learning techniques.
The book covers the classic techniques of machine learning, such as classification, clustering, dimensionality reduction, anomaly detection, semi-supervised learning, and active learning. It also covers advanced and recent topics, including learning with stream data, deep learning, and the challenges of learning with big data. Each chapter is dedicated to a topic and includes an illustrative case study, which covers state-of-the-art Java-based tools and software, and the entire knowledge discovery cycle: data collection, experimental design, modeling, results, and evaluation. Each chapter is self-contained, providing great flexibility of usage. The accompanying website provides the source code and data. This is truly a gem for both students and data analytics practitioners, who can experiment first-hand with the methods just learned or deepen their understanding of the methods by applying them to real-world scenarios.
As I was reading the various chapters of the book, I was reminded of the enthusiasm Uday has for learning and knowledge. He communicates the concepts described in the book with clarity and with the same passion. I am positive that you, as a reader, will feel the same. I will certainly keep this book as a personal resource for the courses I teach, and strongly recommend it to my students.
Dr. Carlotta Domeniconi
Associate Professor of Computer Science, George Mason University
Dr. Uday Kamath is the chief data scientist at BAE Systems Applied Intelligence. He specializes in scalable machine learning and has spent 20 years in the domain of AML, fraud detection in financial crime, cyber security, and bioinformatics, to name a few. Dr. Kamath is responsible for key products in areas focusing on the behavioral, social networking and big data machine learning aspects of analytics at BAE AI. He received his PhD at George Mason University, under the able guidance of Dr. Kenneth De Jong, where his dissertation research focused on machine learning for big data and automated sequence mining.
I would like to thank my friend, Krishna Choppella, for accepting the offer to co-author this book and being an able partner on this long but satisfying journey.
Heartfelt thanks to our reviewers, especially Dr. Samir Sahli for his valuable comments, suggestions, and in-depth review of the chapters. I would like to thank Professor Carlotta Domeniconi for her suggestions and comments that helped us shape various chapters in the book. I would also like to thank all the Packt staff, especially Divya Poojari, Mayur Pawanikar, and Vivek Arora, for helping us complete the tasks in time. This book required making a lot of sacrifices on the personal front and I would like to thank my wife, Pratibha, and our nanny, Evelyn, for their unconditional support. Finally, thanks to all my lovely teachers and professors for not only teaching the subjects, but also instilling the joy of learning.
Krishna Choppella builds tools and client solutions in his role as a solutions architect for analytics at BAE Systems Applied Intelligence. He has been programming in Java for 20 years. His interests are data science, functional programming, and distributed computing.
Samir Sahli was awarded a BSc degree in applied mathematics and information sciences from the University of Nice Sophia-Antipolis, France, in 2004. He received MSc and PhD degrees in physics (specializing in optics/photonics/image science) from University Laval, Quebec, Canada, in 2008 and 2013, respectively. During his graduate studies, he worked with Defence Research and Development Canada (DRDC) on the automatic detection and recognition of targets in aerial imagery, especially in the context of uncontrolled environment and sub-optimal acquisition conditions. He has worked since 2009 as a consultant for several companies based in Europe and North America specializing in the area of Intelligence, Surveillance, and Reconnaissance (ISR) and in remote sensing.
Dr. Sahli joined McMaster Biophotonics in 2013 as a postdoctoral fellow. His research was in the field of optics, image processing, and machine learning. He was involved in several projects, such as the development of a novel generation of gastrointestinal tract imaging device, hyperspectral imaging of skin erythema for individualized radiotherapy treatment, and automatic detection of the precancerous Barrett's esophageal cell using fluorescence lifetime imaging microscopy and multiphoton microscopy.
Dr. Sahli joined BAE Systems Applied Intelligence in 2015. He has since worked as a data scientist to develop analytics models to detect complex fraud patterns and money laundering schemes for insurance, banking, and governmental clients using machine learning, statistics, and social network analysis tools.
Prashant Verma started his IT career in 2011 as a Java developer in Ericsson, working in the telecom domain. After a couple of years of Java EE experience, he moved into the big data domain and has worked on almost all of the popular big data technologies such as Hadoop, Spark, Flume, Mongo, Cassandra, and so on. He has also played with Scala. Currently, he works with QA Infotech as a lead data engineer, working on solving e-learning problems with analytics and machine learning.
Prashant has worked for many companies, such as Ericsson and QA Infotech, with domain knowledge of telecom and e-learning. He has also worked as a freelance consultant in his free time.
I want to thank Packt Publishing for giving me the chance to review the book, as well as my employer and my family for their patience while I was busy working on this book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1785880519.
If you'd like to join our team of regular reviewers, you can e-mail us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Dedicated to my parents, Krishna Kamath and Bharathi Kamath, my wife, Pratibha Shenoy, and the kids, Aaroh and Brandy
--Dr. Uday Kamath.To my parents
--Krishna ChoppellaThere are many notable books on machine learning, from pedagogical tracts on the theory of learning from data; to standard references on specializations in the field, such as clustering and outlier detection or probabilistic graph modeling; to cookbooks that offer practical advice on the use of tools and libraries in a particular language. The books that tend to be broad in coverage are often short on theoretical detail, while those with a focus on one topic or tool may not, for example, have much to say about the difference in approach in a streaming as opposed to a batch environment. Besides, for the non-novices with a preference for tools in Java who wish to reach for a single volume that will extend their knowledge—simultaneously, on the essential aspects—there are precious few options.
Finding in one place
The core idea of this book, therefore, is to address this gap while maintaining a balance between treatment of theory and practice with the aid of probability, statistics, basic linear algebra, and rudimentary calculus in the service of one, and emphasizing methodology, case studies, tools and code in support of the other.
According to the KDnuggets 2016 software poll, Java, at 16.8%, has the second highest share in popularity among languages used in machine learning, after Python. What's more is that this marks a 19% increase from the year before! Clearly, Java remains an important and effective vehicle to build and deploy systems involving machine learning, despite claims of its decline in some quarters. With this book, we aim to reach professionals and motivated enthusiasts with some experience in Java and a beginner's knowledge of machine learning. Our goal is to make Mastering Java Machine Learning the next step on their path to becoming advanced practitioners in data science. To guide them on this path, the book covers a veritable arsenal of techniques in machine learning—some which they may already be familiar with, others perhaps not as much, or only superficially—including methods of data analysis, learning algorithms, evaluation of model performance, and more in supervised and semi-supervised learning, clustering and anomaly detection, and semi-supervised and active learning. It also presents special topics such as probabilistic graph modeling, text mining, and deep learning. Not forgetting the increasingly important topics in enterprise-scale systems today, the book also covers the unique challenges of learning from evolving data streams and the tools and techniques applicable to real-time systems, as well as the imperatives of the world of Big Data:
This book explains how to apply machine learning to real-world data and real-world domains with the right methodology, processes, applications, and analysis. Accompanying each chapter are case studies and examples of how to apply the newly learned techniques using some of the best available open source tools written in Java. This book covers more than 15 open source Java tools supporting a wide range of techniques between them, with code and practical usage. The code, data, and configurations are available for readers to download and experiment with. We present more than ten real-world case studies in Machine Learning that illustrate the data scientist's process. Each case study details the steps undertaken in the experiments: data ingestion, data analysis, data cleansing, feature reduction/selection, mapping to machine learning, model training, model selection, model evaluation, and analysis of results. This gives the reader a practical guide to using the tools and methods presented in each chapter for solving the business problem at hand.
Chapter 1, Machine Learning Review, is a refresher of basic concepts and techniques that the reader would have learned from Packt's Learning Machine Learning in Java or a similar text. This chapter is a review of concepts such as data, data transformation, sampling and bias, features and their importance, supervised learning, unsupervised learning, big data learning, stream and real-time learning, probabilistic graphic models, and semi-supervised learning.
Chapter 2, Practical Approach to Real-World Supervised Learning, cobwebs dusted, dives straight into the vast field of supervised learning and the full spectrum of associated techniques. We cover the topics of feature selection and reduction, linear modeling, logistic models, non-linear models, SVM and kernels, ensemble learning techniques such as bagging and boosting, validation techniques and evaluation metrics, and model selection. Using WEKA and RapidMiner, we carry out a detailed case study, going through all the steps from data analysis to analysis of model performance. As in each of the other chapters, the case study is presented as an example to help the reader understand how the techniques introduced in the chapter are applied in real life. The dataset used in the case study is UCI HorseColic.
Chapter 3, Unsupervised Machine Learning Techniques, presents many advanced methods in clustering and outlier techniques, with applications. Topics covered are feature selection and reduction in unsupervised data, clustering algorithms, evaluation methods in clustering, and anomaly detection using statistical, distance, and distribution techniques. At the end of the chapter, we perform a case study for both clustering and outlier detection using a real-world image dataset, MNIST. We use the Smile API to do feature reduction and ELKI for learning.
Chapter 4, Semi-supervised Learning and Active Learning, gives details of algorithms and techniques for learning when only a small amount labeled data is present. Topics covered are self-training, generative models, transductive SVMs, co-training, active learning, and multi-view learning. The case study involves both learning systems and is performed on the real-world UCI Breast Cancer Wisconsin dataset. The tools introduced are JKernelMachines ,KEEL and JCLAL.
Chapter 5, Real-Time Stream Machine Learning, covers data streams in real-time present unique circumstances for the problem of learning from data. This chapter broadly covers the need for stream machine learning and applications, supervised stream learning, unsupervised cluster stream learning, unsupervised outlier learning, evaluation techniques in stream learning, and metrics used for evaluation. A detailed case study is given at the end of the chapter to illustrate the use of the MOA framework. The dataset used is Electricity (ELEC).
Chapter 6, Probabilistic Graph Modeling, shows that many real-world problems can be effectively represented by encoding complex joint probability distributions over multi-dimensional spaces. Probabilistic graph models provide a framework to represent, draw inferences, and learn effectively in such situations. The chapter broadly covers probability concepts, PGMs, Bayesian networks, Markov networks, Graph Structure Learning, Hidden Markov Models, and Inferencing. A detailed case study on a real-world dataset is performed at the end of the chapter. The tools used in this case study are OpenMarkov and WEKA's Bayes network. The dataset is UCI Adult (Census Income).
Chapter 7, Deep Learning, If there is one super-star of machine learning in the popular imagination today it is deep learning, which has attained a dominance among techniques used to solve the most complex AI problems. Topics broadly covered are neural networks, issues in neural networks, deep belief networks, restricted Boltzman machines, convolutional networks, long short-term memory units, denoising autoencoders, recurrent networks, and others. We present a detailed case study showing how to implement deep learning networks, tuning the parameters and performing learning. We use DeepLearning4J with the MNIST image dataset.
Chapter 8, Text Mining and Natural Language Processing, details the techniques, algorithms, and tools for performing various analyses in the field of text mining. Topics broadly covered are areas of text mining, components needed for text mining, representation of text data, dimensionality reduction techniques, topic modeling, text clustering, named entity recognition, and deep learning. The case study uses real-world unstructured text data (the Reuters-21578 dataset) highlighting topic modeling and text classification; the tools used are MALLET and KNIME.
Chapter 9, Big Data Machine Learning – the Final Frontier, discusses some of the most important challenges of today. What learning options are available when data is either big or available at a very high velocity? How is scalability handled? Topics covered are big data cluster deployment frameworks, big data storage options, batch data processing, batch data machine learning, real-time machine learning frameworks, and real-time stream learning. In the detailed case study for both big data batch and real-time we select the UCI Covertype dataset and the machine learning libraries H2O, Spark MLLib and SAMOA.
Appendix A, Linear Algebra, covers concepts from linear algebra, and is meant as a brief refresher. It is by no means complete in its coverage, but contains a whirlwind tour of some important concepts relevant to the machine learning techniques featured in the book. It includes vectors, matrices and basic matrix operations and properties, linear transformations, matrix inverse, eigen decomposition, positive definite matrix, and singular value decomposition.
Appendix B, Probability, provides a brief primer on probability. It includes the axioms of probability, Bayes' theorem, density estimation, mean, variance, standard deviation, Gaussian standard deviation, covariance, correlation coefficient, binomial distribution, Poisson distribution, Gaussian distribution, central limit theorem, and error propagation.
This book assumes you have some experience of programming in Java and a basic understanding of machine learning concepts. If that doesn't apply to you, but you are curious nonetheless and self-motivated, fret not, and read on! For those who do have some background, it means that you are familiar with simple statistical analysis of data and concepts involved in supervised and unsupervised learning. Those who may not have the requisite math or must poke the far reaches of their memory to shake loose the odd formula or funny symbol, do not be disheartened. If you are the sort that loves a challenge, the short primer in the appendices may be all you need to kick-start your engines—a bit of tenacity will see you through the rest! For those who have never been introduced to machine learning, the first chapter was equally written for you as for those needing a refresher—it is your starter-kit to jump in feet first and find out what it's all about. You can augment your basics with any number of online resources. Finally, for those innocent of Java, here's a secret: many of the tools featured in the book have powerful GUIs. Some include wizard-like interfaces, making them quite easy to use, and do not require any knowledge of Java. So if you are new to Java, just skip the examples that need coding and learn to use the GUI-based tools instead!
The primary audience of this book is professionals who works with data and whose responsibilities may include data analysis, data visualization or transformation, the training, validation, testing and evaluation of machine learning models—presumably to perform predictive, descriptive or prescriptive analytics using Java or Java-based tools. The choice of Java may imply a personal preference and therefore some prior experience programming in Java. On the other hand, perhaps circumstances in the work environment or company policies limit the use of third-party tools to only those written in Java and a few others. In the second case, the prospective reader may have no programming experience in Java. This book is aimed at this reader just as squarely as it is at their colleague, the Java expert (who came up with the policy in the first place).
A secondary audience can be defined by a profile with two attributes alone: an intellectual curiosity about machine learning and the desire for a single comprehensive treatment of the concepts, the practical techniques, and the tools. A specimen of this type of reader can opt to skip the math and the tools and focus on learning the most common supervised and unsupervised learning algorithms alone. Another might skim over Chapters 1, 2, 3, and 7, skip the others entirely, and jump headlong into the tools—a perfectly reasonable strategy if you want to quickly make yourself useful analyzing that dataset the client said would be here any day now. Importantly, too, with some practice reproducing the experiments from the book, it'll get you asking the right questions of the gurus! Alternatively, you might want to use this book as a reference to quickly look up the details of the algorithm for affinity propagation (Chapter 3, Unsupervised Machine Learning Techniques), or remind yourself of an LSTM architecture with a brief review of the schematic (Chapter 7, Deep Learning), or dog-ear the page with the list of pros and cons of distance-based clustering methods for outlier detection in stream-based learning (Chapter 5, Real-Time Stream Machine Learning). All specimens are welcome and each will find plenty to sink their teeth into.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the code files by following these steps:
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
The code bundle for the book is also hosted on GitHub at https://github.com/mjmlbook/mastering-java-machine-learning. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Recent years have seen the revival of artificial intelligence (AI) and machine learning in particular, both in academic circles and the industry. In the last decade, AI has seen dramatic successes that eluded practitioners in the intervening years since the original promise of the field gave way to relative decline until its re-emergence in the last few years.
What made these successes possible, in large part, was the impetus provided by the need to process the prodigious amounts of ever-growing data, key algorithmic advances by dogged researchers in deep learning, and the inexorable increase in raw computational power driven by Moore's Law. Among the areas of AI leading the resurgence, machine learning has seen spectacular developments, and continues to find the widest applicability in an array of domains. The use of machine learning to help in complex decision making at the highest levels of business and, at the same time, its enormous success in improving the accuracy of what are now everyday applications, such as searches, speech recognition, and personal assistants on mobile phones, have made its effects commonplace in the family room and the board room alike. Articles breathlessly extolling the power of deep learning can be found today not only in the popular science and technology press but also in mainstream outlets such as The New York Times and The Huffington Post. Machine learning has indeed become ubiquitous in a relatively short time.
An ordinary user encounters machine learning in many ways in their day-to-day activities. Most e-mail providers, including Yahoo and Gmail, give the user automated sorting and categorization of e-mails into headings such as Spam, Junk, Promotions, and so on, which is made possible using text mining, a branch of machine learning. When shopping online for products on e-commerce websites, such as https://www.amazon.com/, or watching movies from content providers, such as Netflix, one is offered recommendations for other products and content by so-called recommender systems, another branch of machine learning, as an effective way to retain customers.
Forecasting the weather, estimating real estate prices, predicting voter turnout, and even election results—all use some form of machine learning to see into the future, as it were.
The ever-growing availability of data and the promise of systems that can enrich our lives by learning from that data place a growing demand on the skills of the limited workforce of professionals in the field of data science. This demand is particularly acute for well-trained experts who know their way around the landscape of machine learning techniques in the more popular languages, such as Java, Python, R, and increasingly, Scala. Fortunately, thanks to the thousands of contributors in the open source community, each of these languages has a rich and rapidly growing set of libraries, frameworks, and tutorials that make state-of-the-art techniques accessible to anyone with an internet connection and a computer, for the most part. Java is an important vehicle for this spread of tools and technology, especially in large-scale machine learning projects, owing to its maturity and stability in enterprise-level deployments and the portable JVM platform, not to mention the legions of professional programmers who have adopted it over the years. Consequently, mastery of the skills so lacking in the workforce today will put any aspiring professional with a desire to enter the field at a distinct advantage in the marketplace.
Perhaps you already apply machine learning techniques in your professional work, or maybe you simply have a hobbyist's interest in the subject. If you're reading this, it's likely you can already bend Java to your will, no problem, but now you feel you're ready to dig deeper and learn how to use the best of breed open source ML Java frameworks in your next data science project. If that is indeed you, how fortuitous is it that the chapters in this book are designed to do all that and more!
Mastery of a subject, especially one that has such obvious applicability as machine learning, requires more than an understanding of its core concepts and familiarity with its mathematical underpinnings. Unlike an introductory treatment of the subject, a book that purports to help you master the subject must be heavily focused on practical aspects in addition to introducing more advanced topics that would have stretched the scope of the introductory material. To warm up before we embark on sharpening our skills, we will devote this chapter to a quick review of what we already know. For the ambitious novice with little or no prior exposure to the subject (who is nevertheless determined to get the fullest benefit from this book), here's our advice: make sure you do not skip the rest of this chapter; instead, use it as a springboard to explore unfamiliar concepts in more depth. Seek out external resources as necessary. Wikipedia them. Then jump right back in.
For the rest of this chapter, we will review the following:
It is difficult to give an exact history, but the definition of machine learning we use today finds its usage as early as the 1860s. In Rene Descartes' Discourse on the Method, he refers to Automata and says:
For we can easily understand a machine's being constituted so that it can utter words, and even emit some responses to action on it of a corporeal kind, which brings about a change in its organs; for instance, if touched in a particular part it may ask what we wish to say to it; if in another part it may exclaim that it is being hurt, and so on.
http://www.earlymoderntexts.com/assets/pdfs/descartes1637.pdf
https://www.marxists.org/reference/archive/descartes/1635/discourse-method.htm
Alan Turing, in his famous publication Computing Machinery and Intelligence gives basic insights into the goals of machine learning by asking the question "Can machines think?".
http://csmt.uchicago.edu/annotations/turing.htm
http://www.csee.umbc.edu/courses/471/papers/turing.pdf
Arthur Samuel in 1959 wrote, "Machine Learning is the field of study that gives computers the ability to learn without being explicitly programmed.".
Tom Mitchell in recent times gave a more exact definition of machine learning: "A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."
Machine learning has a relationship with several areas:
It is important to recognize areas that share a connection with machine learning but cannot themselves be considered part of machine learning. Some disciplines may overlap to a smaller or larger extent, yet the principles underlying machine learning are quite distinct:
In this section, we will describe the different concepts and terms normally used in machine learning:
To illustrate these concepts, a concrete example in the form of a commonly used sample weather dataset is given. The data gives a set of weather conditions and a label that indicates whether the subject decided to play a game of tennis on the day or not:
The dataset is in the format of an ARFF (attribute-relation file format) file. It consists of a header giving the information about features or attributes with their data types and actual comma-separated data following the data tag. The dataset has five features, namely outlook, temperature, humidity, windy, and play. The features outlook and windy are categorical features, while humidity and temperature are continuous. The feature play is the target and is categorical.
We will now explore different subtypes or branches of machine learning. Though the following list is not comprehensive, it covers the most well-known types: