This quick start guide brings readers to a basic understanding of the Machine Learning (ML) development life cycle, introduces Go ML libraries, and then demonstrates common ML methods such as classification, regression, and clustering.
Key Features
Book Description
Machine learning is an essential part of today's data-driven world and is extensively used across industries, including financial forecasting, robotics, and web technology. This book will teach you how to efficiently develop machine learning applications in Go.
The book starts with an introduction to machine learning and its development process, explaining the types of problems that it aims to solve and the solutions it offers. It then covers setting up a frictionless Go development environment, including running Go interactively with Jupyter notebooks. Finally, common data processing techniques are introduced.
The book then teaches the reader about supervised and unsupervised learning techniques through worked examples that include the implementation of evaluation metrics. These worked examples make use of the prominent open-source libraries GoML and Gonum.
The book also teaches readers how to load a pre-trained model and use it to make predictions. It then moves on to the operational side of running machine learning applications: deployment, Continuous Integration, and helpful advice for effective logging and monitoring.
At the end of the book, readers will learn how to set up a machine learning project for success, formulating realistic success criteria and accurately translating business requirements into technical ones.
What you will learn
Who this book is for
This book is for developers and data scientists with at least beginner-level knowledge of Go, and a vague idea of what types of problem Machine Learning aims to tackle. No advanced knowledge of Go (and no theoretical understanding of the math that underpins Machine Learning) is required.
Page count: 199
Year of publication: 2019
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Amey Varangaonkar
Acquisition Editor: Aditi Gour
Content Development Editor: Roshan Kumar
Technical Editor: Sagar Sawant
Copy Editor: Safis Editing
Project Coordinator: Namrata Swetta
Proofreader: Safis Editing
Indexer: Manju Arasan
Graphics: Jisha Chirayil
Production Coordinator: Aparna Bhagat
First published: May 2019
Production reference: 1310519
Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.
ISBN 978-1-83855-035-6
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Michael Bironneau is an award-winning mathematician and experienced software engineer. He holds a PhD in mathematics from Loughborough University and has worked in several data science and software development roles. He is currently technical director of the energy AI technology company, Open Energi.
Toby Coleman is an experienced data science and machine learning practitioner. Following degrees from Cambridge University and Imperial College London, he has worked on the application of data science techniques in the banking and energy sectors. Recently, he held the position of innovation director at cleantech SME Open Energi, and currently provides machine learning consultancy to start-up businesses.
Niclas Jern has been using computers for fun and profit since he got his first computer (a C64) at the age of four. After a prolonged period of combining the founding and running of a start-up, Walkbase, with his university studies, he graduated from Åbo Akademi University with an M.Sc. in computer engineering in 2015. His hobbies include long walks, lifting heavy metal objects at the gym, and spending quality time with his wife and daughter. He currently works at Stratacache, which acquired Walkbase in 2017, where he continues to lead the Walkbase engineering teams and design and build the future of retail technology.
Philipp Mieden is a German security researcher and software engineer, currently focusing on network security monitoring with applied machine learning. He presented his research on classifying malicious behavior in network traffic at several international contests and conferences and won multiple prizes. He holds a B.Sc. degree from the Ludwig Maximilian University of Munich, and shares many of his projects on GitHub. Besides network anomaly detection, Philipp is also interested in hardware security, industrial control systems, and reverse engineering malware.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Machine Learning with Go Quick Start Guide
About Packt
Why subscribe?
Packt.com
Contributors
About the authors
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Introducing Machine Learning with Go
What is ML?
Types of ML algorithms
Supervised learning problems
Unsupervised learning problems
Why write ML applications in Go?
The advantages of Go
Go's mature ecosystem
Transfer knowledge and models created in other languages
ML development life cycle
Defining problem and objectives
Acquiring and exploring data
Selecting the algorithm
Preparing data
Training
Validating/testing
Integrating and deploying
Re-validating
Summary
Further readings
Setting Up the Development Environment
Installing Go
Linux, macOS, and FreeBSD
Windows
Running Go interactively with gophernotes
Example – the most common phrases in positive and negative reviews
Initializing the example directory and downloading the dataset
Loading the dataset files
Parsing contents into a Struct
Loading the data into a Gota dataframe
Finding the most common phrases
Example – exploring body mass index data with gonum/plot
Installing gonum and gonum/plot
Loading the data
Understanding the distributions of the data series
Example – preprocessing data with Gota
Loading the data into Gota
Removing and renaming columns
Converting a column into a different type
Filtering out unwanted data
Normalizing the Height, Weight, and Age columns
Sampling to obtain training/validation subsets
Encoding data with categorical variables
Summary
Further readings
Supervised Learning
Classification
A simple model – the logistic classifier
Measuring performance
Precision and recall
ROC curves
Multi-class models
A non-linear model – the support vector machine
Overfitting and underfitting
Deep learning
Neural networks
A simple deep learning model architecture
Neural network training
Regression
Linear regression
Random forest regression
Other regression models
Summary
Further readings
Unsupervised Learning
Clustering
Principal component analysis
Summary
Further readings
Using Pretrained Models
How to restore a saved GoML model
Deciding when to adopt a polyglot approach
Example – invoking a Python model using os/exec
Example – invoking a Python model using HTTP
Example – deep learning using the TensorFlow API for Go
Installing TensorFlow
Import the pretrained TensorFlow model
Creating inputs to the TensorFlow model
Summary
Further readings
Deploying Machine Learning Applications
The continuous delivery feedback loop
Developing
Testing
Deployment
Dependencies
Model persistence
Monitoring
Structured logging
Capturing metrics
Feedback
Deployment models for ML applications
Infrastructure-as-a-service
Amazon Web Services
Microsoft Azure
Google Cloud
Platform-as-a-Service
Amazon Web Services
Amazon Sagemaker
Amazon AI Services
Microsoft Azure
Azure ML Studio
Azure Cognitive Services
Google Cloud
AI Platform
AI Building Blocks
Summary
Further readings
Conclusion - Successful ML Projects
When to use ML
Typical stages in a ML project
Business and data understanding
Data preparation
Modelling and evaluation
Deployment
When to combine ML with traditional code
Summary
Further readings
Other Books You May Enjoy
Leave a review - let other readers know what you think
Machine learning (ML) plays a vital part in the modern data-driven world, and has been extensively adopted in fields including financial forecasting, search, robotics, digital imaging in healthcare, and many more besides. It is a rapidly evolving field, with new algorithms and datasets being published every week, both by academics and technology companies. This book will teach you how to perform various machine learning tasks using Go in different environments.
You will learn about many important techniques that are required to develop ML applications in Go, and deploy them as production systems. The best way to develop your knowledge is with hands-on experience, so dive in and start adding ML software to your own Go applications.
This book is intended for developers and data scientists with at least a beginner-level knowledge of Go, and a vague idea of what types of problems ML aims to tackle. No advanced knowledge of Go, or a theoretical understanding of the mathematics that underpins ML, is required.
Chapter 1, Introducing Machine Learning with Go, introduces ML and the different types of ML-related problems. We will also look into the ML development life cycle, and the process of creating and taking an ML application to production.
Chapter 2, Setting Up the Development Environment, explains how to set up an environment for ML applications and Go. We will also gain an understanding of how to install an interactive environment, Jupyter, to accelerate data exploration and visualization using libraries such as Gota and gonum/plot.
Chapter 3, Supervised Learning, introduces supervised learning algorithms and demonstrates how to choose an ML algorithm, train it, and validate its predictive power on previously unseen data.
Chapter 4, Unsupervised Learning, reuses many of the techniques related to data loading and preparation that we have implemented in this book, but focuses instead on unsupervised machine learning.
Chapter 5, Using Pretrained Models, describes how to load a pretrained Go ML model and use it to generate a prediction. We will also gain an understanding of how to use HTTP to invoke ML models written in other languages, where they may reside on a different machine or even on the internet.
Chapter 6, Deploying Machine Learning Applications, covers the final stage of the ML development life cycle: taking an ML application written in Go to production.
Chapter 7, Conclusion – Successful ML Projects, takes a step back and examines ML development from a project management point of view.
The code samples, including bash scripts and installation instructions, were tested on an Ubuntu 16.04 server with 8 GB of RAM and a 500 GB SSD hard drive. A machine with similar specifications will be required.
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-with-Go-Quick-Start-Guide. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: http://www.packtpub.com/sites/default/files/downloads/9781838550356_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The go-deep library lets us build this architecture very quickly."
A block of code is set as follows:
categories := []string{"tshirt", "trouser", "pullover", "dress", "coat", "sandal", "shirt", "shoe", "bag", "boot"}
Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Create a new Notebook by clicking on New | Go:"
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
All around us, automation is changing our lives in subtle increments that live on the bleeding edge of mathematics and computer science. What do a Nest thermostat, Netflix's movie recommendations, and Google's image search algorithm have in common? Created by some of the brightest minds in today's software industry, these technologies all rely on machine learning (ML) techniques.
In February 2019, Crunchbase listed over 4,700 companies that categorized themselves as Artificial Intelligence (AI) or ML[1]. Most of these companies were at a very early stage and funded by angel investors or early-round venture capital. Yet articles published in 2017 and 2018 by Crunchbase and the UK Financial Times share a common recognition that ML is increasingly relied upon for sustained growth[2], and that its increasing maturity will lead to even more widespread applications[3], particularly if challenges around the opacity of decisions made by ML algorithms can be solved[4]. The New York Times even has a column dedicated to ML[5], a tribute to its importance in everyday life.
This book will teach a software engineer with intermediate knowledge of the Go programming language how to write and produce an ML application from concept to deployment, and beyond. We will first categorize the problems suitable for ML techniques and describe the life cycle of ML applications. Next, we will explain how to set up a development environment specifically suited for data science with the Go language. After that, we will provide a practical guide to the main ML algorithms, their implementations, and their pitfalls. We will also provide some guidance on using ML models produced using other programming languages and integrating them in Go applications. Finally, we will consider different deployment models and the elusive intersection between DevOps and data science. We will conclude with some remarks on managing ML projects from our own experience.
In our first chapter, we will introduce some fundamental concepts of Go ML applications:
What is ML?
Types of ML problems
Why write ML applications in Go?
The ML development life cycle
ML is a field at the intersection of statistics and computer science. The output of this field has been a collection of algorithms capable of operating autonomously by inferring the best decision or answer to a question from a dataset. Unlike traditional programming, where the programmer must decide the rules of the program and painstakingly encode these in the syntax of their chosen programming language, ML algorithms require only sufficient quantities of prepared data, computing power to learn from the data, and often some knowledge to tweak the algorithm's parameters to improve the final result.
The resulting systems are very flexible and can be excellent at capitalizing on patterns that human beings would miss. Imagine writing a recommender system for TV series from scratch. You might begin by defining the inputs and the outputs of the problem, then find a database of TV series that includes details such as their date of release, genre, cast, and director. Finally, you might create a score function that rates a pair of series more highly if their release dates are close, they have the same genre, share actors, or have the same director.
Given one TV series, you could then rank all other TV series by decreasing similarity score and present the first few to the user. When creating the score function, you would make judgement calls on the relative importance of the various features, such as deciding that each actor shared between two series is worth one point. This type of guesswork, also known as a heuristic, is what ML algorithms aim to do for you, saving time and improving the accuracy of the final result, especially if user preferences shift and you have to change the scoring function regularly to keep up.
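To make the hand-written approach concrete, here is a minimal sketch of such a heuristic in Go. The Series type, the weights, the two-year window for "close" release dates, and the sample catalog are all assumptions made up for illustration; none of them come from a real dataset or library.

```go
package main

import (
	"fmt"
	"sort"
)

// Series holds the hand-picked features mentioned in the text.
// The type and its fields are hypothetical.
type Series struct {
	Title    string
	Year     int
	Genre    string
	Director string
	Cast     []string
}

// score is a hand-tuned heuristic: every weight below is a guess
// made by the programmer, not something learned from data.
func score(a, b Series) float64 {
	s := 0.0
	if d := a.Year - b.Year; d >= -2 && d <= 2 {
		s += 1 // release dates are close (assumed: within two years)
	}
	if a.Genre == b.Genre {
		s += 2 // same genre
	}
	if a.Director == b.Director {
		s += 1 // same director
	}
	inA := make(map[string]bool, len(a.Cast))
	for _, actor := range a.Cast {
		inA[actor] = true
	}
	for _, actor := range b.Cast {
		if inA[actor] {
			s += 1 // one point per shared actor
		}
	}
	return s
}

// recommend ranks candidates by decreasing similarity to target
// and returns the titles of the first n.
func recommend(target Series, candidates []Series, n int) []string {
	ranked := make([]Series, len(candidates))
	copy(ranked, candidates)
	sort.SliceStable(ranked, func(i, j int) bool {
		return score(target, ranked[i]) > score(target, ranked[j])
	})
	if n > len(ranked) {
		n = len(ranked)
	}
	titles := make([]string, 0, n)
	for _, s := range ranked[:n] {
		titles = append(titles, s.Title)
	}
	return titles
}

func main() {
	// Made-up catalog entries, for illustration only.
	target := Series{"Series A", 2010, "drama", "Director X", []string{"Actor P", "Actor Q"}}
	catalog := []Series{
		{"Series B", 2011, "drama", "Director Y", []string{"Actor P"}},
		{"Series C", 2000, "comedy", "Director X", []string{"Actor R"}},
		{"Series D", 2012, "drama", "Director X", []string{"Actor P", "Actor Q"}},
	}
	fmt.Println(recommend(target, catalog, 2))
}
```

Every constant in score is a judgement call. If user tastes change, someone has to revisit those numbers by hand, which is the maintenance burden that an ML approach aims to remove.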
The distinction between the broader field of AI and ML is a murky one. While the hype surrounding ML may be relatively new[6], the history of the field began in 1959 when Arthur Samuel, a leading expert in AI, first used these words[7]. In the 1950s, ML concepts such as the perceptron and genetic algorithms were invented by the likes of Alan Turing[8] as well as Samuel himself. In the following decades, practical and theoretical difficulties in achieving general AI led to rule-based approaches such as expert systems, which did not learn from data but instead relied on rules that human experts had devised over many years, encoded in if-else statements.
In the 1990s, as it became clear that achieving AI was unlikely with existing technology, there was an increasing appetite for a narrower approach that tackled very specific problems using a combination of statistics and probability theory. This led to the development of ML as a separate field. Today, the terms ML and AI are often used interchangeably, particularly in marketing literature[9].
There are two main categories of ML algorithms: supervised learning and unsupervised learning. The decision of which type of algorithm to use depends on the data you have available and the project objectives.
Supervised learning problems aim to infer the best mapping between an input and output dataset based on provided labeled pairs of input/output. The labeled dataset acts as feedback for the algorithm, allowing it to gauge the optimality of its solution. For example, given a list of mean yearly crude oil prices from 2010-2018, you may wish to predict the mean yearly crude oil price of 2019. The error that the algorithm makes on the 2010-2018 years will allow the engineer to estimate its error on the target prediction year of 2019.
Given a labeled collection of handwritten digits, you may wish to predict the label of a previously unseen handwritten digit. Similarly, given a dataset of emails that are labeled as being either spam or not spam, a company that wants to create a spam filter would want to predict whether a previously unseen message was spam. All these problems are supervised learning problems.
Supervised ML problems can be further divided into prediction and classification:
Classification attempts to label an unknown input sample with a known output value. For example, you could train an algorithm to recognize breeds of cats. The algorithm would classify an unknown cat by labeling it with a known breed.
By contrast, prediction algorithms attempt to label an unknown input sample with either a known or unknown output value. This is also known as estimation or regression. A canonical prediction problem is time series forecasting, where the output value of the series is predicted for a time value that was not previously seen.
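The defining property of classification, that the answer is always one of the labels already seen in training, can be sketched with a one-nearest-neighbor classifier. This is just one simple classification technique chosen for brevity; the Sample type, the two numeric features, and the example measurements are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// Sample is a labeled training example. The type is hypothetical.
type Sample struct {
	Features []float64 // assumed here: weight in kg, coat length in cm
	Label    string    // a known breed
}

// classify labels x with the label of the closest training sample,
// measured by squared Euclidean distance (one-nearest-neighbor).
// The result is always one of the labels present in train.
func classify(train []Sample, x []float64) string {
	best := ""
	bestDist := math.Inf(1)
	for _, s := range train {
		var d float64
		for i := range x {
			diff := s.Features[i] - x[i]
			d += diff * diff
		}
		if d < bestDist {
			bestDist = d
			best = s.Label
		}
	}
	return best
}

func main() {
	// Made-up measurements, for illustration only.
	train := []Sample{
		{[]float64{4.0, 2.0}, "siamese"},
		{[]float64{8.0, 7.0}, "maine coon"},
		{[]float64{5.0, 9.0}, "persian"},
	}
	unknown := []float64{7.5, 6.0}
	fmt.Println(classify(train, unknown))
}
```

A regression model, by contrast, returns a number that need not appear anywhere in the training data, such as a price for a year that has not yet been observed.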
We will cover supervised algorithms in more detail in Chapter 3, Supervised Learning.
