Machine learning as a field promises to bring increased intelligence to software by helping us learn and analyze information efficiently and discover patterns that humans cannot. This book will be your guide as you embark on an exciting journey into machine learning using the popular Swift language.
We'll start with machine learning basics in the first section of the book to develop a lasting intuition about fundamental machine learning concepts. The second section explores various supervised and unsupervised statistical learning techniques and how to implement them in Swift, while the third section walks you through deep learning techniques with the help of typical real-world cases. In the last section, we dive into hardcore topics such as model compression and GPU acceleration, and provide some recommendations to avoid common mistakes during machine learning application development.
By the end of the book, you'll be able to develop intelligent applications written in Swift that can learn for themselves.
Copyright © 2018 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Commissioning Editor: Veena Pagare
Acquisition Editor: Vinay Argekar
Content Development Editor: Mayur Pawanikar
Technical Editor: Dinesh Pawar
Copy Editors: Vikrant Phadkay, Safis Editing
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Graphics: Tania Dutta
Production Coordinator: Arvindkumar Gupta
First published: February 2018
Production reference: 1270218
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-78712-151-5
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Alexander Sosnovshchenko has been working as an iOS software engineer since 2012. Later he made his foray into data science, from the first experiments with mobile machine learning in 2014, to complex deep learning solutions for detecting anomalies in video surveillance data. He lives in Lviv, Ukraine, and has a wife and a daughter.
Jojo Moolayil is an artificial intelligence, deep learning, and machine learning professional with over 5 years of experience, and the author of Smarter Decisions – The Intersection of Internet of Things and Decision Science. He works with GE and lives in Bengaluru, India. He has also been a technical reviewer for various books on machine learning, deep learning, and business analytics with Apress and Packt.
Cecil Costa, also known as Eduardo Campos in Latin American countries, is a Euro-Brazilian freelance developer who has been learning about computers since he got his first PC in 1990. Learning is his passion, and so is teaching; this is why he works as a trainer. He has organized both on-site and online courses for companies. He is also the author of a few Swift books.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Machine Learning with Swift
Packt Upsell
Why subscribe?
PacktPub.com
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Getting Started with Machine Learning
What is AI?
The motivation behind ML
What is ML?
Applications of ML
Digital signal processing (DSP)
Computer vision
Natural language processing (NLP)
Other applications of ML
Using ML to build smarter iOS applications
Getting to know your data
Features
Types of features
Choosing a good set of features
Getting the dataset
Data preprocessing
Choosing a model
Types of ML algorithms
Supervised learning
Unsupervised learning
Reinforcement learning
Mathematical optimization – how learning works
Mobile versus server-side ML
Understanding mobile platform limitations
Summary
Bibliography
Classification – Decision Tree Learning
Machine learning toolbox
Prototyping the first machine learning app
Tools
Setting up a machine learning environment
IPython notebook crash course
Time to practice
Machine learning for extra-terrestrial life explorers
Loading the dataset
Exploratory data analysis
Data preprocessing
Converting categorical variables
Separating features from labels
One-hot encoding
Splitting the data
Decision trees everywhere
Training the decision tree classifier
Tree visualization
Making predictions
Evaluating accuracy
Tuning hyperparameters
Understanding model capacity trade-offs
How decision tree learning works
Building a tree automatically from data
Combinatorial entropy
Evaluating performance of the model with data
Precision, recall, and F1-score
K-fold cross-validation
Confusion matrix
Implementing first machine learning app in Swift
Introducing Core ML
Core ML features
Exporting the model for iOS
Ensemble learning random forest
Training the random forest
Random forest accuracy evaluation
Importing the Core ML model into an iOS project
Evaluating performance of the model on iOS
Calculating the confusion matrix
Decision tree learning pros and cons
Summary
K-Nearest Neighbors Classifier
Calculating the distance
DTW
Implementing DTW in Swift
Using instance-based models for classification and clustering
People motion recognition using inertial sensors
Understanding the KNN algorithm
Implementing KNN in Swift
Recognizing human motion using KNN
Cold start problem
Balanced dataset
Choosing a good k
Reasoning in high-dimensional spaces
KNN pros
KNN cons
Improving our solution
Probabilistic interpretation
More data sources
Smarter time series chunking
Hardware acceleration
Trees to speed up the inference
Utilizing state transitions
Summary
Bibliography
K-Means Clustering
Unsupervised learning
K-means clustering
Implementing k-means in Swift
Update step
Assignment step
Clustering objects on a map
Choosing the number of clusters
K-means clustering – problems
K-means++
Image segmentation using k-means
Summary
Association Rule Learning
Seeing association rules
Defining data structures
Using association measures to assess rules
Supporting association measures
Confidence association measures
Lift association measures
Conviction association measures
Decomposing the problem
Generating all possible rules
Finding frequent item sets
The Apriori algorithm
Implementing Apriori in Swift
Running Apriori
Running Apriori on real-world data
The pros and cons of Apriori
Building an adaptable user experience
Summary
Bibliography
Linear Regression and Gradient Descent
Understanding the regression task
Introducing simple linear regression
Fitting a regression line using the least squares method
Where to use GD and normal equation
Using gradient descent for function minimization
Forecasting the future with simple linear regression
Feature scaling
Feature standardization
Multiple linear regression
Implementing multiple linear regression in Swift
Gradient descent for multiple linear regression
Training multiple regression
Linear algebra operations
Feature-wise standardization
Normal equation for multiple linear regression
Understanding and overcoming the limitations of linear regression
Fixing linear regression problems with regularization
Ridge regression and Tikhonov regularization
LASSO regression
ElasticNet regression
Summary
Bibliography
Linear Classifier and Logistic Regression
Revisiting the classification task
Linear classifier
Logistic regression
Implementing logistic regression in Swift
The prediction part of logistic regression
Training the logistic regression
Cost function
Predicting user intents
Handling dates
Choosing the regression model for your problem
Bias-variance trade-off
Summary
Neural Networks
What are artificial NNs anyway?
Building the neuron
Non-linearity function
Step-like activation functions
Rectifier-like activation functions
Building the network
Building a neural layer in Swift
Using neurons to build logical functions
Implementing layers in Swift
Training the network
Vanishing gradient problem
Seeing biological analogies
Basic neural network subroutines (BNNS)
BNNS example
Summary
Convolutional Neural Networks
Understanding users' emotions
Introducing computer vision problems
Introducing convolutional neural networks
Pooling operation
Convolution operation
Convolutions in CNNs
Building the network
Input layer
Convolutional layer
Fully-connected layers
Nonlinearity layers
Pooling layer
Regularization layers
Dropout
Batch normalization
Loss functions
Training the network
Training the CNN for facial expression recognition
Environment setup
Deep learning frameworks
Keras
Loading the data
Splitting the data
Data augmentation
Creating the network
Plotting the network structure
Training the network
Plotting loss
Making predictions
Saving the model in HDF5 format
Converting to Core ML format
Visualizing convolution filters
Deploying CNN to iOS
Summary
Bibliography
Natural Language Processing
NLP in the mobile development world
Word Association game
Python NLP libraries
Textual corpuses
Common NLP approaches and subtasks
Tokenization
Stemming
Lemmatization
Part-of-speech (POS) tagging
Named entity recognition (NER)
Removing stop words and punctuation
Distributional semantics hypothesis
Word vector representations
Autoencoder neural networks
Word2Vec
Word2Vec in Gensim
Vector space properties
iOS application
Chatbot anatomy
Voice input
NSLinguisticTagger and friends
Word2Vec on iOS
Text-to-speech output
UIReferenceLibraryViewController
Putting it all together
Word2Vec friends and relatives
Where to go from here?
Summary
Machine Learning Libraries
Machine learning and AI APIs
Libraries
General-purpose machine learning libraries
AIToolbox
BrainCore
Caffe
Caffe2
dlib
FANN
LearnKit
MLKit
Multilinear-math
MXNet
Shark
TensorFlow
tiny-dnn
Torch
YCML
Inference-only libraries
Keras
LibSVM
Scikit-learn
XGBoost
NLP libraries
Word2Vec
Twitter text
Speech recognition
TLSphinx
OpenEars
Computer vision
OpenCV
ccv
OpenFace
Tesseract
Low-level subroutine libraries
Eigen
fmincg-c
IntuneFeatures
SigmaSwiftStatistics
STEM
Swix
LibXtract
libLBFGS
NNPACK
Upsurge
YCMatrix
Choosing a deep learning framework
Summary
Optimizing Neural Networks for Mobile Devices
Delivering perfect user experience
Calculating the size of a convolutional neural network
Lossless compression
Compact CNN architectures
SqueezeNet
MobileNets
ShuffleNet
CondenseNet
Preventing a neural network from growing big
Lossy compression
Optimizing for inference
Network pruning
Weights quantization
Reducing precision
Other approaches
Facebook's approach in Caffe2
Knowledge distillation
Tools
An example of the network compression
Summary
Bibliography
Best Practices
Mobile machine learning project life cycle
Preparatory stage
Formulate the problem
Define the constraints
Research the existing approaches
Research the data
Make design choices
Prototype creation
Data preprocessing
Model training, evaluation, and selection
Field testing
Porting or deployment for a mobile platform
Production
Best practices
Benchmarking
Privacy and differential privacy
Debugging and visualization
Documentation
Machine learning gremlins
Data kobolds
Tough data
Biased data
Batch effects
Goblins of training
Product design ogres
Magical thinking
Cargo cult
Feedback loops
Uncanny valley effect
Recommended learning resources
Mathematical background
Machine learning
Computer vision
NLP
Summary
Machine learning, as a field, promises to bring increasing intelligence to software by helping us learn and analyze information efficiently and discover certain things that humans cannot. We'll start by developing lasting intuition about the fundamental machine learning concepts in the first section. We'll explore various supervised and unsupervised learning techniques in the second section. Then the third section will walk you through deep learning techniques with the help of common real-world cases. In the last section, we'll dive into hardcore topics such as model compression and GPU acceleration, and provide some recommendations to avoid common mistakes during machine learning application development. By the end of the book, you'll be able to develop intelligent applications written in Swift that can learn for themselves.
This book is for iOS developers who wish to create intelligent iOS applications, and data science professionals who are interested in performing machine learning using Swift. Familiarity with some basic Swift programming is all you need to get started with this book.
Chapter 1, Getting Started with Machine Learning, teaches the main concepts of machine learning.
Chapter 2, Classification – Decision Tree Learning, builds our first machine learning application.
Chapter 3, K-Nearest Neighbors Classifier, continues exploring classification algorithms, and we learn about instance-based learning algorithms.
Chapter 4, K-Means Clustering, continues with instance-based algorithms, this time focusing on an unsupervised clustering task.
Chapter 5, Association Rule Learning, explores unsupervised learning more deeply.
Chapter 6, Linear Regression and Gradient Descent, returns to supervised learning, but this time we switch our attention from non-parametric models, such as KNN and k-means, to parametric linear models.
Chapter 7, Linear Classifier and Logistic Regression, continues by building different, more complex models on top of linear regression: polynomial regression, regularized regression, and logistic regression.
Chapter 8, Neural Networks, implements our first neural network.
Chapter 9, Convolutional Neural Networks, continues NNs, but this time we focus on convolutional NNs, which are especially popular in the computer vision domain.
Chapter 10, Natural Language Processing, explores the amazing world of human natural language. We're also going to use neural networks to build several chatbots with different personalities.
Chapter 11, Machine Learning Libraries, overviews existing iOS-compatible libraries for machine learning.
Chapter 12, Optimizing Neural Networks for Mobile Devices, talks about deep neural network deployment on mobile platforms.
Chapter 13, Best Practices, discusses a machine learning app's life cycle, common problems in AI projects, and how to solve them.
You will need the following software to be able to smoothly sail through this book:
Homebrew 1.3.8 +
Python 2.7.x
pip 9.0.1+
Virtualenv 15.1.0+
IPython 5.4.1+
Jupyter 1.0.0+
SciPy 0.19.1+
NumPy 1.13.3+
Pandas 0.20.2+
Matplotlib 2.0.2+
Graphviz 0.8.2+
pydotplus 2.0.2+
scikit-learn 0.18.1+
coremltools 0.6.3+
Ruby (default macOS version)
Xcode 9.2+
Keras 2.0.6+ with TensorFlow 1.1.0+ backend
keras-vis 0.4.1+
NumPy 1.13.3+
NLTK 3.2.4+
Gensim 2.1.0+
OS required:
macOS High Sierra 10.13.3+
iOS 11+ or simulator
You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at
www.packtpub.com
.
Select the
SUPPORT
tab.
Click on
Code Downloads & Errata
.
Enter the name of the book in the
Search
box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-with-Swift. In case there's an update to the code, it will be updated on the existing GitHub repository. The author has also hosted the code bundle on his own GitHub repository at https://github.com/alexsosn/SwiftMLBook.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/MachineLearningwithSwift_ColorImages.pdf.
Feedback from our readers is always welcome.
General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packtpub.com.
We live in exciting times. Artificial intelligence (AI) and machine learning (ML) went from obscure mathematical and science fiction topics to become a part of mass culture. Google, Facebook, Microsoft, and others competed to become the first to give the world general AI. In November 2015, Google open sourced its ML framework, TensorFlow, which is suitable for running on supercomputers as well as smartphones, and it has since won a broad community. Shortly afterwards, other big companies followed its example. The best iOS app of 2016 (Apple's choice), the viral photo editor Prisma, owes its success entirely to a particular kind of ML algorithm: the convolutional neural network (CNN). These systems were invented back in the nineties but became popular only in the noughties, and mobile devices gained enough computational power to run them only around 2014-2015. In fact, artificial neural networks became so important for practical applications that in iOS 10 Apple added native support for them in the Metal and Accelerate frameworks. Apple also opened Siri to third-party developers and introduced GameplayKit, a framework for adding AI capabilities to computer games. In iOS 11, Apple introduced Core ML, a framework for running pre-trained models on users' devices, and the Vision framework for common computer vision tasks.
The best time to start learning about ML was 10 years ago. The next best time is right now.
In this chapter, we will cover the following topics:
Understanding what AI and ML are
Fundamental concepts of ML: model, dataset, and learning
Types of ML tasks
ML project life cycle
General-purpose ML versus mobile ML
AI is a field of knowledge about building intelligent machines, whatever meaning you assign to the word intelligence. There are two different AI notions among researchers: strong AI and weak AI.
Strong AI, or artificial general intelligence (AGI), is a machine that is fully capable of imitating human-level intelligence, including consciousness, feelings, and mind. Presumably, it should be able to apply its intelligence successfully to any task. This type of AI is like a horizon: we always see it as a goal, but we are still not there, despite all our struggles. A significant role here is played by the AI effect: things that were considered features of strong AI yesterday are taken for granted as trivial today. In the sixties, people believed that playing board games like chess was a characteristic of strong AI. Today, we have programs that outperform the best human chess players, but we are still far from strong AI. Our iPhones would probably count as AI from an eighties perspective: you can talk to them, and they can answer your questions and deliver information on any topic in seconds. So, keeping strong AI as a distant goal, researchers focus on things at hand and call them weak AI: systems that have some features of intelligence and can be applied to narrow tasks. Among those tasks are automated reasoning, planning, creativity, communication with humans, perception of the surrounding world, robotics, and emotion simulation. We will touch on some of these tasks in this book, but mostly we will focus on ML, because this domain of AI has found many practical applications on mobile platforms in recent years.
Let's start with an analogy. There are two ways of learning an unfamiliar language:
Learning the language rules by heart, using textbooks, dictionaries, and so on. That's how college students usually do it.
Observing live language: by communicating with native speakers, reading books, and watching movies. That's how children do it.
In both cases, you build a model of the language in your mind or, as some prefer to say, develop a sense of the language.
In the first case, you are trying to build a logical system based on rules. In this case, you will encounter many problems: the exceptions to the rule, different dialects, borrowing from other languages, idioms, and lots more. Someone else, not you, derived and described for you the rules and structure of the language.
In the second case, you derive the same rules from the available data. You may not even be aware of the existence of these rules, but you gradually adjust yourself to the hidden structure and understand the laws. You use special brain cells called mirror neurons, trying to mimic native speakers, an ability honed by millions of years of evolution. After some time, when facing incorrect word usage, you just feel that something is wrong, but you can't immediately tell what exactly.
In any case, the next step is to apply the resulting language model in the real world. Results may differ. In the first case, you will struggle every time you encounter a missing hyphen or comma, but you may be able to get a job as a proofreader at a publishing house. In the second case, everything will depend on the quality, diversity, and amount of the data you were trained on. Just imagine a person in the center of New York who learned English only through Shakespeare. Would he be able to have a normal conversation with the people around him?
Now let's put a computer in place of the person in our example. The two approaches, in this case, represent two programming techniques. The first one corresponds to writing ad hoc algorithms consisting of conditions, loops, and so on, through which a programmer expresses rules and structures. The second one represents ML, in which case the computer itself identifies the underlying structure and rules based on the available data.
The analogy is deeper than it seems at first glance. For many tasks, building the algorithms directly is impossibly hard because of the variability of the real world. It may require the work of domain experts who describe all rules and edge cases explicitly, and the resulting models can be fragile and rigid. On the other hand, the same task can often be solved by allowing computers to figure out the rules on their own from a reasonable amount of data. An example of such a task is face recognition. It's virtually impossible to formalize face recognition in terms of conventional imperative algorithms and data structures; only recently was the task successfully solved with the help of ML.
ML is a subdomain of AI that has demonstrated significant progress over the last decade and remains a hot research topic. It is a branch of knowledge concerned with building algorithms that can learn from data and improve themselves with regard to the tasks they perform. ML allows computers to deduce the algorithm for some task, or to extract hidden patterns from data. ML is known by several different names in different research communities: predictive analytics, data mining, statistical learning, pattern recognition, and so on. One can argue that these terms have subtle differences, but essentially, they all overlap to the extent that you can use the terminology interchangeably.
ML is already everywhere around us. Search engines, targeted ads, face and voice recognition, recommender systems, spam filtration, self-driven cars, fraud detection in bank systems, credit scoring, automated video captioning, and machine translation—all these things are impossible to imagine without ML these days.
Over recent years, ML has owed its success to several factors:
The abundance of data in different forms (big data)
Accessible computational power and specialized hardware (clouds and GPUs)
The rise of open source and open access
Algorithmic advances
Any ML system includes three essential components: data, model, and task. The data is something you provide as an input to your model. A model is a type of mathematical function or computer program that performs the task. For instance, your emails are data, a spam filter is a model, and telling spam apart from non-spam is a task. The learning in ML stands for the process of adjusting your model to the data so that the model becomes better at its task. The obvious consequence of this setup is expressed in a piece of wisdom well known among statisticians: "Your model is only as good as your data."
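To make these three components tangible in Swift, here is a minimal sketch; the protocol and the keyword-based filter are invented for illustration, not any framework's API:

```swift
// Sketch: a "model" is anything that maps the task's input data to an output.
protocol Model {
    associatedtype Input
    associatedtype Output
    func predict(_ input: Input) -> Output
}

// A trivial (untrained) spam filter: the data is email text,
// and the task is telling spam apart from non-spam.
struct KeywordSpamFilter: Model {
    let spamWords: Set<String> // in a real system, these would be learned from data

    func predict(_ email: String) -> Bool {
        email.lowercased()
            .split(separator: " ")
            .contains { spamWords.contains(String($0)) }
    }
}

let filter = KeywordSpamFilter(spamWords: ["prize", "winner", "lottery"])
print(filter.predict("Claim your prize now")) // true
```

Learning, in this picture, would be the process of filling in spamWords (the model's parameters) from example emails instead of hand-coding them.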
There are many domains where ML is an indispensable ingredient; some of them are robotics, bioinformatics, and recommender systems. While nothing prevents you from writing bioinformatics software in Swift for macOS or Linux, we will restrict the practical examples in this book to more mobile-friendly domains. The apparent reason for this is that, currently, iOS remains the primary target platform for most of the programmers who use Swift on a day-to-day basis.
For the sake of convenience, we'll roughly divide all ML applications of interest to mobile developers into three-plus-one areas, according to the data types they deal with most commonly:
Digital signal processing (sensor data, audio)
Computer vision (images, video)
Natural language processing (texts, speech)
Other applications and datatypes
This category includes tasks where the input data types are signals, time series, and audio. The sources of this data are sensors, HealthKit, the microphone, wearable devices (for example, the Apple Watch or brain-computer interfaces), and IoT devices. Examples of ML problems here include:
Motion sensor data classification for activity recognition
Speech recognition and synthesis
Music recognition and synthesis
Biological signals (ECG, EEG, and hand tremor) analysis
We will build a motion recognition app in Chapter 3, K-Nearest Neighbors Classifier.
Everything related to images and videos falls into this category. We will develop some computer vision apps in Chapter 9, Convolutional Neural Networks. Examples of computer vision tasks are:
Optical character recognition (OCR) and handwritten input
Face detection and recognition
Image and video captioning
Image segmentation
3D-scene reconstruction
Generative art (artistic style transfer, Deep Dream, and so on)
NLP is a branch of knowledge at the intersection of linguistics, computer science, and statistics. We'll talk about the most common NLP techniques in Chapter 10, Natural Language Processing. Applications of NLP include the following:
Automated translation, spelling, grammar, and style correction
Sentiment analysis
Spam detection/filtering
Document categorization
Chatbots and question answering systems
You can come up with many more applications that are hard to categorize. ML can be done on virtually any data if you have enough of it. Some peculiar data types are:
Spatial data: GPS location (Chapter 4, K-Means Clustering), coordinates of UI objects and touches
Tree-like structures: hierarchy of folders and files
Network-like data: occurrences of people together in your photos, or hyperlinks between web pages
Application logs and user in-app activity data (Chapter 5, Association Rule Learning)
System data: free disk space, battery level, and similar
Survey results
As we know from press reports, Apple uses ML for fraud detection and to mine useful data from beta testing reports; however, these are not examples visible on our mobile devices. Your iPhone itself has a handful of ML models built into its operating system and native apps, helping to perform a wide range of tasks. Some use cases are well known and prominent, while others are inconspicuous. The most obvious examples are Siri's speech recognition, natural language understanding, and voice generation. The Camera app uses face detection for focusing, and the Photos app uses face recognition to group photos of the same person into one album. Presenting the new iOS 10 in June 2016, Craig Federighi mentioned its predictive keyboard, which uses an LSTM algorithm (a type of recurrent neural network) to suggest the next word from the context, and also how Photos uses deep learning to recognize objects and classify scenes. iOS itself uses ML to extend battery life, provide contextual suggestions, match profiles from social networks and mail with records in Contacts, and choose between internet connection options. On the Apple Watch, ML models are employed to recognize the user's motion activity types and handwritten input.
Prior to iOS 10, Apple provided some ML APIs, such as speech or movement recognition, but only as black boxes, without the possibility to tune the models or reuse them for other purposes. If you wanted to do something slightly different, like detecting a type of motion not predefined by Apple, you had to build your own models from scratch. In iOS 10, CNN building blocks were added to two frameworks at once: as a part of the Metal API, and as a sublibrary of the Accelerate framework. Also, the first actual ML algorithm was introduced to the iOS SDK: the decision tree learner in GameplayKit.
ML capabilities continued to expand with the release of iOS 11. At WWDC 2017, Apple presented the Core ML framework. It includes an API for running pre-trained models and is accompanied by tools for converting models trained with some popular ML frameworks into Apple's own format. Still, for now it doesn't provide the possibility of training models on the device, so your models can't be changed or updated at runtime.
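As a quick illustration of running a pre-trained model, here is a hedged sketch using Core ML's generic MLModel API; the model name SpamClassifier and the feature names text and label are assumptions made for illustration (in practice, Xcode also generates a typed wrapper class for every .mlmodel you add to a project):

```swift
import CoreML

do {
    // Compiled Core ML models are bundled as .mlmodelc directories.
    guard let url = Bundle.main.url(forResource: "SpamClassifier", withExtension: "mlmodelc") else {
        fatalError("Model not found in the app bundle")
    }
    let model = try MLModel(contentsOf: url)
    // Wrap the raw input features in a generic feature provider...
    let input = try MLDictionaryFeatureProvider(dictionary: ["text": "win a free prize"])
    // ...and run a single prediction.
    let output = try model.prediction(from: input)
    print(output.featureValue(for: "label") ?? "unknown")
} catch {
    print("Prediction failed: \(error)")
}
```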
Looking in the App Store for terms like artificial intelligence, deep learning, and ML, you'll find a lot of applications, some of them quite successful. Here are several examples:
Google Translate does speech recognition and synthesis, OCR, handwriting recognition, and automated translation; some of this is done offline, and some online.
Duolingo validates pronunciation, recommends optimal study materials, and employs Chatbots for language study.
Prisma, Artisto, and others turn photos into paintings using a neural artistic style transfer algorithm. Snapchat and Fabby use image segmentation, object tracking, and other computer vision techniques to enhance selfies. There are also applications for coloring black and white photos automatically.
Snapchat's video selfie filters use ML for real-time face tracking and modification.
Aipoly Vision helps blind people, saying aloud what it sees through the camera.
Several calorie counter apps recognize food through a camera. There are also similar apps to identify dog breeds, trees and trademarks.
Dozens of AI personal assistants and chatbots, with capabilities ranging from cow disease diagnostics to matchmaking and stock trading.
Predictive keyboards, spellcheckers, and auto correction, for instance, SwiftKey.
Games that learn from their users and games with evolving characters/units.
There are also news, mail, and other apps that adapt to users' habits and preferences using ML.
With the help of ML, brain-computer interfaces and fitness wearables recognize different user conditions, such as concentration and sleep phases. At least some of their supplementary mobile apps do ML.
Medical diagnostics and monitoring through mobile health applications. For example, OneRing monitors Parkinson's disease using data from a wearable device.
All these applications are built upon extensive data collection and processing. Even if the application itself does not collect the data, the model it uses was trained on some (usually big) dataset. In the following section, we will discuss all things related to data in ML applications.
For many years, researchers argued about what is more important: data or algorithms. But now it looks like the importance of data over algorithms is generally accepted among ML specialists. In most cases, we can assume that whoever has better data usually beats those with more advanced algorithms. Garbage in, garbage out: this rule holds true in ML more than anywhere else. To succeed in this domain, you need not only to have data, but also to know your data and what to do with it.
ML datasets are usually composed of individual observations, called samples, cases, or data points. In the simplest case, each sample has several features.
When we talk about features in the context of ML, what we mean is some characteristic property of the object or phenomenon we are investigating.
Features are used to distinguish objects from each other and to measure the similarity between them.
For instance:
If the objects of our interest are books, features could be a title, page count, author's name, a year of publication, genre, and so on
If the objects of interest are images, features could be intensities of each pixel
If the objects are blog posts, features could be language, length, or presence of some terms
Table 1.1: an example of an ML dataset (dummy books):
| Title | Author's name | Pages | Year | Genre | Average readers review score | Publisher | In stock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Learn ML in 21 Days | Machine Learner | 354 | 2018 | Sci-Fi | 3.9 | Untitled United | False |
| 101 Tips to Survive an Asteroid Impact | Enrique Drills | 124 | 2021 | Self-help | 4.7 | Vacuum Books | True |
| Sleeping on the Keyboard | Jessica's Cat | 458 | 2014 | Non-fiction | 3.5 | JhGJgh Inc. | True |
| Quantum Screwdriver: Heritage | Yessenia Purnima | 1550 | 2018 | Sci-Fi | 4.2 | Vacuum Books | True |
In the books example, you can see several types of features:
Categorical or unordered: Title, author, genre, publisher. They are similar to enumerations without raw values in Swift, but with one difference: they have levels instead of cases. Importantly, you can't order them or say that one is bigger than another.
Binary: The presence or absence of something, just true or false. In our case, the In stock feature.
Real numbers: Page count, year, average reader's review score. These can be represented as floats or doubles.
There are others, but these are by far the most common.
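To make the analogy with Swift types concrete, here is a small sketch (the type names are invented for illustration) showing how one sample from Table 1.1 maps onto these feature types:

```swift
// Categorical features resemble Swift enumerations without raw values:
enum Genre {
    case sciFi, selfHelp, nonFiction
}

// One sample (row) of the dummy-books dataset:
struct Book {
    let title: String     // categorical (unordered)
    let author: String    // categorical (unordered)
    let pages: Int        // real number
    let year: Int         // real number
    let genre: Genre      // categorical
    let score: Double     // real number
    let publisher: String // categorical
    let inStock: Bool     // binary
}

let sample = Book(title: "Learn ML in 21 Days", author: "Machine Learner",
                  pages: 354, year: 2018, genre: .sciFi, score: 3.9,
                  publisher: "Untitled United", inStock: false)
```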
The most common ML algorithms require the dataset to consist of a number of samples, where each sample is represented by a vector of real numbers (a feature vector), and all samples have the same number of features. The simplest (but not the best) way of translating categorical features into real numbers is to replace them with numerical codes (Table 1.2).
Table 1.2: the dummy books dataset after simple preprocessing:

| Title | Author's name | Pages | Year | Genre | Average readers review score | Publisher | In stock |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0.0 | 0.0 | 354.0 | 2018.0 | 0.0 | 3.9 | 0.0 | 0.0 |
| 1.0 | 1.0 | 124.0 | 2021.0 | 1.0 | 4.7 | 1.0 | 1.0 |
| 2.0 | 2.0 | 458.0 | 2014.0 | 2.0 | 3.5 | 2.0 | 1.0 |
| 3.0 | 3.0 | 1550.0 | 2018.0 | 0.0 | 4.2 | 1.0 | 1.0 |
This is an example of how your dataset may look before you feed it into your ML algorithm. Later, we will discuss the nuts and bolts of data preprocessing for specific applications.
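As a sketch of this preprocessing step (a hand-rolled label encoder, not any library's API), the codes in Table 1.2 can be produced by assigning each distinct category level a number in order of first appearance:

```swift
// A minimal label encoder: each distinct category level gets a numeric code
// in order of first appearance, as in Table 1.2.
struct LabelEncoder {
    private var codes: [String: Double] = [:]

    mutating func encode(_ level: String) -> Double {
        if let code = codes[level] { return code }
        let code = Double(codes.count)
        codes[level] = code
        return code
    }
}

var genreEncoder = LabelEncoder()
var encodedGenres: [Double] = []
for genre in ["Sci-Fi", "Self-help", "Non-fiction", "Sci-Fi"] {
    encodedGenres.append(genreEncoder.encode(genre))
}
print(encodedGenres) // [0.0, 1.0, 2.0, 0.0] — matches the Genre column above
```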
For ML purposes, it's necessary to choose a reasonable set of features, not too many and not too few:
If you have too few features, the information may not be sufficient for your model to achieve the required quality. In this case, you may want to construct new features from existing ones, or extract more features from the raw data.
If you have too many features, you want to select only the most informative and discriminative ones, because the more features you have, the more complex your computations become.
How do you tell which features are most important? Sometimes common sense helps. For example, if you are building a model that recommends books, the genre and average rating of a book are perhaps more important features than the number of pages or the year of publication. But what if your features are just the pixels of a picture and you're building a face recognition system? For a black-and-white image of size 1024 x 768, you get 786,432 features. Which pixels are the most important? In this case, you have to apply some algorithms to extract meaningful features. For example, in computer vision, edges, corners, and blobs are more informative features than raw pixels, so there are plenty of algorithms to extract them (Figure 1.1). By passing your image through some filters, you can get rid of unimportant information and reduce the number of features significantly: from hundreds of thousands to hundreds, or even tens. The techniques that help to select the most important subset of features are known as feature selection, while feature extraction techniques result in the creation of new features.
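To give a flavor of such filtering, here is a toy sketch (the image and kernel values are invented for illustration) of a 3 x 3 edge-detecting convolution over a tiny grayscale image; real computer vision pipelines use optimized libraries for this:

```swift
// A tiny grayscale "image": a dark left half and a bright right half.
let image: [[Double]] = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
// A horizontal-gradient (Sobel-like) kernel that responds to vertical edges.
let kernel: [[Double]] = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

// Slide the kernel over every interior pixel and sum the weighted neighborhood.
var edges: [[Double]] = []
for y in 1..<image.count - 1 {
    var row: [Double] = []
    for x in 1..<image[y].count - 1 {
        var sum = 0.0
        for ky in 0..<3 {
            for kx in 0..<3 {
                sum += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            }
        }
        row.append(sum)
    }
    edges.append(row)
}
print(edges) // strong responses where the intensity changes, near zero elsewhere
```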
Feature extraction, selection, and combination is a kind of art known as feature engineering. It requires not only hacking and statistical skills, but also domain knowledge. We will see some feature engineering techniques while working on practical applications in the following chapters. We will also step into the exciting world of deep learning: a technique that gives a computer the ability to extract high-level abstract features from low-level ones.
The number of features you have for each sample (the length of the feature vector) is usually referred to as the dimensionality of the problem. Many problems are high-dimensional, with hundreds or even thousands of features. Even worse, some of those problems are sparse; that is, for each data point, most of the features are zero or missing. This is a common situation in recommender systems. For instance, imagine yourself building a dataset of movie ratings: the rows are movies and the columns are users, and each cell holds the rating given by that user to that movie. The majority of the cells will remain empty, as most users will never have watched most of the movies. The opposite situation, when most values are in place, is called dense. Many problems in natural language processing and bioinformatics are high-dimensional, sparse, or both.
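A common way to store such sparse data (a generic sketch, not any particular library's format) is to keep only the non-zero entries, for example in a dictionary keyed by feature index:

```swift
// Dense representation: one value per feature, mostly zeros for a sparse problem.
let denseRatings: [Double] = [0, 0, 4.5, 0, 0, 0, 3.0, 0]

// Sparse representation: store only the non-zero ratings, keyed by movie index.
let sparseRatings: [Int: Double] = [2: 4.5, 6: 3.0]

// Reading a single feature is a dictionary lookup with a default of zero:
let rating = sparseRatings[3, default: 0.0] // 0.0 — this user never rated movie 3
```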
Feature selection and extraction help to decrease the number of features without significant loss of information, so we also call them dimensionality reduction algorithms.
Datasets can be obtained from different sources. The ones important for us are:
Classical datasets, such as Iris (botanical measurements of flowers compiled by R. Fisher in 1936), MNIST (60,000 handwritten digits published in 1998), Titanic (personal information of Titanic passengers from Encyclopedia Titanica and other sources), and others. Many classical datasets are available as part of Python and R ML packages. They represent some classical types of ML tasks and are useful for demonstrating algorithms. Meanwhile, there is no similar library for Swift; implementing one would be straightforward, and it's a low-hanging fruit for anyone who wants to get some stars on GitHub.
Open and commercial dataset repositories. Many institutions release their data publicly under different licenses. You can use such data for training production models or while collecting your own dataset.
Some public dataset repositories include:
The UCI ML repository: https://archive.ics.uci.edu/ml/datasets.html
Kaggle datasets: https://www.kaggle.com/datasets
data.world, a social network for dataset sharing: https://data.world
Data collection (acquisition) is required if no existing data can help you to solve your problem.
This approach can be costly in both resources and time if you have to collect the data ad hoc; however, in many cases, you get data as a byproduct of some other process, and you can compose your dataset by extracting useful information from that data. For example, text corpuses can be composed by crawling Wikipedia or news sites. iOS automatically collects some useful data: HealthKit is a unified database of the user's health measurements, Core Motion allows you to get historical data on the user's motion activities, the ResearchKit framework provides standardized routines to assess the user's health conditions, and the CareKit framework standardizes patient surveys. Also, in some cases, useful information can be obtained from app log mining.
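As a closing illustration of this built-in data collection, here is a hedged sketch of querying Core Motion for the last 24 hours of the user's motion-activity history; it runs only on a real device with motion usage permission, and the 24-hour window and the walking filter are arbitrary choices for the example:

```swift
import CoreMotion

let manager = CMMotionActivityManager()
if CMMotionActivityManager.isActivityAvailable() {
    let now = Date()
    let dayAgo = now.addingTimeInterval(-24 * 60 * 60)
    // Query the historical motion activities recorded by iOS.
    manager.queryActivityStarting(from: dayAgo, to: now, to: .main) { activities, error in
        guard let activities = activities else { return }
        let walking = activities.filter { $0.walking }
        print("Walking intervals in the last 24 hours: \(walking.count)")
    }
}
```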