Machine Learning with Swift - Alexander Sosnovshchenko - E-Book

Description

Machine learning as a field promises to bring increased intelligence to the software by helping us learn and analyse information efficiently and discover certain patterns that humans cannot. This book will be your guide as you embark on an exciting journey in machine learning using the popular Swift language.
We’ll start with machine learning basics in the first part of the book to develop a lasting intuition about fundamental machine learning concepts. The second part explores various supervised and unsupervised statistical learning techniques and how to implement them in Swift, while the third section walks you through deep learning techniques with the help of typical real-world cases. In the last section, we will dive into some hardcore topics such as model compression and GPU acceleration, and provide some recommendations to avoid common mistakes during machine learning application development.
By the end of the book, you'll be able to develop intelligent applications written in Swift that can learn for themselves.

You can read this e-book in the Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 383

Year of publication: 2018




Machine Learning with Swift

Artificial Intelligence for iOS

Alexander Sosnovshchenko

BIRMINGHAM - MUMBAI

Machine Learning with Swift

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Veena Pagare

Acquisition Editor: Vinay Argekar

Content Development Editor: Mayur Pawanikar

Technical Editor: Dinesh Pawar

Copy Editor: Vikrant Phadkay, Safis Editing

Project Coordinator: Nidhi Joshi

Proofreader: Safis Editing

Indexer: Pratik Shirodkar

Graphics: Tania Dutta

Production Coordinator: Arvindkumar Gupta

First published: February 2018

Production reference: 1270218

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-78712-151-5

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Mapt is fully searchable

Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the author

Alexander Sosnovshchenko has been working as an iOS software engineer since 2012. Later he made his foray into data science, from the first experiments with mobile machine learning in 2014, to complex deep learning solutions for detecting anomalies in video surveillance data. He lives in Lviv, Ukraine, and has a wife and a daughter.

Thanks to Dmitrii Vorona for moral support, invaluable advice, and code reviews; Nikolay Sosnovshchenko and Oksana Matskovich for the help with pictures of creatures and androids; David Kopec and Matthijs Hollemans for their open source projects; Mr. Jojo Moolayil for his efforts and expertise as a contributing author and reviewer; and my family for being supportive and patient.

About the reviewers

Jojo Moolayil is an artificial intelligence, deep learning, and machine learning professional with over 5 years of experience and is the author of Smarter Decisions – The Intersection of Internet of Things and Decision Science. He works with GE and lives in Bengaluru, India. He has also been a technical reviewer for various books on machine learning, deep learning, and business analytics with Apress and Packt.

I would like to thank my family, friends, and mentors.

Cecil Costa, also known as Eduardo Campos in Latin American countries, is a Euro-Brazilian freelance developer who has been learning about computers since he got his first PC in 1990. Learning is his passion, and so is teaching; this is why he works as a trainer. He has organized both on-site and online courses for companies. He is also the author of a few Swift books.

I’d like to thank Maximilian Ambergis for creating the delete key; it has been very useful for me!

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Machine Learning with Swift

Packt Upsell

Why subscribe?

PacktPub.com

Contributors

About the author

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Getting Started with Machine Learning

What is AI?

The motivation behind ML

What is ML?

Applications of ML

Digital signal processing (DSP)

Computer vision

Natural language processing (NLP)

Other applications of ML

Using ML to build smarter iOS applications

Getting to know your data

Features

Types of features

Choosing a good set of features

Getting the dataset

Data preprocessing

Choosing a model

Types of ML algorithms

Supervised learning

Unsupervised learning

Reinforcement learning

Mathematical optimization – how learning works

Mobile versus server-side ML

Understanding mobile platform limitations

Summary

Bibliography

Classification – Decision Tree Learning

Machine learning toolbox

Prototyping the first machine learning app

Tools

Setting up a machine learning environment

IPython notebook crash course

Time to practice

Machine learning for extra-terrestrial life explorers

Loading the dataset

Exploratory data analysis

Data preprocessing

Converting categorical variables

Separating features from labels

One-hot encoding

Splitting the data

Decision trees everywhere

Training the decision tree classifier

Tree visualization

Making predictions

Evaluating accuracy

Tuning hyperparameters

Understanding model capacity trade-offs

How decision tree learning works

Building a tree automatically from data

Combinatorial entropy

Evaluating performance of the model with data

Precision, recall, and F1-score

K-fold cross-validation

Confusion matrix

Implementing first machine learning app in Swift

Introducing Core ML

Core ML features

Exporting the model for iOS

Ensemble learning random forest

Training the random forest

Random forest accuracy evaluation

Importing the Core ML model into an iOS project

Evaluating performance of the model on iOS

Calculating the confusion matrix

Decision tree learning pros and cons

Summary

K-Nearest Neighbors Classifier

Calculating the distance

DTW

Implementing DTW in Swift

Using instance-based models for classification and clustering

People motion recognition using inertial sensors

Understanding the KNN algorithm

Implementing KNN in Swift

Recognizing human motion using KNN

Cold start problem

Balanced dataset

Choosing a good k

Reasoning in high-dimensional spaces

KNN pros

KNN cons

Improving our solution

Probabilistic interpretation

More data sources

Smarter time series chunking

Hardware acceleration

Trees to speed up the inference

Utilizing state transitions

Summary

Bibliography

K-Means Clustering

Unsupervised learning

K-means clustering

Implementing k-means in Swift

Update step

Assignment step

Clustering objects on a map

Choosing the number of clusters

K-means clustering – problems

K-means++

Image segmentation using k-means

Summary

Association Rule Learning

Seeing association rules

Defining data structures

Using association measures to assess rules

Supporting association measures

Confidence association measures

Lift association measures

Conviction association measures

Decomposing the problem

Generating all possible rules

Finding frequent item sets

The Apriori algorithm

Implementing Apriori in Swift

Running Apriori

Running Apriori on real-world data

The pros and cons of Apriori

Building an adaptable user experience

Summary

Bibliography

Linear Regression and Gradient Descent

Understanding the regression task

Introducing simple linear regression

Fitting a regression line using the least squares method

Where to use GD and normal equation

Using gradient descent for function minimization

Forecasting the future with simple linear regression

Feature scaling

Feature standardization

Multiple linear regression

Implementing multiple linear regression in Swift

Gradient descent for multiple linear regression

Training multiple regression

Linear algebra operations

Feature-wise standardization

Normal equation for multiple linear regression

Understanding and overcoming the limitations of linear regression

Fixing linear regression problems with regularization

Ridge regression and Tikhonov regularization

LASSO regression

ElasticNet regression

Summary

Bibliography

Linear Classifier and Logistic Regression

Revisiting the classification task

Linear classifier

Logistic regression

Implementing logistic regression in Swift

The prediction part of logistic regression

Training the logistic regression

Cost function

Predicting user intents

Handling dates

Choosing the regression model for your problem

Bias-variance trade-off

Summary

Neural Networks

What are artificial NNs anyway?

Building the neuron

Non-linearity function

Step-like activation functions

Rectifier-like activation functions

Building the network

Building a neural layer in Swift

Using neurons to build logical functions

Implementing layers in Swift

Training the network

Vanishing gradient problem

Seeing biological analogies

Basic neural network subroutines (BNNS)

BNNS example

Summary

Convolutional Neural Networks

Understanding users emotions

Introducing computer vision problems

Introducing convolutional neural networks

Pooling operation

Convolution operation

Convolutions in CNNs

Building the network

Input layer

Convolutional layer

Fully-connected layers

Nonlinearity layers

Pooling layer

Regularization layers

Dropout

Batch normalization

Loss functions

Training the network

Training the CNN for facial expression recognition

Environment setup

Deep learning frameworks

Keras

Loading the data

Splitting the data

Data augmentation

Creating the network

Plotting the network structure

Training the network

Plotting loss

Making predictions

Saving the model in HDF5 format

Converting to Core ML format

Visualizing convolution filters

Deploying CNN to iOS

Summary

Bibliography

Natural Language Processing

NLP in the mobile development world

Word Association game

Python NLP libraries

Textual corpuses

Common NLP approaches and subtasks

Tokenization

Stemming

Lemmatization

Part-of-speech (POS) tagging

Named entity recognition (NER)

Removing stop words and punctuation

Distributional semantics hypothesis

Word vector representations

Autoencoder neural networks

Word2Vec

Word2Vec in Gensim

Vector space properties

iOS application

Chatbot anatomy

Voice input

NSLinguisticTagger and friends

Word2Vec on iOS

Text-to-speech output

UIReferenceLibraryViewController

Putting it all together

Word2Vec friends and relatives

Where to go from here?

Summary

Machine Learning Libraries

Machine learning and AI APIs

Libraries

General-purpose machine learning libraries

AIToolbox

BrainCore

Caffe

Caffe2

dlib

FANN

LearnKit

MLKit

Multilinear-math

MXNet

Shark

TensorFlow

tiny-dnn

Torch

YCML

Inference-only libraries

Keras

LibSVM

Scikit-learn

XGBoost

NLP libraries

Word2Vec

Twitter text

Speech recognition

TLSphinx

OpenEars

Computer vision

OpenCV

ccv

OpenFace

Tesseract

Low-level subroutine libraries

Eigen

fmincg-c

IntuneFeatures

SigmaSwiftStatistics

STEM

Swix

LibXtract

libLBFGS

NNPACK

Upsurge

YCMatrix

Choosing a deep learning framework

Summary

Optimizing Neural Networks for Mobile Devices

Delivering perfect user experience

Calculating the size of a convolutional neural network

Lossless compression

Compact CNN architectures

SqueezeNet

MobileNets

ShuffleNet

CondenseNet

Preventing a neural network from growing big

Lossy compression

Optimizing for inference

Network pruning

Weights quantization

Reducing precision

Other approaches

Facebook's approach in Caffe2

Knowledge distillation

Tools

An example of the network compression

Summary

Bibliography

Best Practices

Mobile machine learning project life cycle

Preparatory stage

Formulate the problem

Define the constraints

Research the existing approaches

Research the data

Make design choices

Prototype creation

Data preprocessing

Model training, evaluation, and selection

Field testing

Porting or deployment for a mobile platform

Production

Best practices

Benchmarking

Privacy and differential privacy

Debugging and visualization

Documentation

Machine learning gremlins

Data kobolds

Tough data

Biased data

Batch effects

Goblins of training

Product design ogres

Magical thinking

Cargo cult

Feedback loops

Uncanny valley effect

Recommended learning resources

Mathematical background

Machine learning

Computer vision

NLP

Summary

Preface

Machine learning, as a field, promises to bring increasing intelligence to software by helping us learn and analyze information efficiently and discover certain things that humans cannot. We'll start by developing lasting intuition about the fundamental machine learning concepts in the first section. We'll explore various supervised and unsupervised learning techniques in the second section. Then the third section will walk you through deep learning techniques with the help of common real-world cases. In the last section, we'll dive into hardcore topics such as model compression and GPU acceleration, and provide some recommendations to avoid common mistakes during machine learning application development. By the end of the book, you'll be able to develop intelligent applications written in Swift that can learn for themselves.

Who this book is for

This book is for iOS developers who wish to create intelligent iOS applications, and data science professionals who are interested in performing machine learning using Swift. Familiarity with some basic Swift programming is all you need to get started with this book.

What this book covers

Chapter 1, Getting Started with Machine Learning, teaches the main concepts of machine learning.

Chapter 2, Classification – Decision Tree Learning, builds our first machine learning application.

Chapter 3, K-Nearest Neighbors Classifier, continues exploring classification algorithms, and we learn about instance-based learning algorithms.

Chapter 4, K-Means Clustering, continues with instance-based algorithms, this time focusing on an unsupervised clustering task.

Chapter 5, Association Rule Learning, explores unsupervised learning more deeply. 

Chapter 6, Linear Regression and Gradient Descent, returns to supervised learning, but this time we switch our attention from non-parametric models, such as KNN and k-means, to parametric linear models.

Chapter 7, Linear Classifier and Logistic Regression, continues by building different, more complex models on top of linear regression: polynomial regression, regularized regression, and logistic regression.

Chapter 8, Neural Networks, implements our first neural network.

Chapter 9, Convolutional Neural Networks, continues NNs, but this time we focus on convolutional NNs, which are especially popular in the computer vision domain.

Chapter 10, Natural Language Processing, explores the amazing world of human natural language. We're also going to use neural networks to build several chatbots with different personalities.

Chapter 11, Machine Learning Libraries, overviews existing iOS-compatible libraries for machine learning. 

Chapter 12, Optimizing Neural Networks for Mobile Devices, talks about deep neural network deployment on mobile platforms.

Chapter 13, Best Practices, discusses a machine learning app's life cycle, common problems in AI projects, and how to solve them. 

To get the most out of this book

You will need the following software to be able to smoothly sail through this book:

Homebrew 1.3.8+

Python 2.7.x

pip 9.0.1+

Virtualenv 15.1.0+

IPython 5.4.1+

Jupyter 1.0.0+

SciPy 0.19.1+

NumPy 1.13.3+

Pandas 0.20.2+

Matplotlib 2.0.2+

Graphviz 0.8.2+

pydotplus 2.0.2+

scikit-learn 0.18.1+

coremltools 0.6.3+

Ruby (default macOS version)

Xcode 9.2+

Keras 2.0.6+ with TensorFlow 1.1.0+ backend

keras-vis 0.4.1+

NLTK 3.2.4+

Gensim 2.1.0+

OS required:

macOS High Sierra 10.13.3+

iOS 11+ or simulator

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

Log in or register at www.packtpub.com.

Select the SUPPORT tab.

Click on Code Downloads & Errata.

Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR/7-Zip for Windows

Zipeg/iZip/UnRarX for Mac

7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Machine-Learning-with-Swift. In case there's an update to the code, it will be updated on the existing GitHub repository. The author has also hosted the code bundle on his GitHub repository at: https://github.com/alexsosn/SwiftMLBook.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://www.packtpub.com/sites/default/files/downloads/MachineLearningwithSwift_ColorImages.pdf.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email [email protected] and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at [email protected].

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

Getting Started with Machine Learning

We live in exciting times. Artificial intelligence (AI) and machine learning (ML) have gone from obscure mathematical and science fiction topics to a part of mass culture. Google, Facebook, Microsoft, and others compete to be the first to give the world general AI. In November 2015, Google open sourced its ML framework, TensorFlow, which is suitable for running on supercomputers as well as smartphones, and which has since won a broad community. Shortly afterwards, other big companies followed its example. The best iOS app of 2016 (Apple Choice), the viral photo editor Prisma, owes its success entirely to a particular kind of ML algorithm: the convolutional neural network (CNN). These systems were invented back in the nineties but became popular only in the 2010s. Mobile devices only gained enough computational power to run them in 2014/2015. In fact, artificial neural networks became so important for practical applications that in iOS 10 Apple added native support for them in the Metal and Accelerate frameworks. Apple also opened Siri to third-party developers and introduced GameplayKit, a framework for adding AI capabilities to your computer games. In iOS 11, Apple introduced Core ML, a framework for running pre-trained models directly on the device, and the Vision framework for common computer vision tasks.

The best time to start learning about ML was 10 years ago. The next best time is right now.

In this chapter, we will cover the following topics:

Understanding what AI and ML are

Fundamental concepts of ML: model, dataset, and learning

Types of ML tasks

ML project life cycle

General purpose ML versus mobile ML

What is AI?

"What I cannot create, I do not understand."
– Richard Feynman

AI is a field of knowledge about building intelligent machines, whatever meaning you assign to the word intelligence. There are two different AI notions among researchers: strong AI and weak AI.

Strong AI, or artificial general intelligence (AGI), is a machine that is fully capable of imitating human-level intelligence, including consciousness, feelings, and mind. Presumably, it should be able to successfully apply its intelligence to any task. This type of AI is like a horizon: we always see it as a goal, but we are still not there, despite all our struggles. The AI effect plays a significant role here: the things that were yesterday considered a feature of strong AI are today taken for granted and seen as trivial. In the sixties, people believed that playing board games like chess was a characteristic of strong AI. Today, we have programs that outperform the best human chess players, but we are still far from strong AI. Our iPhones would probably count as AI from the perspective of the eighties: you can talk to them, and they can answer your questions and deliver information on any topic in just seconds. So, keeping strong AI as a distant goal, researchers focused on the things at hand and called them weak AI: systems that have some features of intelligence and can be applied to some narrow tasks. Among those tasks are automated reasoning, planning, creativity, communication with humans, perception of the surrounding world, robotics, and emotion simulation. We will touch on some of these tasks in this book, but mostly we will focus on ML, because this domain of AI has found a lot of practical applications on mobile platforms in recent years.

The motivation behind ML

Let's start with an analogy. There are two ways of learning an unfamiliar language:

Learning the language rules by heart, using textbooks, dictionaries, and so on. That's how college students usually do it.

Observing live language: by communicating with native speakers, reading books, and watching movies. That's how children do it.

In both cases, you build in your mind the language model, or, as some prefer to say, develop a sense of language.

In the first case, you are trying to build a logical system based on rules. In this case, you will encounter many problems: the exceptions to the rule, different dialects, borrowing from other languages, idioms, and lots more. Someone else, not you, derived and described for you the rules and structure of the language.

In the second case, you derive the same rules from the available data. You may not even be aware of the existence of these rules, but gradually adjust yourself to the hidden structure and understand the laws. You use your special brain cells called mirror neurons, trying to mimic native speakers. This ability is honed by millions of years of evolution. After some time, when facing the wrong word usage, you just feel that something is wrong but you can't tell immediately what exactly.

In any case, the next step is to apply the resulting language model in the real world. Results may differ. In the first case, you will experience difficulty every time you find the missing hyphen or comma, but may be able to get a job as a proofreader at a publishing house. In the second case, everything will depend on the quality, diversity, and amount of the data on which you were trained. Just imagine a person in the center of New York who studied English through Shakespeare. Would he be able to have a normal conversation with people around him?

Now we'll put the computer in place of the person in our example. The two approaches, in this case, represent two programming techniques. The first corresponds to writing ad hoc algorithms consisting of conditions, loops, and so on, by which a programmer expresses rules and structures. The second represents ML, in which case the computer itself identifies the underlying structure and rules based on the available data.

The analogy is deeper than it seems at first glance. For many tasks, building the algorithms directly is impossibly hard because of the variability in the real world. It may require the work of experts in the domain, who must describe all rules and edge cases explicitly. The resulting models can be fragile and rigid. On the other hand, the same task can be solved by allowing computers to figure out the rules on their own from a reasonable amount of data. An example of such a task is face recognition. It's virtually impossible to formalize face recognition in terms of conventional imperative algorithms and data structures. Only recently was the task successfully solved with the help of ML.
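To make the contrast between the two programming techniques concrete, here is a minimal sketch in Swift. It is not taken from the book's code bundle; the `Sample` type, the function names, and the toy numbers are all assumptions made for illustration. The first function encodes a rule that a programmer guessed by hand, while the second derives a comparable rule (a single threshold) from labeled examples.

```swift
// Hypothetical labeled example: one numeric feature and a Boolean label.
struct Sample {
    let value: Double
    let isPositive: Bool
}

// Technique 1: the programmer expresses the rule explicitly.
func handWrittenRule(_ value: Double) -> Bool {
    return value > 1.0 // threshold guessed by a human
}

// Technique 2: the rule is derived from the available data.
// Every observed value is tried as a candidate threshold, and the one
// that misclassifies the fewest training samples is kept.
func learnThreshold(from samples: [Sample]) -> Double {
    var bestThreshold = 0.0
    var fewestErrors = Int.max
    for candidate in samples.map({ $0.value }).sorted() {
        let errors = samples.filter { ($0.value >= candidate) != $0.isPositive }.count
        if errors < fewestErrors {
            fewestErrors = errors
            bestThreshold = candidate
        }
    }
    return bestThreshold
}

// Toy training data (made up for this sketch).
let trainingData = [
    Sample(value: 0.2, isPositive: false),
    Sample(value: 0.4, isPositive: false),
    Sample(value: 0.5, isPositive: false),
    Sample(value: 0.7, isPositive: true),
    Sample(value: 0.9, isPositive: true),
    Sample(value: 1.3, isPositive: true)
]

let threshold = learnThreshold(from: trainingData)

// The learned rule has the same shape as the hand-written one,
// but its parameter came from the data rather than from a guess.
func learnedRule(_ value: Double) -> Bool {
    return value >= threshold
}
```

On this toy data the hand-written rule rejects a reading of 0.8, while the learned rule accepts it, because the data places the boundary lower than the programmer guessed. Real tasks such as face recognition need far richer models than one threshold, but the division of labor is the same.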

What is ML?

ML is a subdomain of AI that has demonstrated significant progress over the last decade, and remains a hot research topic. It is a branch of knowledge concerned with building algorithms that can learn from data and improve themselves with regard to the tasks they perform. ML allows computers to deduce the algorithm for some task or to extract hidden patterns from data. ML is known by several different names in different research communities: predictive analytics, data mining, statistical learning, pattern recognition, and so on. One can argue that these terms have some subtle differences, but essentially, they all overlap to the extent that you can use the terminology interchangeably.

The abbreviation ML may refer to many things outside of the AI domain; for example, there is a functional programming language of this name. Nevertheless, the abbreviation is widely used in the names of libraries and conferences to refer to machine learning. Throughout this book, we also use it in this way.

ML is already everywhere around us. Search engines, targeted ads, face and voice recognition, recommender systems, spam filtering, self-driving cars, fraud detection in bank systems, credit scoring, automated video captioning, and machine translation: all these things are impossible to imagine without ML these days.

Over recent years, ML has owed its success to several factors:

The abundance of data in different forms (big data)

Accessible computational power and specialized hardware (clouds and GPUs)

The rise of open source and open access

Algorithmic advances

Any ML system includes three essential components: data, model, and task. The data is something you provide as an input to your model. A model is a type of mathematical function or computer program that performs the task. For instance, your emails are data, the spam filter is a model, and telling spam apart from non-spam is a task. The learning in ML stands for a process of adjusting your model to the data so that the model becomes better at its task. The obvious consequence of this setup is expressed in a piece of wisdom well known among statisticians: "Your model is only as good as your data".
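The spam filter example can be sketched in a few lines of Swift to show where data, model, task, and learning each live in code. This is a deliberately naive illustration, not a production spam filter and not code from the book: the `SpamFilter` type, its word-scoring scheme, and the four training messages are assumptions invented for this sketch.

```swift
// Data: labeled messages (hypothetical toy examples).
let trainingEmails: [(text: String, isSpam: Bool)] = [
    ("win a free prize now", true),
    ("free money click now", true),
    ("meeting moved to monday", false),
    ("lunch on monday with the team", false)
]

// Model: one integer score per word. A positive score means the word
// appeared more often in spam than in non-spam during training.
struct SpamFilter {
    private(set) var wordScores: [String: Int] = [:]

    // Learning: adjust the model's parameters (word scores) to the data.
    mutating func train(on emails: [(text: String, isSpam: Bool)]) {
        for email in emails {
            for word in email.text.lowercased().split(separator: " ") {
                wordScores[String(word), default: 0] += email.isSpam ? 1 : -1
            }
        }
    }

    // Task: tell spam apart from non-spam by summing the word scores.
    // Words never seen in training contribute nothing.
    func isSpam(_ text: String) -> Bool {
        let total = text.lowercased().split(separator: " ")
            .reduce(0) { $0 + (wordScores[String($1)] ?? 0) }
        return total > 0
    }
}

var filter = SpamFilter()
filter.train(on: trainingEmails)
```

After training, a call such as `filter.isSpam("free prize inside")` is judged only by the words the filter has already seen. If the training messages are few or unrepresentative, the scores will be too, which is exactly the sense in which a model is only as good as its data.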

Applications of ML

There are many domains where ML is an indispensable ingredient; some of them are robotics, bioinformatics, and recommender systems. While nothing prevents you from writing bioinformatics software in Swift for macOS or Linux, we will restrict our practical examples in this book to more mobile-friendly domains. The apparent reason for this is that, currently, iOS remains the primary target platform for most of the programmers who use Swift on a day-to-day basis.

For the sake of convenience, we'll roughly divide all ML applications of interest for mobile developers into three plus one areas, according to the datatypes they deal with most commonly:

Digital signal processing (sensor data, audio)

Computer vision (images, video)

Natural language processing (texts, speech)

Other applications and datatypes

Digital signal processing (DSP)

This category includes tasks where input data types are signals, time series, and audio. The sources of the data are sensors, HealthKit, microphone, wearable devices (for example, Apple Watch, or brain-computer interfaces), and IoT devices. Examples of ML problems here include:

Motion sensor data classification for activity recognition

Speech recognition and synthesis

Music recognition and synthesis

Biological signals (ECG, EEG, and hand tremor) analysis

We will build a motion recognition app in Chapter 3, K-Nearest Neighbors Classifier.

Strictly speaking, image processing is also a subdomain of DSP but let's not be too meticulous here.

Computer vision

Everything related to images and videos falls into this category. We will develop some computer vision apps in Chapter 9, Convolutional Neural Networks. Examples of computer vision tasks are:

Optical character recognition (OCR) and handwritten input

Face detection and recognition

Image and video captioning

Image segmentation

3D-scene reconstruction

Generative art (artistic style transfer, Deep Dream, and so on)

Natural language processing (NLP)

NLP is a branch of knowledge at the intersection of linguistics, computer science, and statistics. We'll talk about the most common NLP techniques in Chapter 10, Natural Language Processing. Applications of NLP include the following:

Automated translation, spelling, grammar, and style correction

Sentiment analysis

Spam detection/filtering

Document categorization

Chatbots and question answering systems

Other applications of ML

You can come up with many more applications that are hard to categorize. ML can be done on virtually any data if you have enough of it. Some peculiar data types are:

Spatial data: GPS location (Chapter 4, K-Means Clustering), coordinates of UI objects and touches

Tree-like structures: hierarchy of folders and files

Network-like data: occurrences of people together in your photos, or hyperlinks between web pages

Application logs and user in-app activity data (Chapter 5, Association Rule Learning)

System data: free disk space, battery level, and similar

Survey results

Using ML to build smarter iOS applications

As we know from press reports, Apple uses ML for fraud detection and to mine useful data from beta testing reports; however, these are not examples visible on our mobile devices. Your iPhone itself has a handful of ML models built into its operating system and into some native apps, where they help to perform a wide range of tasks. Some use cases are well known and prominent while others are inconspicuous. The most obvious examples are Siri speech recognition, natural language understanding, and voice generation. The Camera app uses face detection for focusing, and the Photos app uses face recognition to group photos of the same person into one album. Presenting the new iOS 10 in June 2016, Craig Federighi mentioned its predictive keyboard, which uses an LSTM algorithm (a type of recurrent neural network) to suggest the next word from the context, and also how Photos uses deep learning to recognize objects and classify scenes. iOS itself uses ML to extend battery life, provide contextual suggestions, match profiles from social networks and mail with the records in Contacts, and choose between internet connection options. On Apple Watch, ML models are employed to recognize user motion activity types and handwritten input.

Prior to iOS 10, Apple provided some ML APIs, such as speech or movement recognition, but only as black boxes, without the possibility to tune the models or to reuse them for other purposes. If you wanted to do something slightly different, like detecting a type of motion that is not predefined by Apple, you had to build your own models from scratch. In iOS 10, CNN building blocks were added to two frameworks at once: as a part of the Metal API, and as a sublibrary of the Accelerate framework. Also, the first actual ML algorithm was introduced to the iOS SDK: the decision tree learner in GameplayKit.

ML capabilities continued to expand with the release of iOS 11. At WWDC 2017, Apple presented the Core ML framework. It includes an API for running pre-trained models and is accompanied by tools for converting models trained with some popular ML frameworks to Apple's own format. Still, for now it doesn't provide the possibility of training models on a device, so your models can't be changed or updated at runtime.

Looking in the App Store for the terms artificial intelligence, deep learning, ML, and similar, you'll find a lot of applications, some of them quite successful. Here are several examples:

Google Translate is doing speech recognition and synthesis, OCR, handwriting recognition, and automated translation; some of this is done offline, and some online.

Duolingo validates pronunciation, recommends optimal study materials, and employs chatbots for language study.

Prisma, Artisto, and others turn photos into paintings using a neural artistic style transfer algorithm. Snapchat and Fabby use image segmentation, object tracking, and other computer vision techniques to enhance selfies. There are also applications for coloring black and white photos automatically.

Snapchat's video selfie filters use ML for real-time face tracking and modification.

Aipoly Vision helps blind people, saying aloud what it sees through the camera.

Several calorie counter apps recognize food through a camera. There are also similar apps to identify dog breeds, trees and trademarks.

Dozens of AI personal assistants and chatbots, with capabilities ranging from cow disease diagnostics to matchmaking and stock trading.

Predictive keyboards, spellcheckers, and auto correction, for instance, SwiftKey.

Games that learn from their users and games with evolving characters/units.

There are also news, mail, and other apps that adapt to users' habits and preferences using ML.

Brain-computer interfaces and fitness wearables use ML to recognize different user conditions, such as concentration, sleep phases, and so on. At least some of their companion mobile apps do ML.

Medical diagnostics and monitoring through mobile health applications. For example, OneRing monitors Parkinson's disease using the data from a wearable device.

All these applications are built upon extensive data collection and processing. Even if the application itself is not collecting the data, the model it uses was trained on some dataset, usually a large one. In the following section, we will discuss all things related to data in ML applications.

Getting to know your data

For many years, researchers argued about what is more important: data or algorithms. But now, it looks like the importance of data over algorithms is generally accepted among ML specialists. In most cases, we can assume that the one who has better data beats those with more advanced algorithms. Garbage in, garbage out: this rule holds true in ML more than anywhere else. To succeed in this domain, you need not only to have data, but also to know your data and know what to do with it.

ML datasets are usually composed of individual observations, called samples, cases, or data points. In the simplest case, each sample has several features.

Features

When we talk about features in the context of ML, what we mean is some characteristic property of the object or phenomenon we are investigating.

Other names for the same concept you'll see in some publications are explanatory variable, independent variable, and predictor.

Features are used to distinguish objects from each other and to measure the similarity between them.

For instance:

If the objects of our interest are books, features could be a title, page count, author's name, a year of publication, genre, and so on

If the objects of interest are images, features could be intensities of each pixel

If the objects are blog posts, features could be language, length, or presence of some terms

It's useful to imagine your data as a spreadsheet table. In this case, each sample (data point) would be a row, and each feature would be a column. For example, Table 1.1 shows a tiny dataset of books consisting of four samples where each has eight features.

Table 1.1: An example of an ML dataset (dummy books):

| Title | Author's name | Pages | Year | Genre | Average readers review score | Publisher | In stock |
|---|---|---|---|---|---|---|---|
| Learn ML in 21 Days | Machine Learner | 354 | 2018 | Sci-Fi | 3.9 | Untitled United | False |
| 101 Tips to Survive an Asteroid Impact | Enrique Drills | 124 | 2021 | Self-help | 4.7 | Vacuum Books | True |
| Sleeping on the Keyboard | Jessica's Cat | 458 | 2014 | Non-fiction | 3.5 | JhGJgh Inc. | True |
| Quantum Screwdriver: Heritage | Yessenia Purnima | 1550 | 2018 | Sci-Fi | 4.2 | Vacuum Books | True |

Types of features

In the books example, you can see several types of features:

Categorical or unordered: Title, author, genre, publisher. They are similar to enumeration without raw values in Swift, but with one difference: they have levels instead of cases. Important: you can't order them or say that one is bigger than another.

Binary: The presence or absence of something, just true or false. In our case, the In stock feature.

Real numbers: Page count, year, average reader's review score. These can be represented as float or double.

There are others, but these are by far the most common.
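As a rough illustration, the three feature types above could be modeled in Swift as follows. This is only a sketch: the `Genre` and `Book` types and their property names are hypothetical and were chosen to mirror the columns of Table 1.1.

```swift
// Hypothetical types for illustration; they are not part of any library.

// Categorical feature: an enumeration without raw values.
// Its cases play the role of levels and have no natural order.
enum Genre {
    case sciFi, selfHelp, nonFiction
}

struct Book {
    let title: String        // categorical
    let author: String       // categorical
    let pages: Double        // real number
    let year: Double         // real number
    let genre: Genre         // categorical
    let reviewScore: Double  // real number
    let publisher: String    // categorical
    let inStock: Bool        // binary
}

// Each element is one sample (one row of Table 1.1).
let books = [
    Book(title: "Learn ML in 21 Days", author: "Machine Learner",
         pages: 354, year: 2018, genre: .sciFi, reviewScore: 3.9,
         publisher: "Untitled United", inStock: false),
    Book(title: "101 Tips to Survive an Asteroid Impact", author: "Enrique Drills",
         pages: 124, year: 2021, genre: .selfHelp, reviewScore: 4.7,
         publisher: "Vacuum Books", inStock: true),
    Book(title: "Sleeping on the Keyboard", author: "Jessica's Cat",
         pages: 458, year: 2014, genre: .nonFiction, reviewScore: 3.5,
         publisher: "JhGJgh Inc.", inStock: true),
    Book(title: "Quantum Screwdriver: Heritage", author: "Yessenia Purnima",
         pages: 1550, year: 2018, genre: .sciFi, reviewScore: 4.2,
         publisher: "Vacuum Books", inStock: true),
]

// Categorical levels can be compared for equality, but not ordered.
let sameGenre = books[0].genre == books[3].genre
```

Note that `Genre` deliberately does not conform to `Comparable`: asking whether one genre is bigger than another has no meaning.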

The most common ML algorithms require the dataset to consist of a number of samples, where each sample is represented by a vector of real numbers (feature vector), and all samples have the same number of features. The simplest (but not the best) way of translating categorical features into real numbers is by replacing them with numerical codes (Table 1.2).

Table 1.2: Dummy books dataset after simple preprocessing:

| Title | Author's name | Pages | Year | Genre | Average readers review score | Publisher | In stock |
|---|---|---|---|---|---|---|---|
| 0.0 | 0.0 | 354.0 | 2018.0 | 0.0 | 3.9 | 0.0 | 0.0 |
| 1.0 | 1.0 | 124.0 | 2021.0 | 1.0 | 4.7 | 1.0 | 1.0 |
| 2.0 | 2.0 | 458.0 | 2014.0 | 2.0 | 3.5 | 2.0 | 1.0 |
| 3.0 | 3.0 | 1550.0 | 2018.0 | 0.0 | 4.2 | 1.0 | 1.0 |

This is an example of how your dataset may look before you feed it into your ML algorithm. Later, we will discuss the nuts and bolts of data preprocessing for specific applications.
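The simple preprocessing behind Table 1.2 can be sketched in a few lines of Swift. The helper `encodeLevels` is a hypothetical name, not a library function; it assumes that codes are assigned in order of first appearance, which happens to reproduce the codes in the table. Only four of the eight columns are used to keep the example short.

```swift
// Assigns each distinct level a numeric code in order of first appearance.
// `encodeLevels` is a hypothetical helper written for this illustration.
func encodeLevels(_ values: [String]) -> [Double] {
    var codes: [String: Double] = [:]
    return values.map { (value: String) -> Double in
        if let code = codes[value] {
            return code
        }
        let code = Double(codes.count)
        codes[value] = code
        return code
    }
}

// Four columns of the dummy books dataset.
let genres = ["Sci-Fi", "Self-help", "Non-fiction", "Sci-Fi"]
let publishers = ["Untitled United", "Vacuum Books", "JhGJgh Inc.", "Vacuum Books"]
let pages: [Double] = [354, 124, 458, 1550]
let inStock = [false, true, true, true]

let genreCodes = encodeLevels(genres)
let publisherCodes = encodeLevels(publishers)

// One feature vector per sample; all vectors have the same length.
let featureVectors: [[Double]] = genres.indices.map { i in
    [pages[i], genreCodes[i], publisherCodes[i], inStock[i] ? 1.0 : 0.0]
}
```

One reason this encoding is not the best choice: numeric codes suggest an order and a distance between levels (Non-fiction would be "twice" Self-help), which categorical features do not have.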

Choosing a good set of features

For ML purposes, it's necessary to choose a reasonable set of features, not too many and not too few:

If you have too few features, this information may not be sufficient for your model to achieve the required quality. In this case, you want to construct new ones from existing features, or extract more features from the raw data.

If you have too many features, you want to select only the most informative and discriminative ones, because the more features you have, the more complex your computations become.

How do you tell which features are most important? Sometimes common sense helps. For example, if you are building a model that recommends books for you, the genre and average rating of the book are perhaps more important features than the number of pages and year of publication. But what if your features are just pixels of a picture and you're building a face recognition system? For a black and white image of size 1024 x 768, we'd get 786,432 features. Which pixels are most important? In this case, you have to apply some algorithms to extract meaningful features. For example, in computer vision, edges, corners, and blobs are more informative features than raw pixels, so there are plenty of algorithms to extract them (Figure 1.1). By passing your image through some filters, you can get rid of unimportant information and reduce the number of features significantly: from hundreds of thousands to hundreds, or even tens. The techniques that help to select the most important subset of features are known as feature selection, while feature extraction techniques result in the creation of new features.

Figure 1.1: Edge detection is a common feature extraction technique in computer vision. You can still recognize the object on the right image, despite it containing significantly less information than the left one.
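To make the idea of feature extraction more concrete, here is a minimal sketch in Swift. It does not implement any of the edge detectors used in real computer vision systems; as a simplified stand-in, it takes the absolute difference between horizontally adjacent pixels. The tiny `image` array and the `horizontalEdges` function are hypothetical and exist only for this illustration.

```swift
// Number of raw pixel features in a 1024 x 768 single-channel image.
let rawFeatureCount = 1024 * 768

// A tiny hypothetical grayscale image: dark on the left, bright on the right.
let image: [[Double]] = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

// Absolute difference between horizontally adjacent pixels.
// Large values mark vertical edges; flat regions produce zeros.
func horizontalEdges(_ image: [[Double]]) -> [[Double]] {
    return image.map { (row: [Double]) -> [Double] in
        zip(row, row.dropFirst()).map { abs($1 - $0) }
    }
}

let edges = horizontalEdges(image)
```

Each row of `edges` is zero everywhere except at the boundary between the dark and bright regions, so the position of the edge is kept while the flat areas carry no information.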

Feature extraction, selection, and combination are a kind of art known as feature engineering. It requires not only hacking and statistical skills but also domain knowledge. We will see some feature engineering techniques while working on practical applications in the following chapters. We will also step into the exciting world of deep learning: a technique that gives a computer the ability to extract high-level abstract features from low-level features.

The number of features you have for each sample (or the length of the feature vector) is usually referred to as the dimensionality of the problem. Many problems are high-dimensional, with hundreds or even thousands of features. Even worse, some of those problems are sparse; that is, for each data point, most of the features are zero or missing. This is a common situation in recommender systems. For instance, imagine yourself building a dataset of movie ratings: the rows are movies, the columns are users, and each cell holds the rating that the user gave to the movie. The majority of the cells in the table will remain empty, as most of the users will never have watched most of the movies. The opposite situation is called dense, which is when most values are in place. Many problems in natural language processing and bioinformatics are high-dimensional, sparse, or both.
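The movie ratings example can be sketched in Swift with a dictionary that stores only the ratings that exist. The `RatingKey` type and all the numbers below are made up for illustration.

```swift
// Hypothetical key type: which user rated which movie.
struct RatingKey: Hashable {
    let movie: Int
    let user: Int
}

let movieCount = 4
let userCount = 5

// Sparse storage: only the ratings that actually exist are kept.
let ratings: [RatingKey: Double] = [
    RatingKey(movie: 0, user: 1): 4.0,
    RatingKey(movie: 2, user: 1): 3.5,
    RatingKey(movie: 3, user: 4): 5.0,
]

// Share of filled cells in the movies-by-users table.
let density = Double(ratings.count) / Double(movieCount * userCount)

// A missing rating is nil, which is not the same as a rating of zero.
let missing = ratings[RatingKey(movie: 1, user: 0)]
```

With 3 ratings in a table of 20 cells, only 15% of the cells are filled. A dense representation, such as a two-dimensional array, would have to store all 20 cells and pick some placeholder for the empty ones.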

Feature selection and extraction help to decrease the number of features without significant loss of information, so we also call them dimensionality reduction algorithms.
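As a minimal sketch of feature selection, the following Swift code drops features whose variance does not exceed a threshold. This variance-based filter is just one simple heuristic, chosen here for brevity; the function names and the sample data are hypothetical.

```swift
// Variance of one feature (column) across all samples.
func variance(_ values: [Double]) -> Double {
    let mean = values.reduce(0, +) / Double(values.count)
    let squaredDeviations = values.map { ($0 - mean) * ($0 - mean) }
    return squaredDeviations.reduce(0, +) / Double(values.count)
}

// Returns the indices of the columns whose variance exceeds `threshold`.
func selectFeatures(_ samples: [[Double]], threshold: Double) -> [Int] {
    guard let first = samples.first else { return [] }
    return first.indices.filter { column in
        variance(samples.map { $0[column] }) > threshold
    }
}

// Three samples with three features; the first feature is constant.
let samples: [[Double]] = [
    [1.0, 5.0, 0.0],
    [1.0, 7.0, 1.0],
    [1.0, 9.0, 0.0],
]

let kept = selectFeatures(samples, threshold: 0.0)

// The same samples with only the selected features.
let reduced = samples.map { row in kept.map { row[$0] } }
```

The constant first column cannot help to distinguish one sample from another, so removing it reduces the dimensionality from three to two without losing information.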

Getting the dataset

Datasets can be obtained from different sources. The ones important for us are:

Classical datasets such as Iris (botanical measurements of flowers composed by R. Fisher in 1936), MNIST (70,000 handwritten digits, 60,000 of them for training, published in 1998), Titanic (personal information of Titanic passengers from Encyclopedia Titanica and other sources), and others. Many classical datasets are available as part of Python and R ML packages. They represent some classical types of ML tasks and are useful for demonstrations of algorithms. So far, there is no similar library for Swift. Implementing such a library would be straightforward and is low-hanging fruit for anyone who wants to get some stars on GitHub.

Open and commercial dataset repositories. Many institutions release their data for everyone's needs under different licenses. You can use such data for training production models or while collecting your own dataset.

Some public dataset repositories include:

The UCI ML repository: https://archive.ics.uci.edu/ml/datasets.html

Kaggle datasets: https://www.kaggle.com/datasets

data.world, a social network for dataset sharing: https://data.world

To find more, visit the list of repositories at KDnuggets: http://www.kdnuggets.com/datasets/index.html. Alternatively, you'll find a list of datasets at Wikipedia: https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research.

Data collection (acquisition) is required if no existing data can help you to solve your problem.

This approach can be costly both in resources and time if you have to collect the data ad hoc; however, in many cases, you have data as a byproduct of some other process, and you can compose your dataset by extracting useful information from that data. For example, text corpora can be composed by crawling Wikipedia or news sites. iOS automatically collects some useful data. HealthKit is a unified database of users' health measurements. Core Motion lets you retrieve historical data on the user's motion activities. The ResearchKit framework provides standardized routines to assess the user's health conditions. The CareKit framework standardizes the polls. Also, in some cases, useful information can be obtained from app log mining.