Artificial intelligence (AI) is rapidly finding practical applications across a wide variety of industry verticals, and the Internet of Things (IoT) is one of them. Developers are looking for ways to make IoT devices smarter and to make users’ lives easier. With this AI cookbook, you’ll be able to implement smart analytics using IoT data to gain insights, predict outcomes, and make informed decisions. The book also covers advanced AI techniques that facilitate analytics and learning in various IoT applications.
Using a recipe-based approach, the book will take you through essential processes such as data collection, data analysis, modeling, statistics and monitoring, and deployment. You’ll use real-life datasets from smart homes, industrial IoT, and smart devices to train and evaluate simple to complex models and make predictions using trained models. Later chapters will take you through the key challenges faced while implementing machine learning, deep learning, and other AI techniques, such as natural language processing (NLP), computer vision, and embedded machine learning for building smart IoT systems. In addition to this, you’ll learn how to deploy models and improve their performance with ease.
By the end of this book, you’ll be able to package and deploy end-to-end AI apps and apply best practice solutions to common IoT problems.
Page count: 253
Year of publication: 2021
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Kunal Parikh
Publishing Product Manager: Devika Battike
Senior Editor: David Sugarman
Content Development Editor: Athikho Sapuni Rishana
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Rekha Nair
Production Designer: Nilesh Mohite
First published: March 2021
Production reference: 1040221
Published by Packt Publishing Ltd., Livery Place, 35 Livery Street, Birmingham B3 2PB, UK.
ISBN 978-1-83898-198-3
www.packt.com
Contributors
Michael Roshak is a cloud architect and strategist who has gained extensive subject matter expertise in enterprise cloud transformation programs and infrastructure modernization through designing and deploying cloud-oriented solutions and architectures. He is responsible for providing strategic advisory services for cloud adoption, consultative technical sales, and driving broad cloud services consumption with highly strategic accounts across multiple industries.
Va Barbosa is a software engineer with the Qiskit Community at IBM, focused on building open source tools and creating educational content for developers, researchers, students, and educators in the field of quantum computing. Previously, Va was a developer advocate with the Center for Open Source Data and AI Technologies, where he helped developers to discover and make use of data science and machine learning technologies. He is fueled by his passion to help others and guided by his enthusiasm for open source technology.
Title Page
Copyright and Credits
Artificial Intelligence for IoT Cookbook
Contributors
About the author
About the reviewer
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Get in touch
Reviews
Setting Up the IoT and AI Environment
Choosing a device
Dev kits
Manifold 2-C with NVIDIA TX2
The i.MX series
LattePanda
Raspberry Pi Class
Arduino
ESP8266
Setting up Databricks
Storing data
Parquet
Avro
Delta Lake
Setting up IoT Hub
Getting ready
How to do it...
How it works...
Setting up an IoT Edge device
Getting ready
How to do it...
Configuring an IoT Edge device (cloud side)
Configuring an IoT Edge device (device side)
How it works...
Deploying ML modules to Edge devices
Getting ready
How to do it...
How it works...
There's more...
Setting up Kafka
Getting ready
How to do it...
How it works...
There's more...
Installing ML libraries on Databricks
Getting ready
How to do it...
Importing TensorFlow
Installing PyTorch
Installing GraphX and GraphFrames
How it works...
Handling Data
Storing data for analysis using Delta Lake
Getting ready
How to do it...
How it works...
Data collection design
Getting ready
How to do it...
Variance
Z-Spikes
Min/max
Windowing
Getting ready
How to do it...
Tumbling
Hopping
Sliding
How it works...
Exploratory factor analysis
Getting ready
How to do it...
Visual exploration
Chart types
Redundant sensors
Sample co-variance and correlation
How it works...
There's more...
Implementing analytic queries in Mongo/hot path storage
Getting ready
How to do it...
How it works...
Ingesting IoT data into Spark
Getting ready
How to do it...
How it works...
Machine Learning for IoT
Analyzing chemical sensors with anomaly detection
Getting ready
How to do it...
How it works...
There's more...
Logistic regression with the IoMT
Getting ready
How to do it...
How it works...
There's more...
Classifying chemical sensors with decision trees
How to do it...
How it works...
There's more...
Simple predictive maintenance with XGBoost
Getting ready
How to do it...
How it works...
Detecting unsafe drivers
Getting ready
How to do it...
How it works...
There's more...
Face detection on constrained devices
Getting ready
How to do it...
How it works...
Deep Learning for Predictive Maintenance
Enhancing data using feature engineering
Getting ready
How to do it...
How it works...
There's more...
Using keras for fall detection
Getting ready
How to do it...
How it works...
There's more...
Implementing LSTM to predict device failure
Getting ready
How to do it...
How it works...
Deploying models to web services
Getting ready
How to do it...
How it works...
There's more...
Anomaly Detection
Using Z-Spikes on a Raspberry Pi and Sense HAT
Getting ready
How to do it...
How it works...
Using autoencoders to detect anomalies in labeled data
Getting ready
How to do it...
How it works...
There's more...
Using isolated forest for unlabeled datasets
Getting ready
How to do it...
How it works...
There's more...
Detecting time series anomalies with Luminol
Getting ready
How to do it...
How it works...
There's more...
Detecting seasonality-adjusted anomalies
Getting ready
How to do it...
How it works...
Detecting spikes with streaming analytics
Getting ready
How to do it...
How it works...
Detecting anomalies on the edge
Getting ready
How to do it...
How it works...
Computer Vision
Connecting cameras through OpenCV
Getting ready
How to do it...
How it works...
There's more...
Using Microsoft's custom vision to train and label your images
Getting ready
How to do it...
How it works...
Detecting faces with deep neural nets and Caffe
Getting ready
How to do it...
How it works...
Detecting objects using YOLO on Raspberry Pi 4
Getting ready
How to do it...
How it works...
Detecting objects using GPUs on NVIDIA Jetson Nano
Getting ready
How to do it...
How it works...
There's more...
Training vision with PyTorch on GPUs
Getting ready
How to do it...
How it works...
There's more...
NLP and Bots for Self-Ordering Kiosks
Wake word detection
Getting ready
How to do it...
How it works...
There's more...
Speech-to-text using the Microsoft Speech API
Getting ready
How to do it...
How it works...
Getting started with LUIS
Getting ready
How to do it...
How it works...
There's more...
Implementing smart bots
Getting ready
How to do it...
How it works...
There's more...
Creating a custom voice
Getting ready
How to do it...
How it works...
Enhancing bots with QnA Maker
Getting ready
How to do it...
How it works...
There's more...
Optimizing with Microcontrollers and Pipelines
Introduction to ESP32 with IoT
Getting ready
How to do it...
How it works...
There's more...
Implementing an ESP32 environment monitor
Getting ready
How to do it...
How it works...
There's more...
Optimizing hyperparameters
Getting ready
How to do it...
How it works...
Dealing with BOM changes
Getting ready
How to do it...
How it works...
There's more...
Building machine learning pipelines with sklearn
Getting ready
How to do it...
How it works...
There's more...
Streaming machine learning with Spark and Kafka
Getting ready
How to do it...
How it works...
There's more...
Enriching data using Kafka's KStreams and KTables
Getting ready
How to do it...
How it works...
There's more...
Deploying to the Edge
OTA updating MCUs
Getting ready
How to do it...
How it works...
There's more...
Deploying modules with IoT Edge
Getting ready
Setting up our Raspberry Pi
Coding setup
How to do it...
How it works...
There's more...
Offloading to the web with TensorFlow.js
Getting ready
How to do it...
How it works...
There's more...
Deploying mobile models
Getting ready
How to do it...
How it works...
Maintaining your fleet with device twins
Getting ready
How to do it...
How it works...
There's more...
Enabling distributed ML with fog computing
Getting ready
How to do it...
How it works...
There's more...
About Packt
Why subscribe?
Other Books You May Enjoy
Packt is searching for authors like you
Leave a review - let other readers know what you think
Preface
Artificial intelligence (AI) is rapidly finding practical applications across a wide variety of industry verticals, and the Internet of Things (IoT) is one of them. Developers are looking for ways to make IoT devices smarter and to make users' lives easier. With this AI cookbook, you'll learn how to implement smart analytics using IoT data to gain insights, predict outcomes, and make informed decisions. The book also covers advanced AI techniques that facilitate analytics and learning in various IoT applications.
Using a recipe-based approach, the book will take you through essential processes such as data collection, data analysis, modeling, statistics and monitoring, and deployment. You'll use real-life datasets from smart homes, industrial IoT, and smart devices to train and evaluate simple and complex models and make predictions using trained models. Later chapters will take you through the key challenges faced while implementing machine learning, deep learning, and other AI techniques such as natural language processing (NLP), computer vision, and embedded machine learning to build smart IoT systems. In addition to this, you'll learn how to deploy models and improve their performance with ease.
By the end of this book, you'll be able to package and deploy end-to-end AI apps and apply best practice solutions to common IoT problems.
If you're an IoT practitioner looking to incorporate AI techniques to build smart IoT solutions without having to trawl through a lot of AI theory, this AI IoT book is for you. Data scientists and AI developers who want to build IoT-focused AI solutions will also find this book useful. Knowledge of the Python programming language and basic IoT concepts is required to grasp the concepts covered in this AI book effectively.
Chapter 1, Setting Up the IoT and AI Environment, will focus on getting the right environment set up for success. You will learn how to choose a device that meets your needs for AI, whether that model needs to be on the edge or in the cloud. You will also learn how to securely communicate with modules within a device, other devices, or the cloud. Finally, you will set up a way to ingest data in the cloud and then set up Spark and AI tools to perform analysis of data, train models, and run machine learning models at scale.
Chapter 2, Handling Data, talks about the basics of ensuring that data in any format can be used by data scientists effectively.
Chapter 3, Machine Learning for IoT, will discuss using machine learning models such as logistic regression and decision trees to solve common IoT issues such as classifying medical results, detecting unsafe drivers, and classifying chemical readings.
Chapter 4, Deep Learning for Predictive Maintenance, will focus on various classification techniques to enable IoT devices to be smart devices.
Chapter 5, Anomaly Detection, will explain how anomaly detection can uncover issues that alarm detection does not classify, and how a device that is acting in an anomalous way might warrant sending out a repair worker to examine it.
Chapter 6, Computer Vision, will discuss implementing computer vision in the cloud as well as on edge devices such as NVIDIA Jetson Nano.
Chapter 7, NLP and Bots for Self-Ordering Kiosks, will discuss using NLP and bots to enable interaction with users ordering food at a restaurant kiosk.
Chapter 8, Optimizing with Microcontrollers and Pipelines, will discuss how reinforcement learning can be used with a smart traffic intersection to make traffic light decisions that decrease the wait time at traffic lights and allow traffic to flow better.
Chapter 9, Deploying to the Edge, will discuss various ways of applying pre-trained machine learning models to an edge device. This chapter will discuss IoT Edge in detail. Deploying is an important part of the AI pipeline. This chapter will also talk about deploying machine learning models to web applications and mobile using TensorFlow.js and ONNX.
Readers should have a basic understanding of software development. This book uses the Python, C, and Java languages. A basic understanding of how to install libraries and packages in these languages, as well as basic coding concepts such as arrays and loops, will be helpful. A few websites that can help you brush up on the basics of these languages are:
https://www.learnpython.org/
https://www.learnjavaonline.org/
https://www.learn-c.org/
To get the most out of this book, a basic understanding of machine learning principles will be beneficial. The hardware used in this book consists of off-the-shelf sensors and common IoT development kits, which can be purchased from sites such as Adafruit.com and Amazon.com. Most of the code is portable across devices. Device code written in Python can be easily ported to a variety of microprocessor-based boards such as a Raspberry Pi, NVIDIA Jetson, or LattePanda, or sometimes even a PC, while code written in C can be ported to a variety of microcontrollers such as the ESP32, ESP8266, and Arduino. Code written in Java can be ported to any Android device such as a tablet or phone.
This book uses Databricks for some of the experiments. Databricks has a free version at https://community.cloud.databricks.com.
If you are using the digital version of this book, we advise you to type the code yourself or access the code via the GitHub repository (link available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Artificial-Intelligence-for-IoT-Cookbook. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781838981983_ColorImages.pdf.
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "This will give you a list of the running containers. Then, open the /data folder."
A block of code is set as follows:
import numpy as np
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
from torch.utils.data.sampler import SubsetRandomSampler
Any command-line input or output is written as follows:
cd jetson-inference
mkdir build
cd build
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Click on the New project tile. Then, fill out the Create new project wizard."
In this book, you will find several headings that appear frequently (Getting ready, How to do it..., How it works..., There's more..., and See also).
To give clear instructions on how to complete a recipe, use these sections as follows:
Getting ready: This section tells you what to expect in the recipe and describes how to set up any software or any preliminary settings required for the recipe.
How to do it...: This section contains the steps required to follow the recipe.
How it works...: This section usually consists of a detailed explanation of what happened in the previous section.
There's more...: This section consists of additional information about the recipe in order to make you more knowledgeable about the recipe.
See also: This section provides helpful links to other useful information for the recipe.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
Setting Up the IoT and AI Environment
The Internet of Things (IoT) and artificial intelligence (AI) are having a dramatic impact on people's lives. Industries such as medicine are being revolutionized by wearable sensors that can monitor patients after they leave the hospital. Machine learning (ML) used on industrial devices is leading to better monitoring and less downtime with techniques such as anomaly detection, predictive maintenance, and prescriptive actions.
Building an IoT device capable of delivering results relies on gathering the right information. This book gives recipes that support the end-to-end IoT/ML life cycle. The next chapter has recipes for making sure that devices have the right sensors and that the data is the best it can be for ML outcomes, using tools such as exploratory factor analysis and data collection design.
This chapter will cover the following topics:
Choosing a device
Setting up Databricks
The following recipes will be covered:
Setting up IoT Hub
Setting up an IoT Edge device
Deploying ML modules to Edge devices
Setting up Kafka
Installing ML libraries on Databricks
Before starting with the classic recipe-by-recipe formatting of a cookbook, we'll start by covering a couple of base topics. Choosing the right hardware sets the stage for AI. Working with IoT means working with constraints. Using ML in the cloud is often a cost-effective solution as long as the data is small. Image, video, and sound data will often bog down networks. Worse yet, if you are using a cellular network, transmitting that data can be very expensive. The adage "there is no money in hardware" refers to the fact that most of the money made from IoT comes from the selling of services, not from producing expensive devices.
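To make the network constraint concrete, the following is a rough back-of-the-envelope sketch of how much data a single device uploads per month. The payload sizes and the one-message-per-minute rate are hypothetical values chosen for illustration, not measurements from this book.

```python
def monthly_megabytes(payload_bytes: int, messages_per_hour: int, days: int = 30) -> float:
    """Return the data one device uploads in `days` days, in megabytes (10**6 bytes)."""
    total_bytes = payload_bytes * messages_per_hour * 24 * days
    return total_bytes / 1_000_000


# Hypothetical payload sizes: a small JSON telemetry message vs. one JPEG frame.
TELEMETRY_BYTES = 200
IMAGE_BYTES = 100_000

telemetry_mb = monthly_megabytes(TELEMETRY_BYTES, messages_per_hour=60)
image_mb = monthly_megabytes(IMAGE_BYTES, messages_per_hour=60)

print(f"telemetry: {telemetry_mb:.2f} MB/month")
print(f"images:    {image_mb:.2f} MB/month")
```

Under these assumptions, sending images costs several hundred times the bandwidth of sending sensor readings, which is why heavier data types tend to push ML toward the Edge.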
Often, companies have their devices designed by electrical engineers. This is a cost-effective option. Custom boards do not have extra components, such as unnecessary Bluetooth or extra USB ports. However, predicting CPU and RAM requirements of an ML model at board design time is difficult. Starter kits can be useful tools to use until the hardware requirements are understood. The following boards are among the most widely adopted boards on the market:
Manifold 2-C with NVIDIA TX2
The i.MX series
LattePanda
Raspberry Pi Class
Arduino
ESP8266
Together, these boards form a rough scale of functionality. A Raspberry Pi Class device, for example, would struggle with custom vision applications but is well suited to audio or general ML applications. One determining factor for many data scientists is the programming language. The ESP8266 and Arduino need to be programmed in a low-level language such as C or C++, while devices that are Raspberry Pi Class or above can be programmed in most general-purpose languages.
Different devices come at different prices and functionalities. Devices that are Raspberry Pi Class or above can handle ML running on the Edge, reducing cloud cost but increasing the cost of the device. Deciding on whether you are billing your customers with a one-time price for the device or a subscription model may help you determine what type of device you need.
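One way to reason about this trade-off is a break-even calculation: how many months of cloud scoring fees does it take to pay for the more capable device? The sketch below uses the device prices quoted in this chapter ($35 for a Raspberry Pi Class device, $5 for an ESP8266). The function name and the monthly cloud cost of $1.50 per device are hypothetical.

```python
import math
from typing import Optional


def break_even_months(
    edge_device_cost: float,
    thin_device_cost: float,
    cloud_cost_per_month: float,
    edge_cloud_cost_per_month: float = 0.0,
) -> Optional[int]:
    """Months until the pricier Edge device pays for itself, or None if it never does."""
    extra_hardware = edge_device_cost - thin_device_cost
    monthly_saving = cloud_cost_per_month - edge_cloud_cost_per_month
    if extra_hardware <= 0:
        return 0  # the Edge device is not more expensive to begin with
    if monthly_saving <= 0:
        return None  # running on the Edge saves nothing, so there is no break-even
    return math.ceil(extra_hardware / monthly_saving)


# Hypothetical cloud scoring cost of $1.50 per device per month.
months = break_even_months(edge_device_cost=35, thin_device_cost=5, cloud_cost_per_month=1.5)
print(months)
```

If the device is sold for a one-time price, a short break-even period favors the Edge-capable board. With a subscription, the recurring cloud cost can be passed on to the customer instead.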
The NVIDIA Jetson is one of the best choices for running complex ML models, such as real-time video analysis, on the Edge. The NVIDIA Jetson comes with a built-in NVIDIA GPU. The Manifold version of the product is designed to fit onto a DJI drone and perform tasks such as image recognition or self-flying. The only downside to running NVIDIA Jetson is its use of the ARM64 architecture. ARM64 does not work well with TensorFlow, although other libraries such as PyTorch work fine on ARM64. The Manifold retails for $500, which makes it a high-price option, but this is often necessary when doing real-time ML on the Edge:
Price | Typical Models | Use Cases
$500 | Reinforcement learning, computer vision | Self-flying drones, robotics
The i.MX series of chips is open source and boasts impressive RAM and CPU capabilities. The open design helps engineers build boards easily. The i.MX series uses Freescale semiconductors, which have guaranteed production runs of 10 to 15 years, so a board design will remain stable for years. The i.MX 6 costs between $200 and $300 and can easily handle CPU-intensive tasks, such as object recognition in live streaming video:
Price | Typical Models | Use Cases
$200+ | Computer vision, NLP | Sentiment analysis, face recognition, object recognition, voice recognition
Single Board Computers (SBCs) such as the LattePanda are capable of running heavy sensor workloads. These devices can often run Windows or Linux. Like the i.MX series, they are capable of running object recognition on the device; however, the frame rate for recognizing objects can be slow:
Price | Typical Models | Use Cases
$100+ | Face detection, voice recognition, high-speed Edge models | Audio-enabled kiosk, high-frequency heart monitoring
Raspberry Pis are a standard starter kit for IoT. With their $35 price tag, they give you a lot of capability for the cost: they can run ML on the Edge with containers. They run Linux or Windows IoT Core, which makes it easy to plug in components, and they are backed by a community of developers building similar platform tools. Although Raspberry Pi Class devices are capable of handling most ML tasks, they tend to have performance issues on some of the more intensive tasks, such as video recognition:
Price | Typical Models | Use Cases
$35 | Decision trees, artificial neural networks, anomaly detection | Smart home, industrial IoT
At $15, the Arduino is a cost-effective solution. Arduino is supported by a large community and uses the Arduino language, a set of C/C++ functions. If you need to run ML models on an Arduino device, it is possible to package ML models built on popular frameworks such as PyTorch into the Embedded Learning Library (ELL). The ELL allows ML models to be deployed on the device without needing the overhead of a large operating system. Porting ML models using ELL or TensorFlow Lite can be challenging due to the limited memory and compute capacity of the Arduino:
Price | Typical Models | Use Cases
$15 | Linear regression | Sensor reading classification
At under $5, devices such as the ESP8266 and smaller represent a class of devices that take data in and transmit it to the cloud for ML evaluations. Besides being inexpensive, they are also often low-power devices, so they can be powered by solar power, network power, or a long-life battery:
Price | Typical Models | Use Cases
$5 or below | In the cloud only | In the cloud only
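The device tables above can be read as a simple lookup: given the kind of model you need to run on the device, pick the cheapest class that lists it. The sketch below encodes the prices and typical models from this section. The `DeviceClass` type and the `cheapest_device_for` helper are hypothetical names introduced for illustration, and the result is only a starting point for a hardware decision.

```python
from typing import NamedTuple, Optional


class DeviceClass(NamedTuple):
    name: str
    price_usd: float       # entry price quoted in this chapter
    workloads: frozenset   # typical on-device models, lower-cased


DEVICE_CLASSES = [
    DeviceClass("Manifold 2-C with NVIDIA TX2", 500,
                frozenset({"reinforcement learning", "computer vision"})),
    DeviceClass("i.MX series", 200, frozenset({"computer vision", "nlp"})),
    DeviceClass("LattePanda", 100,
                frozenset({"face detection", "voice recognition", "high-speed edge models"})),
    DeviceClass("Raspberry Pi Class", 35,
                frozenset({"decision trees", "artificial neural networks", "anomaly detection"})),
    DeviceClass("Arduino", 15, frozenset({"linear regression"})),
    # The ESP8266 class only forwards data to the cloud, so it runs no models on the device.
    DeviceClass("ESP8266", 5, frozenset()),
]


def cheapest_device_for(workload: str) -> Optional[DeviceClass]:
    """Return the lowest-priced device class listing `workload`, or None if none does."""
    candidates = [d for d in DEVICE_CLASSES if workload.lower() in d.workloads]
    return min(candidates, key=lambda d: d.price_usd) if candidates else None


choice = cheapest_device_for("computer vision")
print(choice.name if choice else "run this model in the cloud")
```

A workload that no device class lists returns `None`, which in practice means scoring the model in the cloud.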
Processing large amounts of data is often impractical on a single computer. That is where distributed systems such as Spark come in. Spark, which was created at UC Berkeley's AMPLab by the team that later founded Databricks, allows you to parallelize large workloads over many computers.
Spark's early development was motivated in part by the Netflix Prize, a competition that offered $1 million to the team that built the best recommendation engine. Spark uses distributed computing to wrangle large and complex datasets. There are distributed equivalents of popular Python libraries, such as Koalas, a distributed counterpart to pandas. Spark also supports analytics and feature engineering that require a large amount of compute and memory, such as graph theory problems. Spark has two modes: a batch mode for training models on large datasets and a streaming mode for scoring data in near real time.
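The idea behind this kind of parallelism can be sketched on a single machine: split the data into partitions, compute a partial result for each partition independently, and then merge the partial results. The example below is a local analogy that uses only Python's standard library. It is not the Spark API, the threads merely stand in for worker nodes, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_partitions):
    """Split records into num_partitions roughly equal slices."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def partial_stats(chunk):
    """Per-partition work: return (count, sum) so that results can be merged later."""
    return len(chunk), sum(chunk)


def distributed_mean(records, num_partitions=4):
    """Compute a mean by merging per-partition counts and sums."""
    if not records:
        raise ValueError("records must not be empty")
    chunks = partition(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_sum / total_count


# Synthetic temperature readings, used here only to exercise the functions.
readings = [20.0 + (i % 10) * 0.5 for i in range(1000)]
print(distributed_mean(readings))
```

The key design point is that each partition returns a mergeable summary (a count and a sum) rather than a mean, since means of unequal partitions cannot simply be averaged.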
IoT data tends to be large and imbalanced. A device may have 10 years of data showing it is running in normal conditions and only a few records showing it needs to be shut down immediately to prevent damage. The value of Databricks in IoT is twofold. The first is working with data and training models. Working with data at the terabyte and petabyte scale can overwhelm a single machine. Databricks solves this with its ability to scale out. The second is its streaming capabilities. ML models can be run in the cloud in near real time. Messages can then be pushed back down to the device.
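A common way to compensate for this kind of imbalance during training is to weight each class inversely to its frequency. The sketch below uses the heuristic `n_samples / (n_classes * class_count)`, which is also what scikit-learn's `class_weight="balanced"` option computes. The label names and record counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical telemetry labels: almost everything is normal, with very few failures.
labels = ["normal"] * 9_995 + ["failure"] * 5
weights = balanced_class_weights(labels)

for label, weight in weights.items():
    print(f"{label}: {weight:.4f}")
```

With these weights, the handful of failure records contributes as much to the training loss as the thousands of normal ones.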
Setting up Databricks is fairly straightforward. You can either go to your cloud provider and sign up for an account in the portal or sign up for the free community edition. If you are taking your product to production, then you should definitely sign up with Azure, AWS, or Google Cloud.
IoT and ML are fundamentally a big data problem. A device may send telemetry for years before it sends telemetry that would indicate an issue with the device. Searching through millions or billions of records to find the few records that are needed can be challenging from a data management perspective. Therefore, optimal data storage is key.
Today, there are tools that make it easy to work with large amounts of data. Keep in mind, though, that how data is stored at scale largely determines how easy large datasets are to work with.
Working with the type of large datasets that come from IoT devices can be prohibitively expensive for many companies. Storing data in Delta Lake, for example, can give the user a 340-times performance boost over accessing the same data as JSON. The next three sections will introduce three storage formats that can cut a data analytics job down from weeks to hours.
Parquet is one of the most common file formats in big data. Parquet's columnar storage format allows it to store highly compressed data. Its advantage is that it takes up less space on the hard disk and takes up less network bandwidth, making it ideal for loading into a DataFrame. Parquet ingestion into Spark has been benchmarked at 34 times the speed of JSON.
The Avro format is a popular storage format for IoT. While it does not have the high compression ratio that Parquet does, it is less computationally expensive to write because it uses a row-oriented storage layout. Avro is a common format for streaming data from sources such as IoT Hub or Kafka.
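The difference between a row-oriented layout (as in Avro) and a columnar layout (as in Parquet) can be illustrated with plain Python structures. The sketch below is only an analogy: it uses JSON text instead of the real binary formats, and the records and helper names are hypothetical. It shows that a columnar layout stores each field name once and lets a query read a single column without touching the others.

```python
import json

# Hypothetical telemetry records in a row-oriented layout (one dict per message).
rows = [
    {"device_id": f"pump-{i % 3}", "temperature": 20.0 + i % 5, "status": "ok"}
    for i in range(100)
]


def to_columns(rows):
    """Pivot row-oriented records into one list per field (columnar layout)."""
    columns = {key: [] for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            columns[key].append(value)
    return columns


def to_rows(columns):
    """Pivot a columnar layout back into row-oriented records."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


columns = to_columns(rows)

# Reading one field is a single lookup in the columnar layout.
mean_temperature = sum(columns["temperature"]) / len(columns["temperature"])

print(len(json.dumps(rows)), "characters row-oriented")
print(len(json.dumps(columns)), "characters columnar")
```

Appending a new message is cheaper in the row layout, because it touches one record rather than every column, which matches the trade-off described for streaming workloads.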
Delta Lake is an open source project released by Databricks in 2019. It stores files in Parquet. In addition, it is able to keep track of data check-ins, enabling the data scientist to look at data as it existed at a given time. This can be useful when trying to determine why accuracy in a particular ML model drifted. It also keeps metadata about the data, giving it a 10-times performance increase over standard Parquet for analytics workloads.
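The idea of looking at data as it existed at a given time can be sketched with a toy append-only table that records every commit. This is a conceptual illustration only: the `VersionedTable` class is hypothetical and is not the Delta Lake API, which keeps a transaction log alongside Parquet files.

```python
class VersionedTable:
    """Toy append-only table that keeps every commit so old versions stay readable."""

    def __init__(self):
        self._commits = []  # each entry holds the rows added by one commit

    def commit(self, rows):
        """Append rows as a new commit and return its version number."""
        self._commits.append(list(rows))
        return len(self._commits) - 1

    def read(self, version=None):
        """Return all rows as of `version` (defaults to the latest version)."""
        if not self._commits:
            return []
        latest = len(self._commits) - 1
        if version is None:
            version = latest
        if not 0 <= version <= latest:
            raise ValueError(f"version must be between 0 and {latest}")
        return [row for commit in self._commits[: version + 1] for row in commit]


table = VersionedTable()
v0 = table.commit([{"device_id": "pump-1", "temperature": 20.5}])
v1 = table.commit([{"device_id": "pump-1", "temperature": 95.0}])

print(len(table.read(version=v0)), "row(s) at version", v0)
print(len(table.read()), "row(s) at the latest version")
```

Reading the table at the version a model was trained on, and again at the latest version, is one way to check whether a shift in the data explains a drop in accuracy.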
Now that we have covered choosing a device and setting up Databricks, the rest of this chapter will follow a modular, recipe-based format.
