Description

Federated learning (FL) is a paradigm-shifting technology in AI that enables and accelerates machine learning (ML) while keeping private data where it is. It has become a must-have solution for many enterprise industries, making it a critical part of your learning journey. This book helps you get to grips with the building blocks of FL and shows, with solid coding examples, how its components work and interact with each other.
FL is more than just aggregating collected ML models and sending them back to the distributed agents. This book teaches you the essential basics of FL and shows you how to design distributed systems and learning mechanisms carefully, so that the dispersed learning processes stay synchronized and the locally trained ML models are synthesized consistently. This way, you’ll be able to create a sustainable and resilient FL system that can function continuously in real-world operations. The book goes further than simply outlining FL’s conceptual framework or theory, as most research-oriented literature does.
By the end of this book, you’ll have an in-depth understanding of the FL system design and implementation basics and be able to create an FL system and applications that can be deployed to various local and cloud environments.




Federated Learning with Python

Design and implement a federated learning system and develop applications using existing frameworks

Kiyoshi Nakayama, PhD

George Jeno

BIRMINGHAM—MUMBAI

Federated Learning with Python

Copyright © 2022 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Publishing Product Manager: Dinesh Chaudhary

Senior Editor: Nathanya Dias

Content Development Editor: Shreya Moharir

Technical Editor: Devanshi Ayare

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Manju Arasan

Production Designer: Roshan Kawale

Marketing Coordinator: Shifa Ansari

First published: October 2022

Production reference: 1141022

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80324-710-6

www.packt.com

Acknowledgments

We would like to thank Dr. Norikazu Furukawa for contributing to Chapter 1, Challenges in Big Data and Traditional AI, and Anthony Maddalone for contributing to Chapter 9, Case Studies with Key Use Cases of Federated Learning Applications, with their great insight into current trends, challenges, and ongoing efforts in the machine learning field and also its future direction. We also thank Dr. Genya Ishigaki for his contribution to the code of the GitHub repository used throughout this book. We acknowledge the contribution of Dr. Jose Barreiros to the content related to the robotics use case.

Contributors

About the authors

Kiyoshi Nakayama, PhD, is the founder and CEO of TieSet Inc., which leads the development and dissemination of one of the most advanced distributed and federated learning platforms in the world. Before founding TieSet, he was a research scientist at NEC Laboratories America, renowned for its world-class machine learning research group. He was also a postdoctoral researcher at Fujitsu Laboratories of America, where he implemented a distributed system for smart energy. He has published several international articles and patents and has received the best paper award twice in his career. Kiyoshi received his PhD in computer science from the University of California, Irvine.

George Jeno is a co-founder of TieSet Inc. and has been a tech lead for the development of the STADLE federated learning platform. He has a deep understanding of machine learning theory and system architecture design, and he has leveraged this knowledge to research new algorithms and applications for distributed and federated learning. He holds a master’s degree in computer science (with a specialization in machine learning) from Georgia Tech.

About the reviewer

Sougata Pal is a passionate technology specialist working as an enterprise architect, focusing on software architecture design, application scalability management, and team building and management. With over 15 years of experience, Sougata has worked with start-ups and large-scale enterprises to develop their business application infrastructures and extend their reach to customers. A contributor to various open source projects on GitHub, Sougata has spent the last couple of years exploring federated learning and cybersecurity algorithms, applying FL concepts to improve the performance of cybersecurity processes.

Table of Contents

Preface

Part 1 Federated Learning – Conceptual Foundations

1

Challenges in Big Data and Traditional AI

Understanding the nature of big data

Definition of big data

Big data now

Triple-A mindset for big data

Data privacy as a bottleneck

Risks in handling private data

Increased data protection regulations

From privacy by design to data minimalism

Impacts of training data and model bias

Expensive training of big data

Model bias and training data

Model drift and performance degradation

How models can stop working

Continuous monitoring – the price of letting causation go

FL as the main solution for data problems

Summary

Further reading

2

What Is Federated Learning?

Understanding the current state of ML

What is a model?

ML – automating the model creation process

Deep learning

Distributed learning nature – toward scalable AI

Distributed computing

Distributed ML

Edge inference

Edge training

Understanding FL

Defining FL

The FL process

FL system considerations

Security for FL systems

Decentralized FL and blockchain

Summary

Further reading

3

Workings of the Federated Learning System

FL system architecture

Cluster aggregators

Distributed agents

Database servers

Intermediate servers for low computational agent devices

Understanding the FL system flow – from initialization to continuous operation

Initialization of the database, aggregator, and agent

Initial model upload process by initial agent

Overall FL cycle and process of the FL system

Synchronous and asynchronous FL

The aggregator-side FL cycle and process

The agent-side local retraining cycle and process

Model interpretation based on deviation from baseline outputs

Basics of model aggregation

What exactly does it mean to aggregate models?

FedAvg – Federated averaging

Furthering scalability with horizontal design

Horizontal design with semi-global model

Distributed database

Asynchronous agent participation in a multiple-aggregator scenario

Semi-global model synthesis

Summary

Further reading

Part 2 The Design and Implementation of the Federated Learning System

4

Federated Learning Server Implementation with Python

Technical requirements

Main software components of the aggregator and database

Aggregator-side codes

lib/util codes

Database-side code

Toward the configuration of the aggregator

Implementing FL server-side functionalities

Importing libraries for the FL server

Defining the FL Server class

Initializing the FL server

Registration function of agents

The server for handling messages from local agents

The global model synthesis routine

Functions to send the global models to the agents

Functions to push the local and global models to the database

Maintaining models for aggregation with the state manager

Importing the libraries of the state manager

Defining the state manager class

Initializing the state manager

Initializing a global model

Checking the aggregation criteria

Buffering the local models

Clearing the saved models

Adding agents

Incrementing the FL round

Aggregating local models

Importing the libraries for the aggregator

Defining and initializing the aggregator class

Defining the aggregate_local_models function

The FedAvg function

Running the FL server

Implementing and running the database server

Toward the configuration of the database

Defining the database server

Defining the database with SQLite

Running the database server

Potential enhancements to the FL server

Redesigning the database

Automating the registry of an initial model

Performance metrics for local and global models

Fine-tuned aggregation

Summary

5

Federated Learning Client-Side Implementation

Technical requirements

An overview of FL client-side components

Distributed agent-side code

Configuration of an agent

Implementing FL client-side main functionalities

Importing libraries for an agent

Defining the Client class

Initializing the client

Agent participation in an FL cycle

Model exchange synchronization

Push and polling implementation

Designing FL client libraries

Starting FL client core threads

Saving global models

Manipulating client state

Sending local models to aggregator

Waiting for global models from an aggregator

Local ML engine integration into an FL system

Importing libraries for a local ML engine

Defining the ML models, training, and test functions

Integration of client libraries into your local ML engine

An example of integrating image classification into an FL system

Integration of client libraries into the IC example

Summary

6

Running the Federated Learning System and Analyzing the Results

Technical requirements

Configuring and running the FL system

Installing the FL environment

Configuring the FL system with JSON files for each component

Running the database and aggregator on the FL server

Running a minimal example with the FL client

Data and database folders

Databases with SQLite

Understanding what happens when the minimal example runs

Running just one minimal agent

Running two minimal agents

Running image classification and analyzing the results

Preparing the CIFAR-10 dataset

The ML model used for FL with image classification

How to run the image classification example with CNN

Evaluation of running the image classification with CNN

Running five agents

Summary

7

Model Aggregation

Technical requirements

Revisiting aggregation

Understanding FedAvg

Dataset distributions

Computational power distributions

Protecting against adversarial agents

Modifying aggregation for non-ideal cases

Handling heterogeneous computational power

Adversarial agents

Non-IID datasets

Summary

Part 3 Moving Toward the Production of Federated Learning Applications

8

Introducing Existing Federated Learning Frameworks

Technical requirements

TensorFlow Federated

OpenFL

IBM FL

Flower

STADLE

Introduction to FL frameworks

Flower

TensorFlow Federated (TFF)

OpenFL

IBM FL

STADLE

PySyft

Example – the federated training of an NLP model

Defining the sentiment analysis model

Creating the data loader

Training the model

Adopting an FL training approach

Integrating TensorFlow Federated for SST-2

Integrating OpenFL for SST-2

Integrating IBM FL for SST-2

Integrating Flower for SST-2

Integrating STADLE for SST-2

Example – the federated training of an image classification model on non-IID data

Skewing the CIFAR-10 dataset

Integrating OpenFL for CIFAR-10

Integrating IBM FL for CIFAR-10

Integrating Flower for CIFAR-10

Integrating STADLE for CIFAR-10

Summary

9

Case Studies with Key Use Cases of Federated Learning Applications

Applying FL to the healthcare sector

Challenges in healthcare

Medical imaging

Drug discovery

EHRs

Applying FL to the financial sector

Anti-Money Laundering (AML)

Proposed solutions to the existing AML approach

Demo of FL in the AML space

Benefits of FL for risk detection systems

FL meets edge computing

Edge computing with IoT over 5G

Edge FL example – object detection

Making autonomous driving happen with FL

Applying FL to robotics

Moving toward the Internet of Intelligence

Introducing the IoFT

Understanding the role of FL in Web 3.0

Applying FL to distributed learning for big data

Summary

References

Further reading

10

Future Trends and Developments

Looking at future AI trends

The limitation of centralized ML

Revisiting the benefits of FL

Toward distributed learning for further privacy and training efficiency

Ongoing research and developments in FL

Exploring various FL types and approaches

Understanding enhanced distributed learning frameworks with FL

Journeying on to collective intelligence

Intelligence-centric era with collective intelligence

Internet of Intelligence

Crowdsourced learning with FL

Summary

Further reading

Appendix: Exploring Internal Libraries

Technical requirements

Overview of the internal libraries for the FL system

states.py

communication_handler.py

data_struc.py

helpers.py

messengers.py

Enumeration classes for implementing the FL system

Importing libraries to define the enumeration classes

IDPrefix defining the FL system components

Client state classes

List of classes defining the types of ML models and messages

List of state classes defining message location

Understanding communication handler functionalities

Importing libraries for the communication handler

Functions of the communication handler

Understanding the data structure handler class

Importing libraries for the data structure handler

The LimitedDict class

Understanding helper and supporting libraries

Importing libraries for helper libraries

Functions of the helper library

Messengers to generate communication payloads

Importing libraries for messengers

Functions of messengers

Summary

Index

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Download a free PDF copy of this book

Preface

Federated learning (FL) is becoming a paradigm-shifting technology in AI: with an FL framework, it is the machine learning model, not the data itself, that moves across the internet for the intelligence to continuously evolve and grow. FL is therefore called a model-centric approach, in contrast to the traditional data-centric approach, and is considered a game-changing technology. The model-centric approach can create an intelligence-centric platform that pioneers a wisdom-driven world.

By adopting FL, you can overcome the challenges that big data AI has been facing for a long time, such as data privacy, training cost and efficiency, and delay in the delivery of the most updated intelligence. However, FL is not a magic solution that resolves all the issues in big data simply by aggregating machine learning models blindly. We need to design the distributed systems and learning mechanisms very carefully to synchronize all the distributed learning processes and synthesize all the locally trained machine learning models consistently. That way, we can create a sustainable and resilient FL system that can continuously function in real-world operation at scale.

Therefore, this book goes beyond describing the conceptual and theoretical aspects of FL, as is done in the research projects with simulators or prototypes that have been introduced in most of the literature in this field. Rather, you will learn the entire set of design and implementation principles by looking into the code of a simplified federated learning system and validating the workings and results of the framework.

By the end of this book, you will create your first application based on federated learning that can be installed and tested in various settings in both local and cloud environments.

Who this book is for

This book is for machine learning engineers, data scientists, and AI enthusiasts who want to learn about creating machine learning applications empowered by FL. You will need basic knowledge of Python programming and machine learning concepts to get started with this book.

What this book covers

Chapter 1, Challenges in Big Data and Traditional AI, orients you to the current issues with big data systems and the traditional centralized machine learning approach, such as data privacy, bias and silos, model drift, and performance degradation, and to how FL can resolve those problems. This will prepare you to dive deeper into the design and implementation of the FL system.

Chapter 2, What Is Federated Learning?, continues the introduction to FL and will help you understand the FL concept with the current trend of distributed learning as well as machine learning basics. You will also learn about the benefits of FL as a model-centric machine learning approach in terms of data privacy and security as well as efficiency and scalability.

Chapter 3, Workings of the Federated Learning System, gives you a solid understanding of how the FL system will work and interact among distributed components within the FL system itself. You will learn about the fundamental architecture of the FL system together with its procedure flow, state transition, and sequence of messaging toward the continuous operation of the FL system.

Chapter 4, Federated Learning Server Implementation with Python, guides you through the implementation principles of the FL server-side system, including the database server and aggregator modules. The chapter also covers communication among the FL components and model aggregation, as well as how to manage the states of the distributed learning agents and of the local and global models.
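
To give a rough preview of the server-side logic covered in that chapter, the sketch below buffers incoming local models, checks a simple aggregation criterion (a minimum number of models per round), and resets the state for the next round. The class shape and names are illustrative assumptions only, not the book's actual implementation.

# Illustrative sketch of server-side round handling: buffer local models,
# check an aggregation criterion, and reset for the next round. The class
# and its names are assumptions for explanation only.
class SimpleStateManager:
    def __init__(self, agents_per_round):
        self.agents_per_round = agents_per_round
        self.buffered_local_models = []   # local models received this round
        self.round = 0

    def buffer_local_model(self, model):
        self.buffered_local_models.append(model)

    def ready_to_aggregate(self):
        # Aggregation criterion: enough local models have arrived
        return len(self.buffered_local_models) >= self.agents_per_round

    def finish_round(self):
        self.buffered_local_models.clear()  # clear the saved models
        self.round += 1                     # move on to the next FL round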

Chapter 5, Federated Learning Client-Side Implementation, describes the FL client-side functionalities as well as libraries that can be used by local machine learning engines and processes. The core client-side functionalities include registration of a distributed learning agent with the FL platform, global model receipt, and local model upload and sharing.
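
As a rough preview of that client-side flow, the following sketch shows an agent registering with the FL platform, waiting for a global model, retraining locally, and sending its local model back to the aggregator. All names here are placeholders for illustration; the actual client libraries are developed in the chapter itself.

# Illustrative sketch of an agent's participation in an FL cycle.
# The client's method names (register, wait_for_global_model,
# send_local_model) are placeholders, not the library defined in Chapter 5.
def run_agent(client, train_fn, num_rounds):
    client.register()                                  # join the FL platform
    for _ in range(num_rounds):
        global_model = client.wait_for_global_model()  # receive the global model
        local_model = train_fn(global_model)           # retrain on local data
        client.send_local_model(local_model)           # upload the local model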

Chapter 6, Running the Federated Learning System and Analyzing the Results, will help you run a simplified FL framework to understand the behavior of the FL system and procedure as well as the most standard model aggregation method of federated averaging. You will also be able to validate the outcome of running the FL framework by analyzing the results of two example test cases.

Chapter 7, Model Aggregation, is an important chapter to understand model aggregation, which is the foundation of FL. You will understand how different characterizations of an FL scenario call for various aggregation methods and should have an idea of how these algorithms can actually be implemented.
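
To make the idea concrete, here is a minimal sketch of federated averaging (FedAvg): each layer of the global model is the data-size-weighted mean of the corresponding local model layers. Representing models as lists of NumPy arrays, and the function name fed_avg, are illustrative assumptions rather than the repository's actual API.

import numpy as np

def fed_avg(local_models, sample_counts):
    # local_models: one model per agent, each a list of NumPy arrays (one per layer)
    # sample_counts: number of local training samples each agent used
    weights = np.array(sample_counts, dtype=float)
    weights /= weights.sum()                 # each agent's share of the total data
    global_model = []
    for i in range(len(local_models[0])):
        # Weighted sum of layer i across all agents
        layer = sum(w * m[i] for w, m in zip(weights, local_models))
        global_model.append(layer)
    return global_model

# Example: two agents, each with a single 2x2 weight layer
agent_a = [np.array([[1.0, 2.0], [3.0, 4.0]])]
agent_b = [np.array([[3.0, 4.0], [5.0, 6.0]])]
print(fed_avg([agent_a, agent_b], sample_counts=[100, 300]))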

Chapter 8, Introducing Existing Federated Learning Frameworks, explains the existing FL projects and frameworks, such as PySyft, TFF, Flower, OpenFL, and STADLE, and their APIs. There are many useful FL projects with different design philosophies, and you will understand the features and differences among those frameworks.

Chapter 9, Case Studies with Key Use Cases of Federated Learning Applications, introduces some of the major use cases of FL in different industries. You’ll become familiar with some of the applications of FL in various fields, such as medical and healthcare, the financial sector, edge computing, the Internet of Things, and distributed learning for big data, in which FL has shown significant potential to overcome many important technological challenges.

Chapter 10, Future Trends and Developments, describes the future direction of AI technologies driven by ongoing research and development of FL. You will be introduced to the new perspective of the Internet of Intelligence and wisdom-centric platforms. Thus, you will be ready to welcome the world of collective intelligence.

Appendix, Exploring Internal Libraries, provides an overview of internal libraries, including enumeration classes for implementing the FL systems, communication protocol, data structure handler, and helper and supporting functions.

To get the most out of this book

You will need Python version 3.7+ installed on your computer. To create a virtual environment and easily run the code examples in the book, it is recommended to install Anaconda on macOS or Linux.

Software/hardware covered in the book: Python 3.7+, Anaconda environment, GitHub

Operating system requirements: macOS or Linux

You can install the server-side code provided in the GitHub repo in any cloud environment, such as Amazon Web Services (AWS) or Google Cloud Platform (GCP), with proper security settings to set up distributed learning environments.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Federated-Learning-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Please also check the repository for the most updated code at https://github.com/tie-set/simple-fl.

Note

You can use the code files for personal or educational purposes. Please note that we do not support deployment for commercial use and will not be responsible for any errors, issues, or damages caused by using the code.

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/qh1su.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The server code imports StateManager and Aggregator for the FL processes.”

A block of code is set as follows:

import tensorflow as tf
from tensorflow import keras
from sst_model import SSTModel

Any command-line input or output is written as follows:

fx envoy start -n envoy_1 --disable-tls --envoy-config-path envoy_config_1.yaml -dh localhost -dp 50051

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Federated Learning with Python, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.


Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there; you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below

https://packt.link/free-ebook/978-1-80324-710-6

Submit your proof of purchase

That’s it! We’ll send your free PDF and other benefits to your email directly.

Part 1 Federated Learning – Conceptual Foundations

In this part, you will learn about the challenges in big data AI and traditional centralized machine learning approaches and how federated learning (FL) can address their major problems. You will learn the basic concepts and workings of the FL system, together with some machine learning basics and distributed systems and computing principles.

This part comprises the following chapters:

Chapter 1, Challenges in Big Data and Traditional AI

Chapter 2, What Is Federated Learning?

Chapter 3, Workings of the Federated Learning System