Although creating a machine learning pipeline, or developing a working prototype of a software system from that pipeline, is straightforward nowadays, the journey toward a professional software system is still a long one. This book presents best practices and recipes that help software engineers transform prototype pipelines into complete software products.
The book begins by introducing the main concepts of professional software systems that leverage machine learning at their core. As you progress, you’ll explore the differences between traditional, non-ML software and machine learning software. The initial best practices will guide you in determining the type of software you need for your product. Subsequently, you will delve into algorithms, covering their selection, development, and testing. You will then explore the intricacies of the infrastructure for machine learning systems by defining best practices for identifying the right data source and ensuring its quality.
Towards the end, you’ll address the most challenging aspect of large-scale machine learning systems – ethics. By exploring and defining best practices for assessing ethical risks and strategies for mitigation, you will conclude the book where it all began – large-scale machine learning software.
Page count: 484
Publication year: 2024
Machine Learning Infrastructure and Best Practices for Software Engineers
Take your machine learning software from a prototype to a fully fledged software system
Miroslaw Staron
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Yasir Ali Khan
Book Project Manager: Hemangi Lotlikar
Senior Editor: Sushma Reddy
Technical Editor: Kavyashree K S
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Hemangini Bari
Production Designer: Gokul Raj S.T
DevRel Marketing Coordinator: Vinishka Kalra
First published: January 2024
Production reference: 1170124
Published by
Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK
ISBN 978-1-83763-406-4
www.packtpub.com
Writing a book with a lot of practical examples requires a lot of extra time, which is often taken from family and friends. I dedicate this book to my family – Alexander, Cornelia, Viktoria, and Sylwia – who always supported and encouraged me, and to my parents and parents-in-law, who shaped me to be who I am.
– Miroslaw Staron
Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner’s Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.
I would like to thank my family for their support in writing this book. I would also like to thank my colleagues from the Software Center program who provided me with the ability to develop my ideas and knowledge in this area – in particular, Wilhelm Meding, Jan Bosch, Ola Söder, Gert Frost, Martin Kitchen, Niels Jørgen Strøm, and several other colleagues. One person who really ignited my interest in this area is of course Mirosław “Mirek” Ochodek, to whom I am extremely grateful. I would also like to thank the funders of my research, who supported my studies throughout the years. I would like to thank my Ph.D. students, who challenged me and encouraged me to always dig deeper into the topics. I’m also very grateful to the reviewers of this book – Hongyi Zhang and Sushant K. Pandey, who provided invaluable comments and feedback for the book. Finally, I would like to extend my gratitude to my publishing team – Hemangi Lotlikar, Sushma Reddy, and Anant Jaint – this book would not have materialized without you!
Hongyi Zhang is a researcher at Chalmers University of Technology with over five years of experience in the fields of machine learning and software engineering. Specializing in machine learning, edge/cloud computing, and software engineering, his research merges machine learning theory and software applications, driving tangible improvements in industrial machine learning ecosystems.
Sushant Kumar Pandey is a dedicated post-doctoral researcher at the Department of CSE, Chalmers at the University of Gothenburg, Sweden, who seamlessly integrates academia with industry, collaborating with Volvo Cars in Gothenburg. Armed with a Ph.D. in CSE from the esteemed Indian Institute of Technology (BHU), India, Sushant specializes in the application of AI in software engineering. His research advances technology’s transformative potential. As a respected reviewer for prestigious venues such as IST, KBS, EASE, and ESWA, Sushant actively contributes to shaping the discourse in his field. Beyond research, he leverages his expertise to mentor students, fostering innovation and excellence in the next generation of professionals.
Traditionally, Machine Learning (ML) was considered to be a niche domain in software engineering. No large software systems used statistical learning in production. This changed in the 2010s when recommendation systems started to utilize large quantities of data – for example, to recommend movies, books, or music. The rise of transformer technologies pushed this change further. Commonly known products such as ChatGPT popularized these techniques and showed that they are no longer niche, but have entered mainstream software products and services. Software engineering needs to keep up, and we need to know how to create software based on these modern machine learning models. In this first part of the book, we look at how machine learning changes software development and how we need to adapt to these changes.
This part has the following chapters:
Chapter 1, Machine Learning Compared to Traditional Software
Chapter 2, Elements of a Machine Learning System
Chapter 3, Data in Software Systems – Text, Images, Code, and Features
Chapter 4, Data Acquisition, Data Quality, and Noise
Chapter 5, Quantifying and Improving Data Properties