Whether you are a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and how to implement state-of-the-art architectures for real-world tasks.
The second edition of Modern Computer Vision with PyTorch is fully updated to explain and provide practical examples of the latest multimodal models, CLIP, and Stable Diffusion.
You’ll discover best practices for working with images, tweaking hyperparameters, and moving models into production. As you progress, you'll implement various use cases for facial keypoint recognition, multi-object detection, segmentation, and human pose detection. This book provides a solid foundation in image generation as you explore different GAN architectures. You’ll leverage transformer-based architectures like ViT, TrOCR, BLIP2, and LayoutLM to perform various real-world tasks and build a diffusion model from scratch. Additionally, you’ll utilize foundation models' capabilities to perform zero-shot object detection and image segmentation. Finally, you’ll learn best practices for deploying a model to production.
By the end of this deep learning book, you'll confidently leverage modern NN architectures to solve real-world computer vision problems.
Page count: 836
Year of publication: 2024
Modern Computer Vision with PyTorch
Second Edition
A practical roadmap from deep learning fundamentals to advanced applications and Generative AI
V Kishore Ayyadevara
Yeshwanth Reddy
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Bhavesh Amin
Acquisition Editor – Peer Reviews: Tejas Mhasvekar
Project Editor: Parvathy Nair
Content Development Editor: Shruti Menon
Copy Editor: Safis Editing
Technical Editor: Aneri Patel
Proofreader: Safis Editing
Indexer: Manju Arasan
Presentation Designer: Pranit Padwal
Developer Relations Marketing Executive: Monika Sangwan
First published: November 2020
Second edition: June 2024
Production reference: 2311225
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80323-133-4
www.packt.com
Kishore Ayyadevara is an entrepreneur and a hands-on leader working at the intersection of technology, data, and AI to identify and solve business problems. With over a decade of experience in leadership roles, Kishore has established and grown successful applied data science teams at American Express and Amazon, as well as a top health insurance company. In his current role, he is building a start-up focused on making AI more accessible to healthcare organizations. Outside of work, Kishore has shared his knowledge through his five books on ML/AI, is an inventor with 12 patents, and has been a speaker at multiple AI conferences.
I would like to dedicate this book to my dear parents, Hema and Subrahmanyeswara Rao, my lovely wife, Sindhura, my dearest daughter, Hemanvi, and my beloved son, Tejas. This would not have been possible without their patience, support, and encouragement.
Special thanks to the reviewers for their helpful feedback. This book would not have been in this shape without the great support and feedback I received from Raghav Bali, Prassanna Venkatesh, Sreevaatsav Bavana, and the Packt team - Shruti, Parvathy, Aneri, and Pranit.
Yeshwanth Reddy is a senior data scientist with industry experience spanning over 8 years, specializing in healthcare, education, and document extraction domains. He has given several talks and has mentored thousands of students on a broad spectrum of topics, ranging from statistics to deep learning. His innovative solutions include the development of products and libraries for document extraction and the creation of synthetic data to enhance real-world datasets. Yeshwanth has also contributed to multiple open-source libraries and holds various patents.
I would like to thank my dear parents, Lalitha and Ravi, my beloved wife, Madhuri, and my brother, Sumanth. Your unwavering support and encouragement have been the driving force behind this book. I extend my heartfelt gratitude to the reviewers for their invaluable feedback throughout the authorship journey.
Raghav Bali is a Staff Data Scientist at Delivery Hero, one of the world’s leading food delivery services, based in Berlin, Germany. Raghav has published multiple peer-reviewed papers, has authored more than 7 books, and is a co-inventor of 10+ patents in the areas of ML, deep learning, healthcare, and natural language processing. He has 12+ years of experience working across organizations such as Intel, American Express, Infosys, UnitedHealth Group, and Delivery Hero, developing enterprise-level solutions for real-world use cases.
I would like to take this opportunity to thank the whole team at Packt for their support in making the review process as smooth as possible. I would also like to thank my family for all the support and countless coffee cups. Finally, I would like to wish Kishore and Yeshwanth all the very best for this amazingly enhanced second edition of their already successful book.
Sheallika Singh is a deep learning expert and advisor to multiple ML startups. Currently, she is a Staff Machine Learning Engineer, responsible for developing personalization models used by billions of users worldwide. Sheallika has also played a pivotal role in advancing self-driving car technology. She has published research in and serves as a program committee member for top-tier ML conferences. Before entering the industry, Sheallika conducted research on font-free character recognition. She holds a Master’s degree in Data Science from Columbia University and a Bachelor of Science degree in Mathematics and Scientific Computing, with a minor in Industrial Management, from the Indian Institute of Technology Kanpur.
Join our community’s Discord space for discussions with the authors and other readers:
https://packt.link/modcv
Once you’ve read Modern Computer Vision with PyTorch, Second Edition, we’d love to hear your thoughts! Please share your feedback on the Amazon review page for this book.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
This book comes with free benefits to support your learning. Activate them now for instant access (see the “How to Unlock” section for instructions).
Here’s a quick overview of what you can instantly unlock with your purchase:
PDF and ePub Copies
Access a DRM-free PDF copy of this book to read anywhere, on any device.
Use a DRM-free ePub version with your favorite e-reader.
Next-Gen Web-Based Reader
Multi-device progress sync: Pick up where you left off, on any device.
Highlighting and notetaking: Capture ideas and turn reading into lasting knowledge.
Bookmarking: Save and revisit key sections whenever you need them.
Dark mode: Reduce eye strain by switching to dark or sepia themes.
Scan the QR code (or go to packtpub.com/unlock). Search for this book by name, confirm the edition, and then follow the steps on the page.
Note: Keep your invoice handy. Purchases made directly from Packt don’t require one.
In this section, we will learn what the basic building blocks of a neural network are and what role each block plays in training a network successfully. We will first briefly look at the theory of neural networks before moving on to building and training them with the PyTorch library.
This section comprises the following chapters:
Chapter 1, Artificial Neural Network Fundamentals
Chapter 2, PyTorch Fundamentals
Chapter 3, Building a Deep Neural Network with PyTorch