Description

Gain unparalleled insights into the future of AI autonomy with this comprehensive guide to designing and deploying autonomous AI agents that leverage generative AI (GenAI) to plan, reason, and act. Written by industry-leading AI architects and recognized experts shaping global AI standards and building real-world enterprise AI solutions, it explores the fundamentals of agentic systems, detailing how AI agents operate independently, make decisions, and leverage tools to accomplish complex tasks.
Starting with the foundations of GenAI and agentic architectures, you’ll explore decision-making frameworks, self-improvement mechanisms, and adaptability. The book covers advanced design techniques, such as multi-step planning, tool integration, and the coordinator, worker, and delegator approach for scalable AI agents.
Beyond design, it addresses critical aspects of trust, safety, and ethics, ensuring AI systems align with human values and operate transparently. Real-world applications illustrate how agentic AI transforms industries such as automation, finance, and healthcare. With deep insights into AI frameworks, prompt engineering, and multi-agent collaboration, this book equips you to build next-generation adaptive, scalable AI agents that go beyond simple task execution and act with minimal human intervention.




Building Agentic AI Systems

Create intelligent, autonomous AI agents that can reason, plan, and adapt

Anjanava Biswas

Wrick Talukdar

Building Agentic AI Systems

Copyright © 2025 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

The authors acknowledge the use of cutting-edge AI, such as ChatGPT, with the sole aim of enhancing the language and clarity within the book, thereby ensuring a smooth reading experience for readers. It’s important to note that the content itself has been crafted by the authors and edited by a professional publishing team.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Portfolio Director: Gebin George

Relationship Lead: Ali Abidi

Project Manager: Prajakta Naik

Content Engineer: Aditi Chatterjee

Technical Editor: Irfa Ansari

Copy Editor: Safis Editing

Proofreader: Aditi Chatterjee

Indexer: Pratik Shirodkar

Production Designers: Pranit Padwal and Vijay Kamble

Growth Lead: Kunal Sawant

First published: April 2025

Production reference: 2290525

Published by Packt Publishing Ltd.

Grosvenor House

11 St Paul’s Square

Birmingham

B3 1RB, UK.

ISBN 978-1-80323-875-3

www.packtpub.com

In loving memory of my father, whose guiding influence shapes every challenge I embrace. To my mother, my wife, and my son—this book exists because of your love and constant support. To my mentors and colleagues—thank you for lighting the way through this AI frontier. It’s been a privilege to learn from you and to share that journey.

– Anjanava Biswas

I dedicate this book to my parents, my son, my wife, and all my well-wishers—whose unwavering support and constant encouragement have been the foundation of my journey. Their belief in me has turned challenges into opportunities and dreams into accomplishments, making every milestone along the way all the more meaningful.

– Wrick Talukdar

Foreword

Artificial intelligence has always been driven by a profound aspiration: creating systems that not only perform predefined tasks but can also reason, adapt, and act with autonomy. With the rise of Generative AI and agentic systems, we are witnessing a transformative moment, one where artificial agents can reflect upon their experiences, strategize effectively, and collaborate seamlessly. This book arrives precisely at this pivotal juncture, offering a comprehensive exploration of this emerging and impactful field.

As someone with decades of experience in AI, spanning core research and product development as a technical leader at Microsoft Research and VC-backed AI startups, I’ve been deeply involved in high-impact AI research and the deployment of large-scale AI products that serve millions of users across both consumer and enterprise domains. My involvement in the IEEE in various AI leadership roles has also provided me with a unique perspective on the critical importance of ethical and responsible AI advancement.

I first met the authors, Wrick Talukdar and Anjanava Biswas, through our collaborative work on global IEEE AI initiatives. Their extensive expertise, demonstrated through their high-impact technical roles at AWS, their work on AI and ML, and their scholarly AI research, is very impressive. Both Anjanava and Wrick are recognized leaders in Generative AI, bringing world-class expertise, dedication, and deep insights from architecting, developing, and deploying cutting-edge AI platforms on a global scale. Their rich understanding of the technical complexities and ethical considerations uniquely qualifies them to author this important book.

Organized into three parts, the book initially lays a solid theoretical foundation by clearly defining and demystifying Generative AI and the essential characteristics of agentic systems. Readers are guided through core concepts, from Variational Autoencoders and Generative Adversarial Networks to sophisticated agent architectures, enabling both newcomers and seasoned professionals to navigate this complex landscape confidently.

In its second part, the book shifts focus to practical techniques necessary for developing intelligent agents capable of self-awareness, adaptability, and collaborative decision-making. The chapters on reflection, introspection, planning, and multi-agent collaboration provide invaluable methodologies and insights, empowering practitioners to design systems that dynamically learn and continuously improve.

The final section thoughtfully addresses the broader implications of deploying generative agentic systems in real-world contexts. By emphasizing transparency, explainability, safety, and ethical governance, this book underscores the essential responsibilities associated with advanced AI technologies. It provides a critical framework for fostering trust and ensuring these powerful tools align with societal values and ethical standards.

More than a technical reference, this book serves as an essential guide for shaping the future of Generative AI and intelligent agents. It invites researchers, industry leaders, and innovators to actively participate in crafting the next chapter of AI, one that responsibly advances technological boundaries while enriching society.

I wholeheartedly endorse this timely and insightful work. I encourage you to delve deep into the chapters that follow, embracing your role as an active contributor to the exciting and evolving frontier in artificial intelligence.

Matthew R. Scott,

Chief Technology Officer, Minset.ai

Spring 2025

Foreword 2

The world was caught by surprise when ChatGPT first launched in November 2022. It was an aha moment—suddenly, it felt like you were conversing with a real person. The responses weren’t just fluent; they were informative and useful. Since then, the technology has evolved at an astonishing pace, with hundreds of millions of people using ChatGPT and similar systems as writing assistants or to find answers that aren’t easily discoverable through traditional web searches.

However, when you ask a human assistant to book travel, they don’t just talk about it—they actually book the flights and hotel. This is the essence of agentic systems. Large language models (LLMs), like those used in ChatGPT, can be repurposed not just to generate text but to act—executing tasks by generating code, calling APIs, and interacting with web services. Just as the World Wide Web transitioned from serving static content to enabling interactive applications, agentic systems are poised to revolutionize how we interact with AI.

Unlike other books that focus primarily on the theory behind agentic systems, this book emphasizes real-world implementation. The authors provide hands-on code examples, practical use cases, and step-by-step guidance on integrating agentic workflows into organizations to solve real business challenges. This book serves as an accessible yet comprehensive resource for software developers, ML engineers, technical leads, directors, and other decision-makers looking to get up to speed on agentic systems.

Those who have used AI assistants like ChatGPT understand their value as writing and coding assistants, where users validate the output before taking action. However, AI can sometimes generate incorrect answers with such confidence that users may mistakenly trust them. Unlike humans, who often hedge their statements with phrases like “I’m not sure, but I think…”, AI systems lack built-in uncertainty indicators. We want to trust that our AI agents will take the correct actions and ask us when in doubt. This book delves into the crucial issues of trust, transparency, explainability, and reliability, as well as ethical considerations such as bias and safety. It describes how agentic systems can be personalized and made context aware, and how they can plan and react to the results of their actions.

As somebody who has been working on artificial intelligence for decades, I believe this book will be a great resource for students, researchers, and professionals alike, charting a clear path forward in an era profoundly influenced by intelligent systems. I first met Wrick Talukdar at an IEEE conference panel, and I was immediately impressed by his ability to bridge the gap between theory and practice. In Building Agentic AI Systems, he and Anjanava Biswas do just that—offering a practical guide to the future of AI-driven automation.

Dr. Alex Acero,

Member of the National Academy of Engineering,

IEEE Fellow

Contributors

About the authors

Anjanava Biswas is an award-winning senior AI specialist solutions architect with over 17 years of industry experience. Specializing in machine learning, generative AI, natural language processing, deep learning, data analytics, and cloud architecture, he partners with large enterprises to build and scale advanced AI systems in the cloud. Anjanava is widely recognized for his contributions to the field of applied AI. He has published research in multiple scientific journals and actively contributes to open-source AI/ML projects. His professional accolades include Fellowships with BCS (UK), the IET (UK), and IETE (India), and he is a senior IEEE member. A frequent public speaker, Anjanava has held key positions at industry giants like IBM and Oracle Corp. Originally from India, he now resides in San Diego, CA, with his wife and son, where he continues to innovate and inspire within the tech community.

I would like to express my heartfelt gratitude to everyone who has accompanied me on this generative AI journey. To my colleagues, both past and present, who challenged my thinking and shared in the excitement of this rapidly evolving field—your collaboration has been invaluable. To my mentors, who saw potential in my ideas and guided me with wisdom and patience—your belief in me made this book possible.

To my remarkable son, whose eleven-year-old imagination reminds me daily that the future belongs to those who can dream it. Your questions force me to distill complexity into clarity—perhaps the most valuable skill in understanding generative AI. And to my wife, my steadfast companion through late nights and breakthrough moments—your patience has been the silent foundation upon which these ideas have been built.

To my mother, whose wisdom transcends technology and whose encouragement knows no bounds. And to my father, whose absence is felt deeply but whose influence remains in every challenge I choose to accept. I know you would have turned these pages with pride.

As Ada Lovelace once observed, “The Analytical Engine weaves algebraic patterns, just as the Jacquard loom weaves flowers and leaves.” Today’s generative AI continues this beautiful interplay between human creativity and computational possibility—a dance I have been privileged to document.

For everyone who believes that technology should elevate our humanity rather than diminish it—this book is for you.

– Anjanava Biswas

Wrick Talukdar is a visionary technology leader in generative artificial intelligence (AI) at Amazon, with over two decades of global experience in AI, cloud computing, and product leadership. A pioneer in AI-driven transformation, he has led large-scale modernization initiatives that drive enterprise growth and impact millions worldwide. He has spearheaded the productization of award-winning AI/ML technologies, now deployed at scale for Fortune 500 companies, shaping real-world AI applications.

A recognized figure in AI research, Wrick’s work in generative AI, multimodality, natural language processing, and computer vision is widely cited and referenced in the field. As a senior IEEE member, Chair, and panelist in multiple industry forums, he advises global committees like CTSoc Industry Forums and NIC, setting industry standards and shaping AI’s role for the future. He frequently presents his innovations at premier conferences such as World Technology Summit, IEEE HKN, ICCE, CERAWeek, and ADIPEC, bridging cutting-edge research with real-world AI applications to accelerate industry-wide innovation.

Deeply rooted in his computer science background, he co-chairs IEEE NIC to empower young professionals. As an author and thought leader, he continues to push AI’s boundaries, inspiring future innovators. Wrick lives in California with his family.

In the words of Marian Wright Edelman, "Education is for improving the lives of others and for leaving your community and world better than you found it."

I am profoundly grateful to everyone who has supported, guided, and inspired me throughout this journey.

To my family—my parents, my son, and my wife—your unwavering belief in me has been the bedrock of my strength.

To my son, whose endless curiosity about “what Dad is working on” reminded me every day why discovery matters—your wonder and questions sparked ideas when I needed them most, and your imagination reminded me of the limitless possibilities this work can unlock. You’ve taught me that inspiration often comes in the form of a simple question from someone who believes you can explain the universe.

To my parents, whose unwavering support, confidence, and hunger to understand a rapidly evolving world reminds me that knowledge is meant to be shared, not stored. Your belief in lifelong learning has been my moral compass, and your pride in my work has been one of my greatest motivators.

And to my wife, whose quiet strength, patience, and constant encouragement made this journey not only possible, but meaningful. Your presence through long nights, your unwavering faith through every obstacle, and your calming reassurance gave me the clarity to finish what I started. You’ve been my anchor and my inspiration.

To the AI research community—your relentless pursuit of innovation continues to inspire and humble me. After years of writing research papers, technical articles, and blogs, authoring my first book has been a transformative experience—one that challenged me to translate complexity into clarity and ideas into impact.

And to my friends and well-wishers—your presence through every challenge turned obstacles into milestones. Whether through a kind word, a shared insight, or simply showing up when it mattered most, you reminded me that no journey is ever truly walked alone.

This book is a tribute to each of you. You’ve walked with me, believed in my vision, and helped shape every word on these pages.

– Wrick Talukdar

About the reviewers

Manjit Chakraborty is a seasoned technology leader with extensive experience in driving digital transformation in the financial services sector. As an enterprise solutions architect at Amazon Web Services (AWS), he spearheads initiatives to modernize legacy systems and design innovative cloud-native solutions for the largest financial institutions across the globe. With a proven track record in business and technical architecture, Manjit excels in delivering actionable insights through data-driven analysis. His expertise spans diverse areas, including cloud migration, mainframe modernization strategies, system integrations, hybrid deployments, data analytics, and business intelligence. Manjit is a sought-after public speaker, having delivered presentations at numerous prestigious technical forums. He is an accomplished author and has also contributed to various technology publications, sharing his knowledge and insights with the broader tech community. Manjit is a distinguished reviewer, critically evaluating industry submissions and scholarly research papers on a regular basis, significantly influencing technological advancements and academic discourse.  He is based out of the Sunshine State, Florida, in the United States.

Martin Yanev is a highly accomplished software engineer with nearly a decade of experience across diverse industries, including aerospace and medical technology. Over his illustrious career, Martin has carved a niche for himself in developing and integrating cutting-edge software solutions for critical domains such as air traffic control and chromatography systems. Renowned as an esteemed instructor and computer science professor at Fitchburg State University, he possesses a deep understanding of the full spectrum of OpenAI APIs and exhibits mastery in constructing, training, and fine-tuning AI systems. As a widely recognized author, Martin has shared his expertise to help others navigate the complexities of AI development. With his exceptional track record and multifaceted skill set, Martin continues to propel innovation and drive transformative advancements in the field of software engineering.

Join our communities on Discord and Reddit

Have questions about the book or want to contribute to discussions on Generative AI and LLMs? Join our Discord server at https://packt.link/4Bbd9 and our Reddit channel at https://packt.link/wcYOQ to connect, share, and collaborate with like-minded enthusiasts.

Table of Contents

Preface

Part 1: Foundations of Generative AI and Agentic Systems

1

Fundamentals of Generative AI

Introduction to generative AI

Types of generative AI models

VAEs

GANs

Autoregressive models and Transformer architecture

LLM-powered AI agents

Applications of generative AI

Challenges and limitations of generative AI

Quality of data and bias

Data privacy

Computational resources

Ethical and social implications

Generalization and creativity

Summary

Questions

Answers

Further reading

References

2

Principles of Agentic Systems

Technical requirements

Understanding self-governance, agency, and autonomy

Self-governance

Agency

Autonomy

Example of agency and autonomy in agents

Reviewing intelligent agents and their characteristics

Exploring the architecture of agentic systems

Deliberative architectures

Reactive architectures

Hybrid architectures

Understanding multi-agent systems

Definition and characteristics of MASs

Interaction mechanisms in MASs

Summary

Questions

Answers

3

Essential Components of Intelligent Agents

Technical requirements

Knowledge representation in intelligent agents

Semantic networks

Frames

Logic-based representations

Reasoning in intelligent agents

Deductive reasoning

Inductive reasoning

Abductive reasoning

Learning mechanisms for adaptive agents

Decision-making and planning in agentic systems

Utility function

Planning algorithms

Enhancing agent capabilities with generative AI

Start building agentic AI

Summary

Questions

Answers

Part 2: Designing and Implementing Generative AI-Based Agents

4

Reflection and Introspection in Agents

Technical requirements

The importance of reflection in agents

Enhanced decision-making

Adaptation

Ethical consideration

Human-computer interaction

Introspection in intelligent agents

Implementing reflective capabilities

Traditional reasoning

Meta-reasoning

Self-explanation

Self-modeling

Use cases and examples

Customer service chatbots

Personal marketing agents

Financial trading systems

Forecast agents

Price strategies in e-commerce

Summary

Questions

Answers

5

Enabling Tool Use and Planning in Agents

Technical requirements

Understanding the concept of tool use in agents

Tool and function calling

Defining tools for agents

Types of tools

The significance of tools in agentic systems

Planning algorithms for agents

Less practical planning algorithms

Moderately practical planning algorithm – FF

Most practical planning algorithms

Integrating tool use and planning

Reasoning about tools

Planning for tool use

Exploring practical implementations

CrewAI example

AutoGen example

LangGraph example

Summary

Questions

Answers

6

Exploring the Coordinator, Worker, and Delegator Approach

Technical requirements

Understanding the CWD model

Key principles of the CWD model

The CWD model for the intelligent travel agent

Designing agents with role assignments

Roles and responsibilities of each agent

Communication and collaboration between agents

Communication

Coordination mechanism

Negotiation and conflict resolution

Knowledge sharing

Implementing the CWD approach in generative AI systems

System prompts and agent behavior

Instruction formatting

Interaction patterns

Summary

Questions

Answers

7

Effective Agentic System Design Techniques

Technical requirements

Focused system prompts and instructions for agents

Defining objectives

Task specifications

Contextual awareness

State spaces and environment modeling

State space representation

Environment modeling

Integration and interaction patterns

Monitoring and adaptation

Agent memory architecture and context management

Short-term memory (working memory)

Long-term memory (knowledge base)

Episodic memory (interaction history)

Context management

Integration with decision-making

Sequential and parallel processing in agentic workflows

Sequential processing

Parallel processing

Workflow optimization

Summary

Questions

Answers

Part 3: Trust, Safety, Ethics, and Applications

8

Building Trust in Generative AI Systems

Technical requirements

Importance of trust in AI

Techniques for establishing trust

Transparency and explainability

Dealing with uncertainty and biases

Effective output communication

User control and consent

Ethical development and responsibility

Implementing transparency and explainability

Handling uncertainty and biases

Summary

Questions

Answers

9

Managing Safety and Ethical Considerations

Understanding potential risks and challenges

Adversarial attacks

Bias and discrimination

Misinformation and hallucinations

Data privacy violations

Intellectual property risks

Ensuring safe and responsible AI

Exploring ethical guidelines and frameworks

Human-centric design

Accountability and responsibility

Privacy and data protection

Involvement of diverse stakeholders

Addressing privacy and security concerns

Summary

Questions

Answers

10

Common Use Cases and Applications

Creative and artistic applications

Evolution of creative and artistic agents

Real-world applications

Natural language processing and conversational agents

Evolution of language agents

Real-world applications

Robotics and autonomous systems

Evolution of robotic agents

Real-world applications

Decision support and optimization

Evolution of decision support agents

Real-world applications

Summary

Questions

Answers

11

Conclusion and Future Outlook

Recap of key concepts

Emerging trends and research directions

Multi-modal intelligence – integrating diverse inputs

Advanced language comprehension

Experiential learning – reinforcement learning innovations

Practical implications across industries

Artificial general intelligence

What makes AGI different

The big challenge

Learning to learn

Real-world understanding

Challenges and opportunities

Summary

Index

Other Books You May Enjoy

Preface

Building Agentic AI Systems is designed to provide both a theoretical foundation and practical guidance on generative AI and agent-based intelligence. Generative AI and agentic systems are at the forefront of the next wave of AI, driving automation, creativity, and decision-making in ways that were previously unimaginable. By enabling machines to generate text, images, and even strategic plans while reasoning and adapting autonomously, these technologies are transforming industries such as healthcare, finance, and robotics.

The book begins by introducing generative AI, covering key models such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and autoregressive models. We explore their applications in content creation, design, and scientific research while addressing the limitations and challenges of these models.

Next, we dive into the world of agentic systems, defining concepts such as agency, autonomy, and multi-agent collaboration. We analyze different agent architectures—deliberative, reactive, and hybrid—and explore how multiple agents can interact, cooperate, and coordinate toward common goals.

Once the foundations are established, we move into practical implementation. We explore how agents can reflect on their own reasoning processes, plan, and use external tools effectively. This includes hands-on techniques for meta-reasoning, self-explanation, strategic planning, and multi-agent coordination. The book also introduces best practices for designing intelligent, trustworthy AI agents, balancing autonomy with control, and ensuring ethical and responsible AI development.

To conclude, we examine real-world use cases and applications across multiple domains, from NLP and robotics to decision support and optimization. We also explore trust, transparency, bias mitigation, and AI safety—key elements for ensuring the reliability of AI-driven systems.

Throughout this book, you will find code examples, practical exercises, and implementation strategies to help bridge the gap between theory and real-world application. Whether you are an AI practitioner, researcher, engineer, or technology leader, this book will equip you with the skills and knowledge to build autonomous, adaptive, and intelligent AI agents that can reason, collaborate, and evolve.

Let’s embark on this journey together, shaping the future of intelligent systems—one agent at a time.

Who this book is for

This book is intended for AI practitioners, developers, researchers, engineers, and technology leaders who want to understand and build AI-driven agents that exhibit autonomy, adaptability, and intelligence. Whether you are a developer looking to integrate generative models into intelligent systems or an AI architect exploring advanced agentic capabilities, this book will equip you with both theoretical foundations and hands-on implementation strategies.

What this book covers

Chapter 1, Fundamentals of Generative AI, introduces generative AI, explaining its core concepts, various model types—including VAEs, GANs, and autoregressive models—real-world applications, and challenges such as bias, limitations, and ethical concerns.

Chapter 2, Principles of Agentic Systems, defines agentic systems, covering agency, autonomy, and the essential characteristics of intelligent agents, including reactivity, proactiveness, and social ability. It also explores different agent architectures and multi-agent collaboration.

Chapter 3, Essential Components of Intelligent Agents, details key elements of intelligent agents, including knowledge representation, reasoning, learning mechanisms, decision-making, and the role of Generative AI in enhancing agent capabilities.

Chapter 4, Reflection and Introspection in Agents, explores how intelligent agents analyze their reasoning, learn from experience, and improve decision-making using techniques such as meta-reasoning, self-explanation, and self-modeling.

Chapter 5, Enabling Tool Use and Planning in Agents, discusses how agents leverage external tools, implement planning algorithms, and integrate tool use with strategic decision-making to improve efficiency and goal achievement.

Chapter 6, Exploring the Coordinator, Worker, and Delegator Approach, introduces the CWD model for multi-agent collaboration, explaining how agents take on specialized roles—coordinator, worker, or delegator—to optimize task execution and resource allocation.

Chapter 7, Effective Agentic System Design Techniques, covers best practices for designing intelligent agents, including focused instructions, setting guardrails and constraints, balancing autonomy and control, and ensuring transparency and accountability.

Chapter 8, Building Trust in Generative AI Systems, examines techniques for fostering trust in AI, including transparency, explainability, handling uncertainty and bias, and designing AI systems that are reliable and interpretable.

Chapter 9, Managing Safety and Ethical Considerations, addresses the risks and challenges of generative AI, strategies for ensuring responsible AI development, ethical guidelines, and privacy and security considerations for AI deployments.

Chapter 10, Common Use Cases and Applications, showcases real-world applications of Generative AI, covering areas such as creative content generation, conversational AI, robotics, and decision-support systems.

Chapter 11, Conclusion and Future Outlook, summarizes key concepts covered in the book, explores emerging trends in generative AI and agentic intelligence, discusses artificial general intelligence (AGI), and highlights future challenges and opportunities in the field.

To get the most out of this book

Following along will be a bit easier if you have the following:

Familiarity with AI and machine learning concepts: While the book covers foundational principles, prior knowledge of AI/ML, deep learning, and Python programming will help you understand the more advanced topics.

Hands-on practice: Experiment with the provided code examples and frameworks for building Generative AI and agentic systems. Setting up a local or cloud-based development environment will enhance your learning experience.

Think critically about AI ethics and safety: As you explore Generative AI and autonomous agents, consider the implications of trust, bias, and responsible AI design to build intelligent systems that align with ethical guidelines.

Software/hardware covered in the book: Python, Jupyter Notebooks, and CrewAI

Operating system requirements: Windows, macOS, Linux

Download the example code files

The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Building-Agentic-AI-Systems. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing. Check them out!

Disclaimer on images

Some images in this title are presented for contextual purposes, and the readability of the graphic is not crucial to the discussion. Please refer to our free graphic bundle to download the images. You can download the images from https://packt.link/gbp/9781803238753

Conventions used

There are a number of text conventions used throughout this book.

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Customized onboarding plan: Based on the goals and needs identified, create a bespoke onboarding plan that outlines the steps, milestones, and timelines toward achieving the set objectives.”

Tips or important notes

Appear like this.

Get in touch

Newsletter: To keep up with the latest developments in the fields of Generative AI and LLMs, subscribe to our weekly newsletter, AI_Distilled, at https://packt.link/Q5UyU.

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Building Agentic AI Systems, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there, you can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

Scan the QR code or visit the link below

https://packt.link/free-ebook/9781803238753

Submit your proof of purchase

That’s it! We’ll send your free PDF and other benefits to your email directly.

Part 1: Foundations of Generative AI and Agentic Systems

This part establishes the fundamental concepts of generative AI and agentic systems, providing you with a solid theoretical foundation for understanding and developing intelligent, autonomous agents.

This part contains the following chapters:

Chapter 1, Fundamentals of Generative AI
Chapter 2, Principles of Agentic Systems
Chapter 3, Essential Components of Intelligent Agents

1

Fundamentals of Generative AI

Generative AI has quickly become a transformative technology in the field of artificial intelligence (AI) and machine learning, revolutionizing creative processes and problem-solving across diverse industries and use cases. It is pushing the boundaries of autonomy in agent-based intelligent systems. In this chapter, we will cover the basics of generative AI. We will explore what generative AI is along with a brief history of generative AI models. We will then discuss the different types of generative models, which include variational autoencoders (VAEs), generative adversarial networks (GANs), autoregressive, and Transformer models. Next, we will delve into the applications of generative AI and wrap up the chapter with a brief discussion of some of the limitations and challenges related to generative AI.

We will cover the following topics in this introductory chapter that will help set the foundations of generative AI as we explore further the capabilities of autonomous intelligent agents powered by generative AI:

Introduction to generative AI
Types of generative AI models
Applications of generative AI
Challenges and limitations of generative AI

By the end of this chapter, you will have gained a comprehensive understanding of generative AI, including its fundamental concepts, diverse applications, and current challenges. You will also learn about the technology’s potential and limitations, with a particular focus on its critical role in advancing intelligent agents and agentic systems.

Introduction to generative AI

Generative AI refers to a class of AI technologies that is capable of generating various forms of content, including but not limited to text, images, audio, and video. These AI systems can generate new content, based on their training data and input parameters, which usually include text prompts but can also involve other forms of input such as images. The recent buzz around generative AI comes from the simplicity with which new user interfaces powered by this AI technology can create high-quality text, graphics, and videos in seconds.

In very simple terms, generative AI is about making new data that looks like the data on which it was trained. In other words, by learning the underlying patterns, structures, and distributions of the input data, the model becomes able to generate new data of a similar kind. For instance, if trained on a dataset of human faces, a generative AI model will be able to create completely new faces of people that don’t actually exist in real life but are very realistic. In essence, generative AI models work by learning the probability distribution of a dataset and then sampling from that distribution to create new instances. This approach is different from discriminative models, whose goal is to learn how to tell classes of data apart. For example, a discriminative model could be trained to separate pictures of cats and dogs, but a generative model would be trained to make up completely new pictures of cats or dogs.
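
To make the distinction concrete, here is a small, self-contained Python sketch (not from the book’s code bundle) that treats a fitted Gaussian as a toy “generative model” we can sample new data from, and contrasts it with a simple discriminative threshold classifier. The numbers and variable names are purely illustrative.

```python
# Toy illustration: a "generative" model here is just a fitted Gaussian we can
# sample new data from, versus a "discriminative" classifier that only learns
# to separate two classes (a sketch, not a realistic generative model).
import numpy as np

rng = np.random.default_rng(0)
training_data = rng.normal(loc=5.0, scale=2.0, size=1000)  # stand-in training data

# Generative: learn the data distribution, then sample brand-new points from it
mu, sigma = training_data.mean(), training_data.std()
new_samples = rng.normal(mu, sigma, size=5)
print("new generated points:", np.round(new_samples, 2))

# Discriminative: only learn a boundary between two classes (cats vs. dogs),
# with no ability to generate new examples of either class
cats, dogs = rng.normal(0, 1, 100), rng.normal(3, 1, 100)
threshold = (cats.mean() + dogs.mean()) / 2
print("classify 2.1 as:", "dog" if 2.1 > threshold else "cat")
```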

The concept of generative AI goes a long way back; in fact, it dates to the time when machine learning was in its cradle. However, the field has only recently come under the limelight, thanks to neural networks and drastic improvements in computing power over the last 10 years. Early attempts at generative modeling, such as Gaussian mixture models and hidden Markov models, relied on simpler statistical techniques and struggled to capture arbitrarily complex data distributions, a limitation that deep learning eventually overcame. Deep learning was therefore a watershed development for the field of generative AI. Models such as VAEs, which started appearing around the beginning of the 2010s, were able to leverage deep neural networks to model more complex data distributions. Around the same period, GANs were proposed; for the first time, they offered a way for two neural networks to work together to create data, using principles from game theory. Together, these breakthroughs made the generation of high-quality, realistic data possible.

Generative AI can drive innovation across industries such as healthcare, finance, education, travel and hospitality, manufacturing, and more. In the creative industries, it helps artists and designers think outside the box, supporting the conception of new and innovative content. In healthcare, it opens new routes for drug discovery and personalized medicine. In the business world, generative AI enables efficient and effective customer engagement through personalized content and experiences. It also raises questions about creativity and originality: when machines can produce such convincing material, we are forced to think about what creativity is, who the artist really is, and what the ethical parameters should be for AI-created content.

Having understood what generative AI is and its brief history, let’s explore the different types of generative AI models.

Types of generative AI models

Generative AI is an exciting domain of AI that deals with the generation of new, synthetic data by learning patterns from existing datasets, aiming to generate outputs that share similar statistical properties and characteristics with the training data. Here is a broad overview of some of the most prominent types of generative models: VAEs, GANs, and autoregressive models.

VAEs

One of the most popular generative models is the VAE. The core idea behind a VAE is to learn a probabilistic mapping from data to a latent space, and back again. This means learning how to convert real data into a simplified representation (such as a compressed form) and then back into data that looks real. VAEs are designed to ensure a high likelihood of the data while preserving a well-structured latent space, enabling the generation of new data samples similar to the training data. Some of the most common flavors of VAE are as follows:

VAE: The basic model that compresses and reconstructs data. It gives a framework for generating new samples out of the learned latent space. The VAE is a generative model that learns to compress and reconstruct data while also learning a probability distribution of the latent space. In simpler terms, a VAE is like a clever algorithm that learns to understand and recreate data. Imagine it as a skilled artist who can compress a detailed painting into a simple sketch (encoding) and then recreate the full painting from that sketch (decoding). What makes VAEs special is that they don’t just copy data but also learn the essence of it, allowing them to create new, similar data. This is like an artist learning not just to copy specific paintings but also to understand the style so well that they can paint new, original works in that style. In real life, VAEs have been used in many creative ways. For instance, in drug discovery, VAEs have been used to generate novel molecular structures. Researchers at AstraZeneca used VAEs to explore chemical space and propose new drug-like molecules with desired properties, potentially accelerating the drug development process [1].

Beta-VAE: An extension of VAE that introduces a hyperparameter to control explicitly the trade-off between the reconstruction quality and the disentanglement of the latent space. The model undertakes the creation of more interpretable disentangled representations. In simpler terms, Beta-VAE is like a more flexible version of the original VAE. It allows researchers to adjust how much the model focuses on recreating exact details versus understanding the underlying features of the data. This is similar to teaching an art student to not just copy a painting but also to understand and separate the key elements such as color, shape, and style. This ability to disentangle features makes Beta-VAE particularly useful in fields such as computer vision and robotics. For example, researchers have used Beta-VAE to teach robots to understand objects better. By learning to separate features such as size, color, and position, robots can more easily recognize and manipulate objects in different situations, making them more adaptable and efficient in various tasks [2].

Conditional variational autoencoder (CVAE): A variant that conditions the generation process on some extra information, typically class labels, producing not just data similar to the training set but also samples that can then be drawn from specific classes. Think of CVAE as a more controlled version of VAE, like an artist who can paint in different styles on demand. Instead of just creating random new paintings, this artist can be told to “Paint a landscape” or “Paint a portrait” and they’ll create new art in that specific category. This added control makes CVAE very useful in practical applications. For instance, in the field of computer games, CVAE has been used for game development and procedural content generation with the generation of game elements such as character design, level layouts, music and sound effects, and so on. By providing different conditions such as “Create a forest level” or “Create a desert level,” the CVAE can produce a wide variety of game environments, saving time for designers and enhancing the player’s experience with more diverse and interesting game worlds [3].
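
To ground the encode, sample, and decode idea described above, the following is a minimal VAE sketch in PyTorch. It is an illustrative toy, not code from this book’s repository: the layer sizes, the random stand-in batch, and the training-free usage are assumptions made purely for demonstration.

```python
# A minimal VAE sketch in PyTorch: encode to a latent distribution, sample with
# the reparameterization trick, decode, and compute the reconstruction + KL loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)      # mean of q(z|x)
        self.to_logvar = nn.Linear(256, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients flowing
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

model = TinyVAE()
x = torch.rand(32, 784)            # stand-in batch (e.g., flattened images)
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
loss.backward()
# After training, new samples come from decoding random latent vectors:
new_samples = model.decoder(torch.randn(8, 16))
```

In practice, you would train this model on real data for many epochs before sampling new data from the decoder.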

GANs

A GAN consists of two neural networks: a generator and a discriminator. The generator produces synthetic data samples, while the discriminator learns to tell the difference between real and generated samples. The two networks are trained together adversarially: the generator tries to fool the discriminator, while the discriminator tries to correctly classify real versus fake data. Through this competition, the generator gets better and better at faking data. The following are some of the different types of GANs:

GAN: The basic model in which the generator and discriminator are trained adversarially; it is the root model for most innovations in generative modeling. As explained earlier, you can imagine a GAN as a game between two players—a counterfeiter (generator) trying to create fake money, and a detective (discriminator) trying to identify the fakes. As they compete, they both get better at their jobs, which means that the counterfeiter gets increasingly better at creating fake money. This clever setup allows GANs to create incredibly realistic fake data, such as images or sounds [4].

Deep convolutional GAN (DCGAN): This is a refinement of the base GAN model with deep convolutional neural networks; at the moment, it is one of the best architectures for generating images of high quality. Think of DCGAN as a more sophisticated artist compared to the basic GAN. It’s like upgrading from simple sketching tools to a full digital art studio, allowing for the creation of much more detailed and realistic images. DCGANs are particularly good at understanding and recreating complex patterns in images [5].

Wasserstein GAN (WGAN): This proposes a different loss function based on the Wasserstein distance, which measures the distance between two probability distributions more effectively. This removes common problems with GAN training, making training more stable and improving sample quality. WGAN is like giving the artist (generator) and critic (discriminator) in a GAN a better way to communicate and evaluate each other’s work. Instead of just saying “good” or “bad,” they can now give more nuanced feedback, such as “You’re getting warmer” or “You’re way off.” This leads to more consistent improvement and high-quality results. In practical applications, WGANs have been used in medical imaging to generate synthetic medical images for training purposes. This helps in creating larger, more diverse datasets for training diagnostic AI systems, potentially improving their accuracy in detecting diseases from scans and X-rays [6].

StyleGAN: This generates high-quality, realistic images and is especially good at handling style and content separately. Its successors, StyleGAN2 and StyleGAN3, further improve image fidelity and realism. Think of StyleGAN as an advanced digital artist that can not only create realistic images but also mix and match different styles and contents. It’s like having a painter who can take the style of Van Gogh and apply it to a modern cityscape. This flexibility makes StyleGAN incredibly versatile [7].
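
The adversarial game described above can be sketched in a few dozen lines of PyTorch. The example below is a hedged, toy illustration (not the book’s code): the generator learns to mimic a simple one-dimensional Gaussian, which keeps the training loop visible without any dataset or GPU requirements.

```python
# Minimal GAN sketch: a generator and a discriminator trained in alternation
# on toy 1-D data so the adversarial loop is easy to follow.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
discriminator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    # "Real" data: samples from N(4, 1); the generator must learn to mimic it
    real = torch.randn(64, 1) + 4.0
    fake = generator(torch.randn(64, 8))

    # 1) Train the discriminator to tell real from fake
    opt_d.zero_grad()
    loss_d = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # 2) Train the generator to fool the discriminator
    opt_g.zero_grad()
    loss_g = bce(discriminator(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()
```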

Autoregressive models and Transformer architecture

Autoregressive models generate data one element at a time, conditioning each new element on the ones generated before it. This proves very useful in tasks in which the sequence or structure of data is relevant, such as text generation and image generation. The Transformer architecture, introduced in the Attention Is All You Need paper [8], is a model architecture that has revolutionized many sequential data tasks, especially in natural language processing (NLP). Its key components are demonstrated in Figure 1.1:

Figure 1.1 – The Transformer architecture

These key components include the following:

Self-attention mechanism: A computational technique that allows the model to dynamically focus on different parts of the input when processing each element

Multi-head attention: Multiple attention mechanisms running in parallel, allowing the model to focus on different aspects of the input simultaneously

Positional encoding: Adds information about the position of each element in the sequence

Feed-forward neural networks: Process the output of the attention layers

Layer normalization and residual connections: Enhance training stability and information flow
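
As a minimal illustration of the self-attention mechanism listed above, the following PyTorch snippet computes scaled dot-product attention for a tiny sequence. The projection matrices are random stand-ins for learned weights; this is a sketch of the core computation, not the full multi-head, batched implementation used in real Transformers.

```python
# Scaled dot-product self-attention on a toy sequence of 5 tokens.
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (sequence_length, d_model); w_*: projection matrices (learned in practice)
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(0, 1) / (k.shape[-1] ** 0.5)  # similarity of each token to every other
    weights = F.softmax(scores, dim=-1)                    # attention weights sum to 1 per token
    return weights @ v                                     # weighted mix of value vectors

d_model = 16
x = torch.randn(5, d_model)                                # 5 token embeddings
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([5, 16])
```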

Transformers can be used in both autoregressive and non-autoregressive configurations, making them versatile for various tasks. The following are some examples:

PixelCNN: Autoregressively generates images pixel by pixel, conditioning each pixel on the previously generated ones. This model works very well in generating high-quality images with fine details [9].

PixelSNAIL: A refinement over PixelCNN that adds attention mechanisms into the model to capture strong dependencies between pixels and, hence, produce better image quality [10].

GPT: Standing for Generative Pre-trained Transformer, it is a line of models specializing in text generation—namely, predicting which words should come next. Among them are GPT-2, GPT-3, and GPT-4, making a huge leap in generating coherent and contextually relevant text [11] [12].

BERT: Unlike GPT, BERT, which stands for Bidirectional Encoder Representations from Transformers, is designed to understand the context from both directions in text. It uses the encoder part of the Transformer and is typically used for tasks that require an understanding of the entire input at once, rather than generating text autoregressively [13].

Text-To-Text Transfer Transformer (T5): This model frames all NLP tasks in a text-to-text format. It uses the full Transformer architecture (both encoder and decoder) and can handle various text generation tasks [14].

These models demonstrate how the Transformer architecture can be adapted for both autoregressive (such as GPT) and non-autoregressive (such as BERT) tasks, showcasing its versatility in handling sequential data. Strengths differ across generative models: depending on the task, this class of models can generate anything from photorealistic images to coherent text to entirely new data samples.
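
If you want to see autoregressive generation in action, a small pretrained model can be driven in a few lines with the Hugging Face transformers library. This assumes transformers and a PyTorch backend are installed and that the GPT-2 weights can be downloaded; it is a quick illustration rather than part of the book’s exercises.

```python
# Autoregressive text generation with a small GPT-2 model: the model predicts
# one token at a time, each conditioned on everything generated so far.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Generative AI models learn the distribution of their training data and",
    max_new_tokens=30,   # generate up to 30 additional tokens, one at a time
    do_sample=True,      # sample from the predicted distribution instead of greedy decoding
)
print(result[0]["generated_text"])
```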

Building upon the success of models such as GPT, researchers have scaled up these architectures to create large language models (LLMs). These models, often trained on vast amounts of open web text and other freely available text data, have demonstrated exceptional proficiency in understanding and generating human-like text across diverse applications, showcasing their versatility and advanced linguistic abilities. Examples of LLMs include GPT-3, GPT-4, PaLM, and BERT-large. These models have pushed the boundaries of what’s possible in NLP, showing proficiency in tasks ranging from question-answering and summarization to creative writing and code generation. The following is an overview of the common types of LLMs:

Autoregressive LLMs: These language models are typically capable of generating text sequentially, that is, one token at a time. Some of the common tasks that these types of models are used for include text generation, text completion tasks, and creative writing. Some popular examples of autoregressive LLMs are the GPT series (GPT-3, GPT-4) and PaLM.

Encoder-only LLMs: These models specialize in natural language understanding (NLU) tasks, which involve analyzing and comprehending input text without generating new text. Encoder-only LLMs, such as BERT and its variants including RoBERTa, focus on understanding the context and meaning of input text. These models process the entire input sequence simultaneously, allowing them to capture context, which helps with tasks such as text classification, named entity recognition (NER), and sentiment analysis.

Encoder-decoder LLMs: This is a combination of encoder and decoder architectures where the models can not only understand the context of the input text (NLU) but are also capable of generating textual output. The T5 model discussed earlier is a very popular example of an encoder-decoder LLM, along with models such as BART. These models are capable of language translation, summarization, and question-answering.

Multimodal LLMs: Multimodality is a concept where an AI model can handle not only text but also other modalities such as image, video, and audio. Multimodal LLMs can process and generate content in various modalities such as text, image, audio, video, and any combination thereof. These models are rather novel, and recent advancements in compute have made it possible to train them. Some of the well-known multimodal models are DALL-E, Stable Diffusion, Flamingo, GPT-4, and LLaVA.

Instruction-tuned LLMs: LLMs are often trained on a large corpus of openly available web or other text. As such, in certain scenarios, these models may not perform effectively for tasks that use data that the model has never seen before and thus may fail to follow the user intent. For example, a manufacturing manual for a proprietary instrument may contain domain-specific proprietary language, which the model may not have been exposed to during training. In such cases, LLMs can be fine-tuned using the domain-specific data to follow specific instructions and prompts (that is, text input to the model). This special fine-tuning enables the model to perform domain-specific tasks on which the model was fine-tuned, in addition to its other capabilities. Think of this as localizing a personal copy of a model to be good at performing a task that is meaningful for your use case. One of the common examples of such a model is InstructGPT, where researchers fine-tuned the GPT-3 model on specific instructions. These types of models are used in a multitude of different tasks such as chatbots, virtual assistants, and task-specific applications.

Domain-specific LLMs: As mentioned earlier, while LLMs are often trained on a large set of open, generally available, web data, they may not perform very well in domain-specific environments. While instruction-tuned LLMs are an avenue to make these general-purpose models work for domain-specific tasks, researchers have innovated on training domain-specific LLMs from the ground up. This means that these models were pre-trained on highly specialized domain knowledge. For example, the BioBERT model is trained on hyper-specialized medical data, whereas LegalBERT is trained on legal documents. These models are useful for specialized tasks in specific fields such as medicine, law, and finance.
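
As a quick, hedged illustration of how an instruction-tuned chat model is typically prompted, the snippet below uses the OpenAI Python SDK with system and user messages. The model name is a placeholder and an API key is assumed to be set in the environment; any comparable instruction-tuned model and provider would follow the same pattern.

```python
# Prompting an instruction-tuned chat model: a system message sets behavior,
# a user message carries the instruction, and the model returns a reply.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any instruction-tuned chat model works
    messages=[
        {"role": "system", "content": "You are a concise assistant for travel questions."},
        {"role": "user", "content": "Summarize what an agentic AI system is in two sentences."},
    ],
)
print(response.choices[0].message.content)
```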

LLM-powered AI agents

Each of the previously discussed LLM types has its own strengths and weaknesses and is well-suited for different applications. However, our focus in this book is to dive deeper into LLM agent capabilities and explore how advanced systems can be built utilizing them. It is important to note that LLM agents do not fit neatly into a single category, as they represent an advanced application of LLM technology rather than a specific type of model architecture. Agents are typically built using instruction-tuned LLMs as their foundation, often combined with additional components and capabilities, as you will learn further in this book.
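
Before walking through the conversational example below, here is a minimal, framework-free sketch of the “LLM plus tools” pattern such an agent relies on. The llm() function is a mocked stand-in for a real instruction-tuned model, and search_flights() is a hypothetical backend tool; the function names and the JSON protocol are assumptions made purely for illustration.

```python
# Minimal sketch of an LLM-powered agent loop: the model decides whether to call
# a backend tool, the application executes it, and the result is fed back.
import json

def search_flights(origin: str, destination: str, date: str) -> list:
    # Hypothetical backend tool; a real agent would call a flights API here.
    return [{"flight": "UA1234", "depart": "7:15 AM", "arrive": "8:45 AM"}]

def llm(conversation: list) -> str:
    # Mocked model output: a real instruction-tuned LLM would produce this JSON
    # after being prompted with the available tools and the chat history.
    return json.dumps({
        "action": "search_flights",
        "arguments": {"origin": "SAN", "destination": "SFO", "date": "2024-08-23"},
    })

TOOLS = {"search_flights": search_flights}

conversation = [{"role": "user", "content": "Book me a morning flight SAN to SFO next Friday."}]
decision = json.loads(llm(conversation))

if decision["action"] in TOOLS:
    result = TOOLS[decision["action"]](**decision["arguments"])
    conversation.append({"role": "tool", "content": json.dumps(result)})
    print("Tool result fed back to the model:", result)
```

The conversation that follows shows the user-facing side of this same pattern.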

Let us take a look at a simple example to demonstrate the capabilities of an agentic system powered by an LLM. Our example is a travel website that has a Flight Booking Assistant chatbot that is a conversational interface built with an LLM. It can help customers book travel tickets via human-like conversations. In the backend, the chatbot can perform various tasks as the conversation progresses. Here’s what a sample conversation may look like for a customer who is looking to book a flight ticket:

Flight Booking Assistant

Hi, I want to book a flight.

🤖

Hello! I’d be happy to help you book a flight. To get started, could you please tell me your departure city and destination?

I want to go from San Diego to SFO.

🤖

Great, thank you for providing that information. You’re looking to fly from San Diego (SAN) to San Francisco International Airport (SFO). To help you find the best flight options, I’ll need a few more details.

1. What is your preferred travel date?

2. Is this a one-way trip or a round trip?

3. How many passengers will be traveling?

I want to go next Friday and come back on Sunday. Just me traveling.

🤖

Thank you for the additional information. Let me summarize what I have so far:

- Departure: San Diego (SAN) to San Francisco (SFO)

- Outbound Date: Next Friday – 08/23/2024

- Return Date: Sunday – 08/25/2024

- Passengers: 1 adult

Before I search for flights, I have a few more questions:

1. Do you have a preferred time of day for your flights (morning, afternoon, or evening)?

2. Do you have a preferred airline or are you open to all airlines?

3. What class of service would you prefer (Economy, Premium Economy, Business, or First Class)?

I prefer morning flights. No airline preference. Economy is fine.

🤖

Thank you for those details. I’ll search for morning flights in Economy class across all airlines for your trip. Give me a moment to find the best options for you.

[Searching flights... please stand by...]

🤖

I’ve found a few options for your trip. Here are the best morning flights:

Outbound (SAN to SFO) - Next Friday:

1. United Airlines UA1234: Depart 7:15 AM, Arrive 8:45 AM