Build, test, and deploy AI-powered, enterprise-grade virtual assistants and chatbots
Xiaoquan Kong
Guan Wang
BIRMINGHAM—MUMBAI
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Devika Battike
Senior Editor: Mohammed Yusuf Imaratwale
Content Development Editor: Nazia Shaikh
Technical Editor: Devanshi Ayare
Copy Editor: Safis Editing
Project Coordinator: Aparna Ravikumar Nair
Proofreader: Safis Editing
Indexer: Sejal Dsilva
Production Designer: Joshua Misquitta
First published: October 2021
Production reference: 1260821
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80107-705-7
www.packt.com
To my parents, for their unwavering devotion. To my wife, for her support behind the scenes. In addition, thanks to Google for providing the Google Cloud credits to support this work.
– Xiaoquan Kong
To my mom and dad. To my wife and kids. Thank you!
– Guan Wang
Conversational AI combines ideas from linguistics, human-computer interaction, artificial intelligence, and machine learning to develop voice and chat assistants for a near-infinite set of use cases. Since 2016 there has been a surge in interest in this field, driven by the widespread adoption of mobile chat applications. The coronavirus pandemic accelerated this trend, with almost all one-on-one interactions becoming digital.
2016 was also the year Rasa was first released and we saw the first community contributions come in on GitHub. Open source communities live and die by their users and contributors, and this is doubly true for Rasa, where our global community builds assistants in hundreds of human languages. Xiaoquan Kong and Guan Wang have been leading members of our community for years and I am grateful for their many contributions. Not least Xiaoquan's efforts to ensure Rasa has robust support for building assistants in Mandarin. I've been eagerly awaiting the publication of this book.
Conversational AI with Rasa covers precisely the topics required to become proficient at building real-world applications with Rasa. Aside from covering the fundamentals of natural language understanding and dialogue management, the book emphasizes the real-world context of building great products. In the first chapter, you are challenged to consider whether a conversational experience is even the right one to build. The book also covers the essential process of Conversation-Driven Development, without which many assistants get built but fail to serve their intended users. Additionally, readers are taught practical skills such as debugging an assistant, writing tests, and deploying an assistant to production.
This book will be of great use to anyone starting out as a Rasa developer, and I'm sure many existing Rasa developers will discover things they didn't know.
Alan Nichol
Co-founder and CTO, Rasa
Xiaoquan Kong is a machine learning expert specializing in NLP applications. He has extensive experience leading teams to build NLP platforms for several Fortune Global 500 companies. He is a Google Developer Expert in machine learning and has been actively contributing to TensorFlow for many years. He has also contributed to the development of the Rasa framework since its early stages and became a Rasa Superhero in 2018. He manages the Rasa Chinese community and has participated in the Chinese localization of the TensorFlow documentation as a technical reviewer.
Guan Wang is currently working on AI applications and research for the insurance industry. Prior to that, he worked as a machine learning researcher at several industry AI labs. He was raised and educated in mainland China and lived in Hong Kong for 10 years before relocating to Singapore in 2020. Guan holds BSc degrees in physics and computer science from Peking University, and an MPhil degree in physics from HKUST. Guan is an active tech blogger and community contributor to open source projects including Rasa, and his own projects have received more than 10,000 stars on GitHub.
Harin Joshi's journey in chatbot development started with an internship at ImpactGuru, India's fourth-largest crowdfunding platform, where he developed two chatbots and a machine learning module and was awarded Intern of the Month. He then joined the Co-learning Lounge AI community and developed a chatbot as educational content. Currently, he is working as a chatbot developer at the QuickGHY start-up.
I would like to thank my parents for always being there no matter what. I am also very grateful for the friends who stood strong when I needed them at different stages of my life. Lastly, I would like to thank all the readers of this book: you are going to learn a lot about Rasa and its functionality.
Pratik Kotian is a conversational AI engineer with 5 years of experience in building conversational AI agents and designing products related to conversational design. He works as a machine learning engineer (specializing in conversational AI) at Quantiphi, an AI company and recognized Google Partner. He has also worked with Packt as a reviewer of The TensorFlow Workshop.
I would like to thank my family and friends, who are always supportive and have always believed in me and my talents. It's because of them that I am doing well in my career and helping others to build great conversational bots.
The Rasa framework enables developers to create industrial-strength chatbots using state-of-the-art natural language processing (NLP) and machine learning technologies quickly, all in open source.
Conversational AI with Rasa starts by showing you how the two main components at the heart of Rasa work – Rasa NLU and Rasa Core. You’ll then learn how to build, configure, train, and serve different types of chatbots from scratch by using the Rasa ecosystem. As you advance, you’ll use form-based dialogue management, work with the response selector for chitchat and FAQ-like dialogues, make use of knowledge base actions to answer questions for dynamic queries, and more. Furthermore, you’ll understand how to customize the Rasa framework, use conversation-driven development patterns and tools to develop chatbots, explore what your bot can do, and easily fix any mistakes it makes by using interactive learning. Finally, you’ll get to grips with deploying the Rasa system to a production environment with high performance and high scalability and cover best practices for building an efficient and robust chat system.
By the end of this book, you’ll be able to build and deploy your own chatbots using Rasa, addressing the common pain points encountered in the chatbot life cycle.
This book is for NLP professionals and machine learning and deep learning practitioners who have knowledge of NLP and want to build chatbots with Rasa. Anyone with beginner-level knowledge of NLP and deep learning will be able to get the most out of the book.
Chapter 1, Introduction to Chatbots and the Rasa Framework, introduces all the fundamental knowledge pertaining to chatbots and the Rasa framework, including machine learning, NLP, chatbots, and Rasa basics.
Chapter 2, Natural Language Understanding in Rasa, covers Rasa NLU's architecture and configuration methods, and how to perform training and inference.
Chapter 3, Rasa Core, introduces how to implement dialogue management in Rasa.
Chapter 4, Handling Business Logic, explains how Rasa gives developers great flexibility in handling different business logic. This chapter introduces how we can use these features to handle complex business logic more elegantly and efficiently.
Chapter 5, Working with Response Selector to Handle Chitchat and FAQs, explains how to define questions and their corresponding answers and how to configure Rasa to automatically identify the query and give the corresponding answer.
Chapter 6, Knowledge Base Actions to Handle Question Answering, describes how to create a knowledge base that will be used to answer questions. You will also learn how to customize knowledge base actions, how referential resolution (mapping a mention to an object) works, and how to create your own knowledge base.
Chapter 7, Entity Roles and Groups for Complex Named Entity Recognition, explains how entity roles and entity groups solve the complex NER problem, and how to define training data, configure pipelines, and write stories for entity roles and entity groups.
Chapter 8, Working Principles and Customization of Rasa, introduces the working principles behind Rasa and how we can extend and customize Rasa.
Chapter 9, Testing and Production Deployment, explains how to test Rasa applications and how to deploy Rasa applications in production environments.
Chapter 10, Conversation-Driven Development and Interactive Learning, introduces conversation-driven development and Rasa X to develop chatbots more effectively. We will also introduce how to use interactive learning to quickly find and fix problems.
Chapter 11, Debugging, Optimization, and Community Ecosystem, explains how to debug and optimize Rasa applications. We will also introduce some tools to help developers build chatbots effectively.
You will need a version of Rasa 2.x installed on your computer—the latest version if possible. All code examples have been tested using Rasa 2.8.1 on Ubuntu 20.04 LTS. However, they should work with future version releases, too.
You should install Rasa with the following command: pip install rasa[transformers]. This command will install the transformers library, which provides the components we need in the code.
You will also need to install the pyowm Python package to run the code in Chapter 4, Handling Business Logic, as well as Docker and version 4.1 of the neo4j Python package to run the code for the custom knowledge base part of Chapter 6, Knowledge Base Actions to Handle Question Answering.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section).
The versions of Rasa change quickly, and the related knowledge base and documents are also rapidly updated. We recommend that you frequently read Rasa’s documentation to understand the changes.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Conversational-AI-with-RASA. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801077057_ColorImages.pdf.
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The following example demonstrates post-mortem debugging using the pdb command."
A block of code is set as follows:
version: "2.0"
language: en
pipeline:
- name: WhitespaceTokenizer
- name: LanguageModelFeaturizer
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
WebChat.default.init({ selector: "#webchat", initPayload: "Hello",
Any command-line input or output is written as follows:
python -m pdb -c continue <XXX>/rasa/__main__.py train
Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "Click on the Cancel button."
Tips or important notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you've read Conversational AI with Rasa, we'd love to hear your thoughts! Please visit https://packt.link/r/1801077053 to leave a review for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
In this section, you will learn about the core concepts of machine learning, natural language processing, dialogue systems, and Rasa. All these foundational concepts will prepare you for subsequent learning.
This section comprises the following chapters:
Chapter 1, Introduction to Chatbots and the Rasa Framework
Chapter 2, Natural Language Understanding in Rasa
Chapter 3, Rasa Core
In this first chapter, we will introduce chatbots and the Rasa framework. Knowledge of these is important because they will be used in later chapters. We will split that fundamental knowledge into four pieces, of which the first three are machine learning (ML), natural language processing (NLP), and chatbots. This is the theory and concept part of the fundamentals. With these in place, you will know in theory how to build a chatbot.
The last piece is Rasa basics. We will introduce the key technology of this book: the Rasa framework and its basic usage.
In particular, we will cover the following topics:
What is ML?
Introduction to NLP
Chatbot basics
Introduction to the Rasa framework
Rasa is a Python-based framework. To install it, you need a Python development environment; Python can be downloaded from https://python.org/downloads/. At the time of writing this chapter, Rasa only supports Python 3.6, 3.7, and 3.8, so please be careful to choose the correct Python version when you set up the development environment.
You can find all the code for this chapter in the ch01 directory of the GitHub repository, at https://github.com/PacktPublishing/Conversational-AI-with-RASA.
ML and artificial intelligence (AI) have almost become buzzwords in recent years. Everyone must have heard about AI in the news after AlphaGo from Google beat the best Go player in the world. There is no doubt that ML is now one of the most popular and advanced areas of research and applications. So, what exactly is ML?
Let's imagine that we are building an application to automatically recognize rock/paper/scissors based on video inputs from a camera. The hand gesture from the user will be recognized by the computer as one of rock/paper/scissors.
Let's look at the differences between ML and traditional programming in solving this problem.
In traditional programming, the working process usually goes like this:
Software development: Product managers and software engineers work together to understand business requirements and transform them into detailed business rules. Then, software engineers write the code to transform the business rules into computer programs. This stage is shown as process 1 in the following diagram.
Software usage: Computer software transforms users' input into output. This stage is shown as process 2 in the following diagram:
Figure 1.1 – Traditional programming working pattern
Let's go back to our rock/paper/scissors application. With a traditional programming methodology, it would be very difficult to recognize the position of the hands and the boundaries of the fingers, not to mention that the same gesture can appear in many different forms: different hand positions, different sizes and shapes of hands and fingers, different skin colors, and so on. To handle all these cases, the source code would become very cumbersome, the logic very complicated, and the solution almost impossible to maintain and update. In reality, probably no one could accomplish this task with a traditional programming methodology.
On the other hand, in ML, the working process usually follows this pattern:
Software development: The ML algorithm infers hidden business rules by learning from training data and encodes the business rules into models with lots of weight parameters. Process 1 in the following diagram shows the data flow.
Software usage: The model transforms users' input into output. In the following diagram, process 2 corresponds to this stage:
Figure 1.2 – Programming working pattern driven by ML
There are a few types of ML algorithms: supervised learning (SL), unsupervised learning (UL), and reinforcement learning (RL). In NLP, the most useful and most common algorithms belong to SL, so let's focus on this learning algorithm.
An SL algorithm builds a mathematical model of a set of data that contains both the inputs (x) and the expected outputs (y). The algorithm's input data is also known as training data, composed of a set of training examples. The SL algorithm learns a function or a mapping from inputs to outputs of training data. Such a function or mapping is called a model. A model can be used to predict outputs associated with new inputs.
The algorithm used for our rock/paper/scissors application is an SL algorithm. More specifically, it performs a classification task. Classification is a task that requires algorithms to learn how to assign (limited) class labels to examples. For example, classifying emails as "spam" or "non-spam" is a classification task; because it divides data into two categories, it is a binary classification task. The rock/paper/scissors application divides pictures into three categories, so it is a multi-class classification task. The opposite of a classification task is a regression task, which predicts a continuous quantity output for each example; for example, predicting future house prices in a certain area is a regression task.
Our application's training data contains the data (the image) and a label (one of rock/paper/scissors), which are the input and output (I/O) of the SL algorithm. The data consists of many pictures. As the example in the following screenshot shows, each picture is simply a big matrix of pixel values for the algorithm to consume, and the label of the picture is rock or paper or scissors for the hand gesture in the picture:
Figure 1.3 – Data and label
Now that we understand what an SL algorithm is, in the next section, we will cover the general process of ML.
There are three basic stages of applying ML algorithms: training, inference, and evaluation. Let's look at these stages in more detail here:
Training stage: The training stage is when the algorithms learn knowledge or business rules from training data. As shown in process 1 in Figure 1.2, the input of the training stage is training data, and the output of the training stage is the model.
Inference stage: The inference stage is when we use a model to compute the output label of new input data. The input of this stage is new input data without labels, and the output is the most likely label.
Evaluation stage: In a serious application, we always want to know how good a model is before we use it in production. This stage is called evaluation. The evaluation stage measures the model's performance in various ways and can help users to compare models.
In the next section, we will introduce how to measure model performance.
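The three stages can be sketched in plain Python with a toy nearest-centroid classifier. This is not part of Rasa; the 2D feature vectors and gesture labels below are invented stand-ins for real image features, purely for illustration:

```python
# Toy illustration of the three ML stages: training, inference, evaluation.

def train(training_data):
    """Training stage: learn a 'model' (one centroid per label) from labeled examples."""
    sums, counts = {}, {}
    for features, label in training_data:
        counts[label] = counts.get(label, 0) + 1
        prev = sums.get(label, [0.0] * len(features))
        sums[label] = [p + f for p, f in zip(prev, features)]
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(model, features):
    """Inference stage: output the most likely label for new, unlabeled input."""
    def distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: distance(model[label]))

def evaluate(model, test_data):
    """Evaluation stage: measure accuracy on held-out labeled data."""
    correct = sum(predict(model, x) == y for x, y in test_data)
    return correct / len(test_data)

# Invented 2D feature vectors for two gesture classes.
training_data = [([0.1, 0.2], "rock"), ([0.2, 0.1], "rock"),
                 ([0.9, 0.8], "paper"), ([0.8, 0.9], "paper")]
model = train(training_data)
print(predict(model, [0.15, 0.15]))              # rock
print(evaluate(model, [([0.9, 0.9], "paper")]))  # 1.0
```

Real systems replace the centroid model with a neural network and the toy vectors with pixel matrices, but the three-stage structure stays the same.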
In NLP, most problems can be viewed as classification problems. A key concept in classification performance is a confusion matrix, on which almost all other performance metrics are based.
A confusion matrix is a table of the model predictions versus the ground-truth labels.
Let me give you a specific example. Assume we are building a binary classifier to decide whether an image is a cat image or not. When the image is a cat image, we call it a positive. Remember, we are building an application to detect cats, so a cat image is a positive result for our system; if it is not a cat image (in our case, a dog image), we call it a negative. Our test data has 10 images. The real labels of the test data are listed as follows, where a cat image represents a cat and a dog image represents a dog:
Figure 1.4 – The real label of test data
The prediction result of our model is shown here:
Figure 1.5 – The prediction result of our model on test data
The confusion matrix of our case would look like this:
Figure 1.6 – The confusion matrix of our case
In this confusion matrix, there are five cat images, and the model predicts that one of them is a dog. This is an error, and we call it a false negative (FN) because the model says it is a negative result, but that is actually incorrect. And in the five dog images, the model predicts that two of these are cats. This is another error, and we call it a false positive (FP) because the model says it is a positive result but it's actually incorrect. All correct predictions belong to one of two cases: cats-to-cats prediction, which we call a true positive (TP), and dogs-to-dogs prediction, which we call a true negative (TN).
So, the preceding confusion matrix can be viewed as an instance of the following abstract confusion matrix:
Figure 1.7 – The confusion matrix in abstract terms
Many important performance metrics are derived from a confusion matrix. Here, we will introduce some of the most important ones, as follows:
Accuracy (ACC): (TP + TN) / (TP + TN + FP + FN)
Recall: TP / (TP + FN)
Precision: TP / (TP + FP)
F1 score: 2 × Precision × Recall / (Precision + Recall)
Among the preceding metrics, the F1 score combines the advantages of recall and precision, so it is the most commonly used metric.
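Using the counts from the cat/dog example above (5 cat images with 1 predicted as a dog, and 5 dog images with 2 predicted as cats, giving TP = 4, FN = 1, FP = 2, TN = 3), the metrics can be computed in a few lines of Python:

```python
# Confusion-matrix counts from the cat/dog example above.
tp, fn, fp, tn = 4, 1, 2, 3

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 7/10 = 0.70
recall = tp / (tp + fn)                             # 4/5  = 0.80
precision = tp / (tp + fp)                          # 4/6  ≈ 0.67
f1 = 2 * precision * recall / (precision + recall)  # ≈ 0.73

print(f"accuracy={accuracy:.2f} recall={recall:.2f} "
      f"precision={precision:.2f} f1={f1:.2f}")
```

Note how accuracy alone (0.70) hides the fact that precision and recall differ; the F1 score balances the two.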
In the next section, we will talk about the root cause of poor performance (the performance metrics being low): overfitting and underfitting.
Generally speaking, there are two types of errors found in ML models: overfitting and underfitting.
When a model performs poorly on the training data, we call it underfitting. Common reasons that can lead to underfitting include the following:
The algorithm is too simple. It does not have enough power to capture the complexity of the training data. For algorithms based on neural networks, this means there are too few hidden layers.
The network architecture or features used for training are not suitable for the task. For example, models based on bag-of-words (BoW) are not suitable for complex NLP tasks in which the order of words is critical, because a BoW model completely discards this information.
Training a model for too few epochs (a full training pass over the entire training data so that each example has been seen once) or at too low a learning rate (a scalar used to train a model via gradient descent, which determines the degree of weight changes).
Using too high a regularization rate (a scalar used to indicate the degree of penalty on a model's complexity; the penalty can reduce the power of fitting) to train a model.
When a model performs very well on the training data but poorly on new data that it has never seen before, we call this overfitting. Overfitting means the algorithm can fit the training data well but does not generalize well to samples that are not in the training data. Generalization is the most important feature of ML. It means that algorithms learn key concepts from the training data rather than simply memorizing them. When overfitting happens, the model has most likely memorized what it saw in training instead of learning from it, so it performs very well on the training data but poorly on new data. ML scientists have developed various methods to combat overfitting, such as adding more training data, regularization, dropout, and early stopping.
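Of these remedies, early stopping is the easiest to sketch: monitor performance on held-out data after each epoch and stop once it stops improving. The loss values below are invented for illustration; they follow the typical pattern where validation loss first falls and then rises as the model starts to overfit:

```python
# Early stopping sketch: stop training when validation loss stops improving.
def train_with_early_stopping(losses, patience=3):
    """losses: validation loss measured after each epoch (invented numbers below).
    Returns the best loss seen and the epoch at which training stopped (or None
    if the patience budget was never exhausted)."""
    best, bad_epochs, stopped_at = float("inf"), 0, None
    for epoch, loss in enumerate(losses):
        if loss < best:
            best, bad_epochs = loss, 0  # improvement: reset the patience counter
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # no improvement for `patience` epochs
                stopped_at = epoch
                break
    return best, stopped_at

# Validation loss improves, then worsens as the model begins to overfit.
val_losses = [0.9, 0.7, 0.5, 0.4, 0.45, 0.5, 0.6, 0.7]
best, stopped_at = train_with_early_stopping(val_losses)
print(best, stopped_at)  # 0.4 6
```

In a real training loop, each iteration would run one epoch of gradient descent and restore the weights from the best epoch; here only the stopping logic is shown.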
In the next section, we will introduce transfer learning (TL), which is very useful when the training data is insufficient (a common situation).
TL is a method by which a model can reuse knowledge from another model that was trained for a different task.
TL is popular in the chatbot domain. There are many reasons for this, and some of them are listed here:
TL needs less training data: In the chatbot domain, there usually is not much training data. A model trained with a traditional ML method usually does not perform well due to the lack of training data. With TL, we can achieve much better performance on the same amount of training data. The less data you have, the greater the performance increase you can get.
TL makes training faster: TL only needs a few training epochs to fine-tune a model for a new task. Generally, it is much faster than the traditional ML method and makes the whole development process more efficient.
Now that we understand what ML is, in the next section, we will cover the basics of NLP.
NLP is a subfield of linguistics and ML, concerned with interactions between computers and humans via text or speech.
Let's start with a brief history of NLP.
Before 2013, there was no unified method for NLP. This was because two problems had not been solved well.
The first problem relates to how we represent textual information during the computing process.
Time-series data such as voices can be represented as signals and waves. Image information is given by pixel positions and pixel values. However, there were no intuitive ways to digitize text. There were some preliminary methods, such as one-hot encoding to represent each word or phrase and BoW to represent sentences and paragraphs, but it was quite obvious that these were not the perfect way to deal with text.
After one-hot encoding, the dimension of each vector is the size of the entire vocabulary, with all values 0 except a single 1 marking the position of that word. Such sparse vectors waste a lot of space and, at the same time, give no indication of the semantic meaning of the word itself: any two different words are always orthogonal to each other.
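A quick sketch makes both problems visible. The tiny four-word vocabulary here is made up for illustration; a real vocabulary would have tens of thousands of entries, making each vector correspondingly huge:

```python
# One-hot encoding over a tiny, made-up vocabulary.
vocabulary = ["cat", "dog", "king", "queen"]

def one_hot(word):
    """Return a vector the size of the vocabulary: all zeros except a single 1."""
    return [1 if w == word else 0 for w in vocabulary]

def dot(u, v):
    """Dot product: a simple similarity measure between two vectors."""
    return sum(a * b for a, b in zip(u, v))

print(one_hot("cat"))                          # [1, 0, 0, 0]
# Any two different words are orthogonal, so no semantic similarity is captured:
print(dot(one_hot("king"), one_hot("queen")))  # 0, despite related meanings
print(dot(one_hot("cat"), one_hot("cat")))     # 1
```

Dense word embeddings (introduced later in the history) fix exactly this: related words get vectors with a high dot product.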
A BoW model simply counts the frequency of each word that appears in the text and ignores the dependency and order of the words in the context.
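A BoW representation can be sketched in a few lines; the example sentences are made up, and they show how two sentences with very different meanings collapse to the same representation once word order is discarded:

```python
from collections import Counter

def bag_of_words(sentence):
    """Count word frequencies; word order and dependencies are discarded."""
    return Counter(sentence.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")
print(a)       # Counter({'the': 2, 'dog': 1, 'bit': 1, 'man': 1})
print(a == b)  # True: order information is completely lost
```

This is why, as noted above, BoW-based models struggle on NLP tasks where word order carries the meaning.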
The second problem relates to how we can build models for text.
