Natural Language Understanding with Python

Deborah A. Dahl

Description

Build advanced natural language understanding systems by acquiring data and selecting the appropriate technology.


Key Features


Master NLU concepts from basic text processing to advanced deep learning techniques


Explore practical NLU applications like chatbots, sentiment analysis, and language translation


Gain a deeper understanding of large language models like ChatGPT


Book Description


Natural language understanding (NLU) organizes and structures language, allowing computer systems to effectively process textual information for many different practical applications. Natural Language Understanding with Python will help you explore practical techniques that make use of NLU to build a wide variety of creative and useful applications.


Complete with step-by-step explanations of essential concepts and practical examples, this book begins by teaching you about NLU and its applications. You’ll then explore a wide range of current NLU techniques and their most appropriate use cases. In the process, you’ll be introduced to the most useful Python NLU libraries. Not only will you learn the basics of NLU, but you’ll also be introduced to practical issues such as acquiring data, evaluating systems, and deploying NLU applications, along with their solutions. This book is a comprehensive guide that will help you explore the full spectrum of essential NLU techniques and resources.


By the end of this book, you will be familiar with the foundational concepts of NLU, deep learning, and large language models (LLMs). You will be well on your way to having the skills to independently apply NLU technology in your own academic and practical applications.


What you will learn


The most important skill that readers will acquire is not just how to apply natural language techniques, but why to select particular techniques for a given problem.


The book will also cover important practical considerations concerning acquiring real data and evaluating real system performance, not just performing textbook evaluations with pre-existing corpora.


After reading this book and studying the code, readers will be equipped to build state-of-the-art as well as practical natural language applications to solve real problems.


How to develop and fine-tune an NLP application


How to maintain NLP applications after deployment


Who this book is for


This book is for Python developers, computational linguists, linguists, data scientists, NLP developers, conversational AI developers, and students looking to learn about natural language understanding (NLU) and how to apply natural language processing (NLP) technology to real problems. Anyone interested in addressing natural language problems will find this book useful. A working knowledge of Python is a must.




Natural Language Understanding with Python

Combine natural language technology, deep learning, and large language models to create human-like language comprehension in computer systems

Deborah A. Dahl

BIRMINGHAM—MUMBAI

Natural Language Understanding with Python

Copyright © 2023 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Ali Abidi

Senior Editor: Tazeen Shaikh

Technical Editor: Rahul Limbachiya

Copy Editor: Safis Editing

Project Coordinator: Farheen Fathima

Proofreader: Safis Editing

Indexer: Rekha Nair

Production Designer: Joshua Misquitta

Marketing Coordinators: Shifa Ansari and Vinishka Kalra

First published: June 2023

Production reference: 1230623

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80461-342-9

www.packtpub.com

This book is dedicated to my grandchildren, Freddie and Matilda, two tireless explorers who never stop surprising me and whose endless curiosity is a continual inspiration.

Contributors

About the author

Deborah A. Dahl is the principal at Conversational Technologies, with over 30 years of experience in natural language understanding technology. She has developed numerous natural language processing systems for research, commercial, and government applications, including a system for NASA, and speech and natural language components on Android. She has taught over 20 workshops on natural language processing, consulted on many natural language processing applications for her customers, and written over 75 technical papers. This is Deborah’s fourth book on natural language understanding topics. Deborah has a PhD in linguistics from the University of Minnesota and completed postdoctoral studies in cognitive science at the University of Pennsylvania.

About the reviewers

Krishnan Raghavan is an IT professional with over 20 years of experience in the area of software development and delivery excellence across multiple domains and technologies, ranging from C++ to Java, Python, data warehousing, and big data tools and technologies.

When not working, Krishnan likes to spend time with his wife and daughter, as well as reading fiction, nonfiction, and technical books. Krishnan tries to give back to the community by being part of the GDG Pune volunteer group, helping the team organize events. Currently, he is unsuccessfully trying to learn how to play the guitar.

You can connect with Krishnan at [email protected] or via LinkedIn at www.linkedin.com/in/krishnan-raghavan.

I would like to thank my wife, Anita, and daughter, Ananya, for giving me the time and space to review this book.

Mannai Mortadha is a dedicated and ambitious individual known for his strong work ethic and passion for continuous learning. Born and raised in an intellectually stimulating environment, Mannai developed a keen interest in exploring the world through knowledge and innovation. Mannai’s educational journey began with a focus on computer science and technology. He pursued a degree in computer engineering, where he gained a solid foundation in programming, artificial intelligence algorithms, and system design. With a natural aptitude for problem-solving and an innate curiosity for cutting-edge technologies, Mannai excelled in his studies and consistently sought opportunities to expand his skills. Throughout his academic years, Mannai actively engaged in various extracurricular activities, including participating in hackathons and coding competitions. These experiences, with organizations including Google, Netflix, Microsoft, and Talan Tunisie, not only honed his technical abilities but also fostered his collaboration and teamwork skills.

Acknowledgment

I have worked as an independent consultant for a large part of my career. One of the benefits of working independently has been that I’ve been able to work with colleagues in many different organizations, which has exposed me to a rich set of technical perspectives that I would not have gotten from working at a single company or even a few companies.

I would like to express my gratitude to the colleagues, too many to mention individually, that I have worked with throughout my career, from my student days forward. These include colleagues at the University of Illinois, the University of Minnesota, the University of Pennsylvania, Unisys Corporation, MossRehab, Psycholinguistic Technologies, Autism Language Therapies, Openstream, the World Wide Web Consortium, the World Wide Web Foundation, the Applied Voice Input-Output Society, Information Today, New Interactions, the University Space Research Association, NASA Ames Research Center, and the Open Voice Network. I’ve learned so much from you.

Table of Contents

Preface

Part 1: Getting Started with Natural Language Understanding Technology

1

Natural Language Understanding, Related Technologies, and Natural Language Applications

Understanding the basics of natural language

Global considerations – languages, encodings, and translations

The relationship between conversational AI and NLP

Exploring interactive applications – chatbots and voice assistants

Generic voice assistants

Enterprise assistants

Translation

Education

Exploring non-interactive applications

Classification

Sentiment analysis

Spam and phishing detection

Fake news detection

Document retrieval

Analytics

Information extraction

Translation

Summarization, authorship, correcting grammar, and other applications

A summary of the types of applications

A look ahead – Python for NLP

Summary

2

Identifying Practical Natural Language Understanding Problems

Identifying problems that are the appropriate level of difficulty for the technology

Looking at difficult applications of NLU

Looking at applications that don’t need NLP

Training data

Application data

Taking development costs into account

Taking maintenance costs into account

A flowchart for deciding on NLU applications

Summary

Part 2: Developing and Testing Natural Language Understanding Systems

3

Approaches to Natural Language Understanding – Rule-Based Systems, Machine Learning, and Deep Learning

Rule-based approaches

Words and lexicons

Part-of-speech tagging

Grammar

Parsing

Semantic analysis

Pragmatic analysis

Pipelines

Traditional machine learning approaches

Representing documents

Classification

Deep learning approaches

Pre-trained models

Considerations for selecting technologies

Summary

4

Selecting Libraries and Tools for Natural Language Understanding

Technical requirements

Installing Python

Developing software – JupyterLab and GitHub

JupyterLab

GitHub

Exploring the libraries

Using NLTK

Using spaCy

Using Keras

Learning about other NLP libraries

Choosing among NLP libraries

Learning about other packages useful for NLP

Looking at an example

Setting up JupyterLab

Processing one sentence

Looking at corpus properties

Summary

5

Natural Language Data – Finding and Preparing Data

Finding sources of data and annotating it

Finding data for your own application

Finding data for a research project

Metadata

Generally available corpora

Ensuring privacy and observing ethical considerations

Ensuring the privacy of training data

Ensuring the privacy of runtime data

Treating human subjects ethically

Treating crowdworkers ethically

Preprocessing data

Removing non-text

Regularizing text

Spelling correction

Application-specific types of preprocessing

Substituting class labels for words and numbers

Redaction

Domain-specific stopwords

Remove HTML markup

Data imbalance

Using text preprocessing pipelines

Choosing among preprocessing techniques

Summary

6

Exploring and Visualizing Data

Why visualize?

Text document dataset – Sentence Polarity Dataset

Data exploration

Frequency distributions

Measuring the similarities among documents

General considerations for developing visualizations

Using information from visualization to make decisions about processing

Summary

7

Selecting Approaches and Representing Data

Selecting NLP approaches

Fitting the approach to the task

Starting with the data

Considering computational efficiency

Initial studies

Representing language for NLP applications

Symbolic representations

Representing language numerically with vectors

Understanding vectors for document representation

Representing words with context-independent vectors

Word2Vec

Representing words with context-dependent vectors

Summary

8

Rule-Based Techniques

Rule-based techniques

Why use rules?

Exploring regular expressions

Recognizing, parsing, and replacing strings with regular expressions

General tips for using regular expressions

Word-level analysis

Lemmatization

Ontologies

Sentence-level analysis

Syntactic analysis

Semantic analysis and slot filling

Summary

9

Machine Learning Part 1 – Statistical Machine Learning

A quick overview of evaluation

Representing documents with TF-IDF and classifying with Naïve Bayes

Summary of TF-IDF

Classifying texts with Naïve Bayes

TF-IDF/Bayes classification example

Classifying documents with Support Vector Machines (SVMs)

Slot-filling with CRFs

Representing slot-tagged data

Summary

10

Machine Learning Part 2 – Neural Networks and Deep Learning Techniques

Basics of NNs

Example – MLP for classification

Hyperparameters and tuning

Moving beyond MLPs – RNNs

Looking at another approach – CNNs

Summary

11

Machine Learning Part 3 – Transformers and Large Language Models

Technical requirements

Overview of transformers and LLMs

Introducing attention

Applying attention in transformers

Leveraging existing data – LLMs or pre-trained models

BERT and its variants

Using BERT – a classification example

Installing the data

Splitting the data into training, validation, and testing sets

Loading the BERT model

Defining the model for fine-tuning

Defining the loss function and metrics

Defining the optimizer and the number of epochs

Compiling the model

Training the model

Plotting the training process

Evaluating the model on the test data

Saving the model for inference

Cloud-based LLMs

ChatGPT

Applying GPT-3

Summary

12

Applying Unsupervised Learning Approaches

What is unsupervised learning?

Topic modeling using clustering techniques and label derivation

Grouping semantically similar documents

Applying BERTopic to 20 newsgroups

After clustering and topic labeling

Making the most of data with weak supervision

Summary

13

How Well Does It Work? – Evaluation

Why evaluate an NLU system?

Evaluation paradigms

Comparing system results on standard metrics

Evaluating language output

Leaving out part of a system – ablation

Shared tasks

Data partitioning

Evaluation metrics

Accuracy and error rate

Precision, recall, and F1

The receiver operating characteristic and area under the curve

Confusion matrix

User testing

Statistical significance of differences

Comparing three text classification methods

A small transformer system

TF-IDF evaluation

A larger BERT model

Summary

Part 3: Systems in Action – Applying Natural Language Understanding at Scale

14

What to Do If the System Isn’t Working

Technical requirements

Figuring out that a system isn’t working

Initial development

Fixing accuracy problems

Changing data

Restructuring an application

Moving on to deployment

Problems after deployment

Summary

15

Summary and Looking to the Future

Overview of the book

Potential for improvement – better accuracy and faster training

Better accuracy

Faster training

Other areas for improvement

Applications that are beyond the current state of the art

Processing very long documents

Understanding and creating videos

Interpreting and generating sign languages

Writing compelling fiction

Future directions in NLU technology and research

Quickly extending NLU technologies to new languages

Real-time speech-to-speech translation

Multimodal interaction

Detecting and correcting bias

Summary

Further reading

Index

Other Books You May Enjoy

Part 1: Getting Started with Natural Language Understanding Technology

In Part 1, you will learn about natural language understanding and its applications. You will also learn how to decide whether natural language understanding is applicable to a particular problem. In addition, you will learn about the relative costs and benefits of different NLU techniques.

This part comprises the following chapters:

Chapter 1, Natural Language Understanding, Related Technologies, and Natural Language Applications

Chapter 2, Identifying Practical Natural Language Understanding Problems

1

Natural Language Understanding, Related Technologies, and Natural Language Applications

Natural language, in the form of both speech and writing, is how we communicate with other people. The ability to communicate with others using natural language is an important part of what makes us full members of our communities. The first words of young children are universally celebrated. Understanding natural language usually appears effortless, unless something goes wrong. When we have difficulty using language, either because of illness, injury, or just by being in a foreign country, it brings home how important language is in our lives.

In this chapter, we will describe natural language and the kinds of useful results that can be obtained from processing it. We will also situate natural language processing (NLP) within the ecosystem of related conversational AI technologies. We will discuss where natural language occurs (documents, speech, free text fields of databases, etc.), talk about specific natural languages (English, Chinese, Spanish, etc.), and describe the technology of NLP, introducing Python for NLP.

The following topics will be covered in this chapter:

Understanding the basics of natural language

Global considerations

The relationship between conversational AI and NLP

Exploring interactive applications

Exploring non-interactive applications

A look ahead – Python for NLP

Learning these topics will give you a general understanding of the field of NLP. You will learn what it can be used for, how it is related to other conversational AI topics, and the kinds of problems it can address. You will also learn about the many potential benefits of NLP applications for both end users and organizations.

After reading this chapter, you will be prepared to identify areas of NLP technology that are applicable to problems that you’re interested in. Whether you are an entrepreneur, a developer for an organization, a student, or a researcher, you will be able to apply NLP to your specific needs.

Understanding the basics of natural language

We don’t yet have any technologies that can extract the rich meaning that humans experience when they understand natural language; however, given specific goals and applications, we will find that the current state of the art can help us achieve many practical, useful, and socially beneficial results through NLP.

Both spoken and written languages are ubiquitous and abundant. Spoken language is found in ordinary conversations between people and intelligent systems, as well as in media such as broadcasts, films, and podcasts. Written language is found on the web, in books, and in communications between people such as emails. Written language is also found in the free text fields of forms and databases that may be available online but are not indexed by search engines (the invisible web).

All of these forms of language, when analyzed, can form the basis of countless types of applications. This book will lay the basis for the fundamental analysis techniques that will enable you to make use of natural language in many different applications.

Global considerations – languages, encodings, and translations

There are thousands of natural languages, both spoken and written, in the world, although the majority of people in the world speak one of the top 10 languages, according to Babbel.com (https://www.babbel.com/en/magazine/the-10-most-spoken-languages-in-the-world). In this book, we will focus on major world languages, but it is important to be aware that different languages can raise different challenges for NLP applications. For example, the written form of Chinese does not include spaces between words, which most NLP tools use to identify words in a text. This means that to process Chinese language, additional steps beyond recognizing whitespace are necessary to separate Chinese words. This can be seen in the following example, translated by Google Translate, where there are no spaces between the Chinese words:

Figure 1.1 – Written Chinese does not separate words with spaces, unlike most Western languages
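For illustration, here is a minimal sketch of segmenting Chinese text in Python. It assumes the third-party jieba library, a widely used segmenter that is not part of this book's core toolset (install it with pip install jieba); the exact segmentation shown in the comment may vary with the library version:

import jieba

# "今天天气很好" ("The weather is nice today") has no spaces between its words
words = list(jieba.cut("今天天气很好"))
print(words)  # typically a word list such as ['今天', '天气', '很', '好']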

Another consideration to keep in mind is that some languages have many different forms of the same word, with different endings that provide information about its specific properties, such as the role the word plays in a sentence. If you primarily speak English, you might be used to words with very few endings. This makes it relatively easy for applications to detect multiple occurrences of the same word. However, this does not apply to all languages.

For example, in English, the word walked can be used in different contexts with the same form but different meanings, such as I walked, they walked, or she has walked, while in Spanish, the same verb (caminar) would have different forms, such as Yo caminé, ellos caminaron, or ella ha caminado. The consequence of this for NLP is that additional preprocessing steps might be required to successfully analyze text in these languages. We will discuss how to add these preprocessing steps for languages that require them in Chapter 5.
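As a small preview of such steps, the following sketch uses the spaCy library (introduced in Chapter 4) to reduce different forms of a verb to a shared base form, or lemma. It assumes the small English and Spanish models have already been downloaded with python -m spacy download en_core_web_sm and python -m spacy download es_core_news_sm:

import spacy

nlp_en = spacy.load("en_core_web_sm")
nlp_es = spacy.load("es_core_news_sm")

# English: "walked" keeps one form; its lemma is "walk"
for token in nlp_en("she has walked"):
    print(token.text, "->", token.lemma_)

# Spanish: "caminado" is one of many forms whose lemma is "caminar"
for token in nlp_es("ella ha caminado"):
    print(token.text, "->", token.lemma_)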

Another thing to keep in mind is that the availability and quality of processing tools can vary greatly across languages. There are generally reasonably good tools available for major world languages such as Western European and East Asian languages. However, languages with fewer than 10 million speakers or so may not have any tools, or the available tools might not be very good. This is due to factors such as the availability of training data as well as reduced commercial interest in processing these languages.

Languages with relatively few development resources are referred to as low-resourced languages. For these languages, there are not enough examples of the written language available to train large machine learning models in standard ways. There may also be very few speakers who can provide insights into how the language works. Perhaps the languages are endangered, or they are simply spoken by a small population. Techniques to develop natural language technology for these languages are actively being researched, although it may not be possible or may be prohibitively expensive to develop natural language technology for some of these languages.

Finally, many widely spoken languages do not use Roman characters, such as Chinese, Russian, Arabic, Thai, Greek, and Hindi, among many others. In dealing with languages that use non-Roman alphabets, it’s important to recognize that tools have to be able to accept different character encodings. Character encodings are used to represent the characters in different writing systems. In many cases, the functions in text processing libraries have parameters that allow developers to specify the appropriate encoding for the texts they intend to process. In selecting tools for use with languages that use non-Roman alphabets, the ability to handle the required encodings must be taken into account.
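In Python, for example, the built-in open() function accepts an encoding parameter. A minimal sketch, with placeholder filenames:

# UTF-8 covers most modern text in any writing system
with open("chinese_article.txt", encoding="utf-8") as f:
    text = f.read()

# legacy files may use language-specific encodings, such as KOI8-R for Russian
with open("legacy_russian.txt", encoding="koi8_r") as f:
    legacy_text = f.read()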

The relationship between conversational AI and NLP

Conversational artificial intelligence is the broad label for an ecosystem of cooperating technologies that enable systems to conduct spoken and text-based conversations with people. These technologies include speech recognition, NLP, dialog management, natural language generation, and text-to-speech generation. It is important to distinguish these technologies, since they are frequently confused. While this book will focus on NLP, we will briefly define the other related technologies so that we can see how they all fit together:

Speech recognition: This is also referred to as speech-to-text or automatic speech recognition (ASR). Speech recognition is the technology that starts with spoken audio and converts it to text.

NLP: This starts with written language and produces a structured representation that can be processed by a computer. The input written language can either be the result of speech recognition or text that was originally produced in written form. The structured format can be said to express a user’s intent or purpose.

Dialog management: This starts with the structured output of NLP and determines how a system should react. System reactions can include such actions as providing information, playing media, or getting more information from a user in order to address their intent.

Natural language generation: This is the process of creating textual information that expresses the dialog manager’s feedback to a user in response to their utterance.

Text-to-speech: Based on the textual input created by the natural language generation process, the text-to-speech component generates spoken audio output.

The relationships among these components are shown in the following diagram of a complete spoken dialog system. This book focuses on the NLP component. However, because many natural language applications use other components, such as speech recognition, text-to-speech, natural language generation, and dialog management, we will occasionally refer to them:

Figure 1.2 – A complete spoken dialog system
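To make the flow of data through these components concrete, here is an illustrative Python sketch in which each component is reduced to a stub. The function names and return values are invented for this example and do not come from any real toolkit:

def speech_to_text(audio):
    # ASR: spoken audio in, text out (stubbed)
    return "what is the weather in new york"

def nlp(text):
    # NLP: text in, structured representation of the user's intent out
    return {"intent": "get_weather_forecast",
            "entities": {"location": "new york"}}

def dialog_manager(interpretation):
    # decide how the system should react to the recognized intent
    location = interpretation["entities"]["location"]
    return {"action": "report_weather", "location": location}

def generate_text(action):
    # natural language generation: structured action in, text out
    return f"Here is the forecast for {action['location']}."

def text_to_speech(text):
    # TTS: text in, spoken audio out (stubbed)
    return b"<audio bytes>"

reply = text_to_speech(generate_text(dialog_manager(nlp(speech_to_text(b"...")))))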

In the next two sections, we’ll summarize some important natural language applications. This will give you a taste of the potential of the technologies that will be covered in this book, and it will hopefully get you excited about the results that you can achieve with widely available tools.

Exploring interactive applications – chatbots and voice assistants

We can broadly categorize NLP applications into two categories, namely interactive applications, where the fundamental unit of analysis is most typically a conversation, and non-interactive applications, where the unit of analysis is a document or set of documents.

Interactive applications include those where a user and a system are talking or texting to each other in real time. Familiar interactive applications include chatbots and voice assistants, such as smart speakers and customer service applications. Because of their interactive nature, these applications require very fast, almost immediate, responses from a system because the user is present and waiting for a response. Users will typically not tolerate more than a couple of seconds’ delay, since this is what they’re used to when talking with other people. Another characteristic of these applications is that the user inputs are normally quite short, only a few words or a few seconds long in the case of spoken interaction. This means that analysis techniques that depend on having a large amount of text available will not work well for these applications.

An implementation of an interactive application will most likely need one or more of the other components from the preceding system diagram, in addition to NLP itself. Clearly, applications with spoken input will need speech recognition, and applications that respond to users with speech or text will require natural language generation and text-to-speech (if the system’s responses are spoken). Any application that does more than answer single questions will need some form of dialog management as well so that it can keep track of what the user has said in previous utterances, taking that information into account when interpreting later utterances.

Intent recognition is an important aspect of interactive natural language applications, which we will be discussing in detail in Chapter 9 and Chapter 14. An intent is essentially a user’s goal or purpose in making an utterance. Clearly, knowing what the user intended is central to providing the user with correct information. In addition to the intent, interactive applications normally also need to identify entities in user inputs, where entities are pieces of additional information that the system needs in order to address the user’s intent. For example, if a user says, “I want to book a flight from Boston to Philadelphia,” the intent would be make a flight reservation, and the relevant entities are the departure and destination cities. Since the travel dates are also required in order to book a flight, these are also entities. Because the user didn’t mention the travel dates in this utterance, the system should then ask the user about the dates, in a process called slot filling, which will be discussed in Chapter 8. The relationships between entities, intents, and utterances can be seen graphically in Figure 1.3:

Figure 1.3 – The intent and entities for a travel planning utterance

Note that the intent applies to the overall meaning of the utterance, but the entities represent the meanings of only specific pieces of the utterance. This distinction is important because it affects the choice of machine learning techniques used to process these kinds of utterances. Chapter 9 will go into this topic in more detail.
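A common way to represent the result of analyzing such an utterance is a simple structured object. The field names in this sketch are illustrative, not any standard format:

interpretation = {
    "utterance": "I want to book a flight from Boston to Philadelphia",
    "intent": "make_flight_reservation",   # the overall meaning of the utterance
    "entities": {                          # meanings of specific pieces
        "departure_city": "Boston",
        "destination_city": "Philadelphia",
        "departure_date": None,  # not mentioned; obtained by slot filling (Chapter 8)
    },
}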

Generic voice assistants

The generic voice assistants that are accessed through smart speakers or mobile phones, such as Amazon Alexa, Apple Siri, and Google Assistant, are familiar to most people. Generic assistants are able to provide users with general information, including sports scores, news, weather, and information about prominent public figures. They can also play music and control the home environment. Corresponding to these functions, the kinds of intents that generic assistants recognize are intents such as get weather forecast for <location>, where <location> represents an entity that helps fill out the get weather forecast intent. Similarly, “What was the score for <team name> game?” has the intent get game score, with the particular team’s name as the entity. These applications have broad but generally shallow knowledge. For the most part, their interactions with users are just based on one or, at most, a couple of related inputs – that is, for the most part, they aren’t capable of carrying on an extended conversation.

Generic voice assistants are mainly closed and proprietary. This means that there is very little scope for developers to add general capabilities to the assistant, such as adding a new language. However, in addition to the aforementioned proprietary assistants, an open source assistant called Mycroft is also available, which allows developers to add capabilities to the underlying system, not just use the tools that the platforms provide.

Enterprise assistants

In contrast to the generic voice assistants, some interactive applications have deep information about a specific company or other organization. These are enterprise assistants. They’re designed to perform tasks specific to a company, such as customer service, or to provide information about a government or educational organization. They can do things such as check the status of an order, give bank customers account information, or let utility customers find out about outages. They are often connected to extensive databases of customer or product information; consequently, based on this information, they can provide deep but mainly narrow information about their areas of expertise. For example, they can tell you whether a particular company’s products are in stock, but they don’t know the outcome of your favorite sports team’s latest game, which generic assistants are very good at.

Enterprise voice assistants are typically developed with toolkits such as the Alexa Skills Kit, Microsoft LUIS, Google Dialogflow, or Nuance Mix, although there are also open source toolkits such as RASA (https://rasa.com/). These toolkits are very powerful and easy to use. They only require developers to give the toolkits examples of the intents and entities that the application will need to find in users’ utterances in order to understand what users want to do.

Similarly, text-based chatbots can perform the same kinds of tasks that voice assistants perform, but they get their information from users in the form of text rather than voice. Chatbots are becoming increasingly common on websites. They can supply much of the information available on the website, but because the user can simply state what they’re interested in, they save the user from having to search through a possibly very complex website. The same toolkits that are used for voice assistants can also be used in many cases to develop text-based chatbots.

In this book, we will not spend too much time on the commercial toolkits because there is very little coding needed to create usable applications. Instead, we’ll focus on the technologies that underlie the commercial toolkits, which will enable developers to implement applications without relying on commercial systems.

Translation

The third major category of an interactive application is translation. Unlike the assistants described in the previous sections, translation applications are used to assist users to communicate with other people – that is, the user isn’t having a conversation with the assistant but with another person. In effect, the applications perform the role of an interpreter. The application translates between two different human languages in order to enable two people who don’t speak a common language to talk with each other. These applications can be based on either spoken or typed input. Although spoken input is faster and more natural, if speech recognition errors (which are common) occur, this can significantly interfere with the smoothness of communication between people.

Interactive translation applications are most practical when the conversation is about simple topics such as tourist information. More complex topics – for example, business negotiations – are less likely to be successful because their complexity leads to more speech recognition and translation errors.

Education

Finally, education is an important application of interactive NLP. Language learning is probably the most natural educational application. For example, there are applications that help students converse in a new language that they’re learning. These applications have advantages over the alternative of practicing conversations with other people because applications don’t get bored, they’re consistent, and users won’t be as embarrassed if they make mistakes. Other educational applications include assisting students with learning to read, learning grammar, or tutoring in any subject.

Figure 1.4 is a graphical summary of the different kinds of interactive applications and their relationships:

Figure 1.4 – A hierarchy of interactive applications

So far, we’ve covered interactive applications, where an end user is directly speaking to an NLP system, or typing into it, in real time. These applications are characterized by short user inputs that need quick responses. Now, we will turn to non-interactive applications, where speech or text is analyzed when there is no user present. The material to be analyzed can be arbitrarily long, but the processing time does not have to be immediate.

Exploring non-interactive applications

The other major type of natural language application is non-interactive, or offline, applications. The primary work in these applications is done by an NLP component; the other components in the preceding system diagram are not normally needed. These applications operate on existing text, without a user being present. This means that real-time processing is not necessary because no user is waiting for an answer. Similarly, the system doesn’t have to wait for the user to decide what to say, so in many cases, processing can occur much more quickly than in an interactive application.

Classification

A very important and widely used class of non-interactive natural language applications is document classification, or assigning documents to categories based on their content. Classification has been a major application area in NLP for many years and has been addressed with a wide variety of approaches.

One simple example of classification is a web application that answers customers’ frequently asked questions (FAQs) by classifying a query into one of a set of given categories and then providing answers that have been previously prepared for each category. For this application, a classification system would be a better solution than simply allowing customers to select their questions from a list because an application could sort questions into hundreds of FAQ categories automatically, saving the customer from having to scroll through a huge list of categories. Another example of an interesting classification problem is automatically assigning genres to movies – for example, based on reviews or plot summaries.
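As a preview of techniques covered in Chapter 9, here is a minimal classification sketch using scikit-learn. The four training questions and two category labels are invented for illustration; a real FAQ application would need many examples per category:

from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

questions = [
    "how do I reset my password",
    "I forgot my password",
    "where is my order",
    "when will my package arrive",
]
categories = ["account", "account", "shipping", "shipping"]

# vectorize the text with TF-IDF, then classify with Naive Bayes
classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(questions, categories)
print(classifier.predict(["my order hasn't shown up yet"]))  # likely ['shipping']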

Sentiment analysis

Sentiment analysis is a specialized type of classification where the goal is to classify texts such as product reviews into those that express positive and negative sentiments. It might seem that just looking for positive or negative words would work for sentiment analysis, but in this example, we can see that despite many negative words and phrases (concern, break, problem, issues, send back, and hurt my back), the review is actually positive:

“I was concerned that this chair, although comfortable, might break before I had it for very long because the legs were so thin. This didn’t turn out to be a problem. I thought I might have to send it back. I haven’t had any issues, and it’s the one chair I have that doesn’t hurt my back.”

More sophisticated NLP techniques, taking context into account, are needed to recognize that this is a positive review. Sentiment analysis is a very valuable application because it is difficult for companies to do this manually if there are thousands of existing product reviews and new product reviews are constantly being added. Not only do companies want to see how their products are viewed by customers, but it is also very valuable for them to know how reviews of competing products compare to reviews of their own products. If there are dozens of similar products, this greatly increases the number of reviews relevant to the classification. A text classification application can automate a lot of this process. This is a very active area of investigation in the academic NLP community.
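As a quick baseline, NLTK includes a lexicon-based analyzer, VADER; it is exactly the kind of word-counting approach that a review like the chair example can mislead, which is why the more sophisticated techniques covered later matter. A minimal sketch:

import nltk
nltk.download("vader_lexicon")  # one-time download of the sentiment lexicon
from nltk.sentiment.vader import SentimentIntensityAnalyzer

review = ("I was concerned that this chair, although comfortable, might break. "
          "This didn't turn out to be a problem. It's the one chair I have "
          "that doesn't hurt my back.")
print(SentimentIntensityAnalyzer().polarity_scores(review))
# prints negative/neutral/positive proportions and an overall compound score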

Spam and phishing detection

Spam detection is another very useful classification application, where the goal is to sort email messages into messages that the user wants to see and spam that should be discarded. This application is not only useful but also challenging because spammers are constantly trying to circumvent spam detection algorithms. This means that spam detection techniques have to evolve along with new ways of creating spam. For example, spammers often misspell keywords that might normally indicate spam by substituting the numeral 1 for the letter l, or substituting the numeral 0 for the letter o. While humans have no trouble reading words that are misspelled in this way, keywords that the computer is looking for will no longer match, so spam detection techniques must be developed to find these tricks.
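As a tiny illustration of one countermeasure, the following sketch normalizes these common character substitutions before keyword matching; real spam filters use far more robust techniques:

# map numerals that spammers substitute for letters back to the letters
SUBSTITUTIONS = str.maketrans({"1": "l", "0": "o"})

def normalize(text: str) -> str:
    return text.lower().translate(SUBSTITUTIONS)

print(normalize("C1ick here for a FREE l0an"))  # click here for a free loan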

Closely related to spam detection is detecting messages attempting to phish a user or get them to click on a link or open a document that will cause malware to be loaded onto their system. Spam is, in most cases, just an annoyance, but phishing is more serious, since there can be extremely destructive consequences if the user clicks on a phishing link. Any techniques that improve the detection of phishing messages will, therefore, be very beneficial.

Fake news detection

Another very important classification application is fake news detection. Fake news refers to documents that look very much like real news but contain information that isn’t factual and is intended to mislead readers. Like spam detection and phishing detection, fake news detection is challenging because people who generate fake news are actively trying to avoid detection. Detecting fake news is not only important for safeguarding reasons but also from a platform perspective, as users will begin to distrust platforms that consistently report fake news.

Document retrieval

Document retrieval is the task of finding documents that address a user’s search query. The best example of this is a routine web search of the kind most of us do many times a day. Web searches are the most well-known example of document retrieval, but document retrieval techniques are also used in finding information in any set of documents – for example, in the free-text fields of databases or forms.

Document retrieval is based on finding good matches between users’ queries and the stored documents, so analyzing both users’ queries and documents is required. Document retrieval can be implemented as a keyword search, but simple keyword searches are vulnerable to two kinds of errors. First, keywords in a query might be intended in a different sense than the matching keywords in documents. For example, if a user is looking for a new pair of glasses, thinking of eyeglasses, they don’t want to see results for drinking glasses. The other type of error is where relevant results are not found because keywords don’t match. This might happen if a user uses just the keyword glasses, and results that might have been found with the keywords spectacles or eyewear might be missed, even if the user is interested in those. Using NLP technology instead of simple keywords can help provide more precise results.
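The following minimal sketch, using scikit-learn's TF-IDF tools (covered in Chapter 9) on an invented toy collection, reproduces both error types: the drinking-glasses document ranks highest because it shares the keyword glasses, while the spectacles and eyewear documents score zero:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "designer eyeglasses and frames for every face",
    "sturdy drinking glasses for the kitchen",
    "spectacles and eyewear fitted by our opticians",
]
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)

query_vector = vectorizer.transform(["I need a new pair of glasses"])
scores = cosine_similarity(query_vector, doc_vectors)[0]
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.2f}  {doc}")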

Analytics

Another important and broad area of natural language applications is analytics. Analytics is an umbrella term for NLP applications that attempt to gain insights from text, often the transcribed text from spoken interactions. A good example is looking at the transcriptions of interactions between customers and call center agents to find cases where the agent was confused by the customer’s question or provided wrong information. The results of analytics can be used in the training of call center agents. Analytics can also be used to examine social media posts to find trending topics.

Information extraction

Information extraction is a type of application where structured information, such as the kind of information that could be used to populate a database, is derived from text such as newspaper articles. Important information about an event, such as the date, time, participants, and locations, can be extracted from texts reporting news. This information is quite similar to the intents and entities discussed previously when we talked about chatbots and voice assistants, and we will find that many of the same processing techniques are relevant to both types of applications.

An extra problem that occurs in information extraction applications is named entity recognition (NER), where references to real people, organizations, and locations are recognized. In extended texts such as newspaper articles, there are often multiple ways of referring to the same individual. For example, Joe Biden might be referred to as the president, Mr. Biden, he, or even the former vice-president. In identifying references to Joe Biden, an information extraction application would also have to avoid misinterpreting a reference to Dr. Biden as a reference to Joe Biden, since that would be a reference to his wife.
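As a small preview, spaCy's pretrained pipelines include a named entity recognizer; this sketch assumes the en_core_web_sm model is downloaded. Note that deciding whether Mr. Biden and the president refer to the same individual is coreference resolution, a separate and harder problem that basic NER does not solve by itself:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Joe Biden met reporters in Washington on Tuesday.")
for ent in doc.ents:
    print(ent.text, ent.label_)
# typically: Joe Biden PERSON / Washington GPE / Tuesday DATE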

Translation

Translation between languages, also known as machine translation, has been one of the most important NLP applications since the field began. Machine translation hasn’t been solved in general, but it has made enormous progress in the past few years. Familiar web applications such as Google Translate and Bing Translate usually do a very good job on text such as web pages, although there is definitely room for improvement.

Machine translation applications such as Google and Bing are less effective on other types of text, such as technical text that contains a great deal of specialized vocabulary or colloquial text of the kind that might be used between friends. According to Wikipedia (https://en.wikipedia.org/wiki/Google_Translate), Google Translate can translate 109 languages. However, it should be kept in mind that the accuracy for the less widely spoken languages is lower than that for the more commonly spoken languages, as discussed in the Global considerations section.

Summarization, authorship, correcting grammar, and other applications

Just as there are many reasons for humans to read and understand texts, there are also many applications where systems that are able to read and understand text can be helpful. Detecting plagiarism, correcting grammar, scoring student essays, and determining the authorship of texts are just a few. Summarizing long texts is also very useful, as is simplifying complex texts. Summarizing and simplifying text can also be applied when the original input is non-interactive speech, such as podcasts, YouTube videos, or broadcasts.

Figure 1.5 is a graphical summary of the discussion of non-interactive applications:

Figure 1.5 – A hierarchy of non-interactive applications

Figure 1.5 shows how the non-interactive NLP applications we’ve been discussing are related to each other. It’s clear that classification is a major application area, and we will look at it in depth in Chapter 9, Chapter 10, and Chapter 11.

A summary of the types of applications

In the previous sections, we saw how the different types of interactive and non-interactive applications we have discussed relate to each other. It is apparent that NLP can be applied to solving many different and important problems. In the rest of the book, we’ll dive into the specific techniques that are appropriate for solving different kinds of problems, and you’ll learn how to select the most effective technologies for each problem.

A look ahead – Python for NLP

Traditionally, NLP has been accomplished with a variety of computer languages, from early, special-purpose languages, such as Lisp and Prolog, to more modern languages, such as Java and now Python. Currently, Python is probably the most popular language for NLP, in part because interesting applications can be implemented relatively quickly and developers can rapidly get feedback on the results of their ideas.

Another major advantage of Python is the very large number of useful, well-tested, and well-documented Python libraries that can be applied to NLP problems. Some of these libraries are NLTK, spaCy, scikit-learn, and Keras, to name only a few. We will be exploring these libraries in detail in the chapters to come. In addition to these libraries, we will also be working with development tools such as JupyterLab. You will also find other resources such as Stack Overflow and GitHub to be extremely valuable.

Summary

In this chapter, we learned about the basics of natural language and global considerations. We also looked at the relationship between conversational AI and NLP and explored interactive and non-interactive applications.

In the next chapter, we will be covering considerations concerning selecting applications of NLP. Although there are many ways that this technology can be applied, some possible applications are too difficult for the state of the art. Other applications that seem like good applications for NLP can actually be solved by simpler technologies. In the next chapter, you will learn how to identify these.

2

Identifying Practical Natural Language Understanding Problems

In this chapter, you will learn how to identify natural language understanding (NLU) problems that are a good fit for today’s technology. That means they will not be too difficult for state-of-the-art NLU approaches, but neither can they be addressed by simple, non-NLU approaches. Practical NLU problems also require sufficient training data; without it, the resulting NLU system will perform poorly. The benefits of an NLU system must also justify its development and maintenance costs. While many of these considerations are things that project managers should think about, they also apply to students who are looking for class projects or thesis topics.

Before starting a project that involves NLU, the first question to ask is whether the goals of the project are a good fit for the current state of the art in NLU. Is NLU the right technology for solving the problem that you wish to address? How does the difficulty of the problem compare to the NLU state of the art?

Starting out, it’s also important to decide what solving the problem means. Problems can be solved to different degrees. If the application is a class project, demo, or proof of concept, the solution does not have to be as accurate as a deployed solution that’s designed for the robust processing of thousands of user inputs a day. Similarly, if the problem is a cutting-edge research question, any improvement over the current state of the art is valuable, even if the problem isn’t completely solved by the work done in the project. How complete the solution has to be is a question that everyone needs to decide as they think about the problem that they want to address.

The project manager, or whoever is responsible for making the technical decisions about what technologies to use, should decide what level of accuracy they would find acceptable when the project is completed, keeping in mind that 100% accuracy is unlikely to be achievable in any natural language technology application.

This chapter will get into the details of identifying problems where NLU is applicable. Follow the principles discussed in this chapter, and you will be rewarded with a quality, working system that solves a real problem for its users.

The following topics are covered in this chapter:

Identifying problems that are the appropriate level of difficulty for the technology

Looking at difficult NLU applications

Looking at applications that don’t need NLP

Training data

Application data

Taking development costs into account

Taking maintenance costs into account

A flowchart for deciding on NLU applications

Identifying problems that are the appropriate level of difficulty for the technology

Note

This chapter is focused on technical considerations. Questions such as whether a market exists for a proposed application, or how to decide whether customers will find it appealing, are important questions, but they are outside of the scope of this book.

Here are some kinds of problems that are a good fit for the state of the art.

Today’s NLU is very good at handling problems based on specific, concrete topics, such as these examples:

Classifying customers’ product reviews into positive and negative reviews: Online sellers typically offer buyers a chance to review products they have bought, which is helpful for other prospective buyers as well as for sellers. But large online retailers with thousands of products are then faced with the problem of what to do with the information from thousands of reviews. It’s impossible for human tabulators to read all the incoming reviews, so an automated product review classification system would be very helpful.

Answering basic banking questions about account balances or recent transactions: Banks and other financial institutions have large contact centers that handle customer questions. Often, the most common reasons for calling are simple questions about account balances, which can be answered with a database lookup based on account numbers and account types. An automated system can handle these by asking callers for their account numbers and the kind of information they need.

Making simple stock trades: Buying and selling stock can become very complex, but in many cases, users simply want to buy or sell a certain number of shares of a specific company. This kind of transaction only needs a few pieces of information, such as an account number, the company, the number of shares, and whether to buy or sell.

Package tracking: Package tracking needs only a tracking number to tell users the status of their shipments. While web-based package tracking is common, sometimes, people don’t have access to the web. With a natural language application, users can track packages with just a phone call.

Routing customers’ questions to the right customer service agent: Many customers have questions that can only be answered by a human customer service agent. For those customers, an NLU system can still be helpful by directing the callers to the call center agents in the right department. It can ask the customer the reason for their call, classify the request, and then automatically route their call to the expert or department that handles that topic.

Providing information about weather forecasts, sports scores, and historical facts: These kinds of applications are characterized by requests that have a few well-defined parameters. For sports scores, this would be a team name and possibly the date of a game. For weather forecasts, the parameters include the location and timeframe for the forecast.

All of these applications are characterized by having unambiguous, correct answers. In addition, the user’s language that the system is expected to understand is not too complex. These would all be suitable topics for an NLU project.

Let’s illustrate what makes these applications suitable for today’s technology by going into more detail on providing information about weather forecasts, sports scores, and historical facts.

Figure 2.1 shows a sample architecture for an application that can provide weather forecasts for different cities. Processing starts when the user asks, What is the weather forecast for tomorrow in New York City? Note that the user is making a single, short request, for specific information – the weather forecast, for a particular date, in a particular location. The NLU system needs to detect the intent (weather forecast), the entities’ location, and the date. These should all be easy to find – the entities are very dissimilar, and the weather forecast