spaCy is an industrial-grade, efficient NLP Python library. It offers various pre-trained models and ready-to-use features. Mastering spaCy provides you with end-to-end coverage of spaCy's features and real-world applications.
You'll begin by installing spaCy and downloading models, before progressing to spaCy's features and prototyping real-world NLP apps. Next, you'll get familiar with displaCy, spaCy's popular visualizer. The book also equips you with practical illustrations of pattern matching and helps you advance into the world of semantics with word vectors. Statistical information extraction methods are also explained in detail. Later, you'll work through an interactive business case study that shows you how to combine all of spaCy's features to create a real-world NLP pipeline. You'll implement ML models for tasks such as sentiment analysis, intent recognition, and context resolution. The book further focuses on classification with popular frameworks such as TensorFlow's Keras API together with spaCy. You'll cover popular topics, including intent classification and sentiment analysis, apply them to popular datasets, and interpret the classification results.
By the end of this book, you'll be able to confidently use spaCy, including its linguistic features, word vectors, and classifiers, to create your own NLP apps.
Page count: 371
An end-to-end practical guide to implementing NLP applications using the Python ecosystem
Duygu Altınok
BIRMINGHAM—MUMBAI
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Kunal Parikh
Publishing Product Manager: Ali Abidi
Senior Editor: Roshan Kumar
Content Development Editor: Tazeen Shaikh
Technical Editor: Sonam Pandey
Copy Editor: Safis Editing
Project Coordinator: Aparna Ravikumar Nair
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Joshua Misquitta
First published: July 2021
Production reference: 3211021
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80056-335-3
www.packt.com
To my mother, Ülker, for her life-long support and endless love. To my sister, for her support and inspiration. To my besties, Umutcan, Simge, and Aydan, for their friendship and support.
Duygu Altınok is a senior Natural Language Processing (NLP) engineer with 12 years of experience in almost all areas of NLP, including search engine technology, speech recognition, text analytics, and conversational AI. She has published several papers in the NLP area at conferences such as LREC and CLNLP. She also enjoys working on open source projects and is a contributor to the spaCy library. Duygu earned her undergraduate degree in computer engineering from METU, Ankara, in 2010 and later earned her master's degree in mathematics from Bilkent University, Ankara, in 2012. She is currently a senior engineer at German Autolabs with a focus on conversational AI for voice assistants. Originally from Istanbul, Duygu currently resides in Berlin, Germany, with her cute dog Adele.
Kevin Lu is currently a student studying software engineering at the University of Waterloo, with experience in full-stack web development, machine learning, computer vision, and natural language processing. He is the founder of the Python package PyATE (Python Automated Term Extraction). His interests include discrete mathematics, data science, algorithmic optimization, and deep learning. In the future, he is interested in pursuing research in NLP with deep learning and its applications in accelerating academic research.
Usama Yaseen is currently a PhD candidate at Siemens AG (Munich) and the University of Munich. His research interests lie in data-efficient information extraction. Before starting his PhD, he was the lead data scientist at SAP SE, where he led a machine learning team focused on information extraction from semi-structured documents. He holds a master's in informatics from the Technical University of Munich; his master's thesis explored recurrent neural networks with external memory for question-answering systems. Overall, he has worked at Siemens AG (on corporate technology research), SAP SE (on machine learning), and Intel Corporation (on software development).
Souvik Roy is an NLP researcher. He primarily works on recurrent neural networks and transformer model compression methodologies such as pruning, quantization, tensor decomposition, and knowledge distillation to reduce the challenges faced by larger models, including longer training and inference times. He is passionate about working with textual data to solve underlying problems. Souvik has a master's in engineering from the University of Waterloo, specializing in text processing. Additionally, he has worked with Scribendi on document summarization and grammatical error correction. Since then, he has been working in diverse industrial research labs.
Carlos Fernando Schiaffin is passionate about analyzing and describing the underlying phenomena of human language. He is an NLP developer currently focused on conversational AI. He has a degree in linguistics and is a self-taught Python programmer. For more than five years, he has been working on NLP systems to try to understand and explain some of the speakers' linguistic behaviors. He started his career as a data tagger and soon went on to design annotation processes for linguistic data in Spanish, English, and Portuguese. Currently, he works with Rasa, spaCy, and other tools on the development of a conversational AI in Spanish. I thank Duygu Altınok for giving me the chance to participate in this book and my colleagues who always accompany my learning process.
This section will begin with an overview of natural language processing (NLP) with Python and spaCy. You will learn how the book is organized and how to make the best use of the book. You will then start by installing spaCy and its statistical models and take a quick dive into the spaCy world. Basic operations, general conventions, and visualization are the core attractions of this section.
This section comprises the following chapters:
Chapter 1, Getting Started with spaCy
Chapter 2, Core Operations with spaCy