
Mastering Transformers E-Book

Savaş Yıldırım

Description

Transformer-based language models have dominated natural language processing (NLP) research and have become the field's new paradigm. With this book, you'll learn how to build a variety of transformer-based NLP applications using the Transformers library for Python.
The book gives you an introduction to Transformers by showing you how to write your first hello-world program. You'll then learn how a tokenizer works and how to train your own tokenizer. As you advance, you'll explore the architecture of autoencoding models, such as BERT, and autoregressive models, such as GPT. You'll see how to train and fine-tune models for a variety of natural language understanding (NLU) and natural language generation (NLG) problems, including text classification, token classification, and text representation. The book also covers efficient models for challenging settings, such as long-context NLP tasks under limited computational capacity. You'll work with multilingual and cross-lingual problems, optimize models by monitoring their performance, and learn how to deconstruct these models for interpretability and explainability. Finally, you'll be able to deploy your transformer models in a production environment.
By the end of this NLP book, you'll know how to use the Transformers library to solve advanced NLP problems with state-of-the-art models.
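To give a flavor of those first steps, here is a minimal sketch of tokenization with the Transformers library; the model name and the example sentence are illustrative choices, not taken from the book:

from transformers import AutoTokenizer

# Load a community-provided pre-trained tokenizer from the Hugging Face Hub
# (bert-base-uncased is an illustrative choice).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Split a sentence into subword tokens and map them to vocabulary IDs.
tokens = tokenizer.tokenize("Transformers have changed NLP.")
ids = tokenizer.encode("Transformers have changed NLP.")

print(tokens)  # subword tokens, e.g. ['transformers', 'have', ...]
print(ids)     # token IDs, with the special [CLS] and [SEP] tokens added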

You can read this e-book in the Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 360

Publication year: 2021




Mastering Transformers

Build state-of-the-art models from scratch with advanced natural language processing techniques

Savaş Yıldırım

Meysam Asgari-Chenaghlu

BIRMINGHAM—MUMBAI

Mastering Transformers

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Publishing Product Manager: Aditi Gour

Senior Editor: David Sugarman

Content Development Editor: Nathanya Dias

Technical Editor: Arjun Varma

Copy Editor: Safis Editing

Project Coordinator: Aparna Ravikumar Nair

Proofreader: Safis Editing

Indexer: Tejal Daruwale Soni

Production Designer: Alishon Mendonca

First published: September 2021

Production reference: 1290721

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-80107-765-1

www.packt.com

Contributors

About the authors

Savaş Yıldırım graduated from the Department of Computer Engineering at Istanbul Technical University and holds a Ph.D. in Natural Language Processing (NLP). He is currently an associate professor at Istanbul Bilgi University, Turkey, and a visiting researcher at Ryerson University, Canada. He is a proactive lecturer and researcher with more than 20 years of experience teaching courses on machine learning, deep learning, and NLP. He has contributed significantly to the Turkish NLP community by developing many open source tools and resources, and he provides comprehensive consultancy to AI companies on their R&D projects. In his spare time, he writes and directs short films, and enjoys practicing yoga.

First of all, I would like to thank my dear partner, Aylin Oktay, for her continuous support, patience, and encouragement throughout the long process of writing this book. I would also like to thank my colleagues at the Department of Computer Engineering, Istanbul Bilgi University, for their support.

Meysam Asgari-Chenaghlu is an AI manager at Carbon Consulting and a Ph.D. candidate at the University of Tabriz. He has been a consultant for Turkey's leading telecommunications and banking companies and has worked on various projects, including natural language understanding and semantic search.

First and foremost, I would like to thank my loving and patient wife, Narjes Nikzad-Khasmakhi, for her support and understanding. I would also like to thank my father for his support; may his soul rest in peace. Many thanks to Carbon Consulting and my co-workers.

About the reviewer

Alexander Afanasyev is a software engineer with about 14 years of experience across a variety of domains and roles. Currently, Alexander is an independent contractor pursuing ideas in computer vision, NLP, and building advanced data collection systems in the cyber threat intelligence domain. Previously, he helped review Packt's Selenium Testing Cookbook. Outside of his daily work, he is an active contributor to Stack Overflow and GitHub.

I would like to thank the authors of this book for their hard work and for providing innovative content; the wonderful team of editors and coordinators with excellent communication skills; and my family, who have always been supportive of my ideas and my work.

Table of Contents

Preface

Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications

Chapter 1: From Bag-of-Words to the Transformer

Technical requirements

Evolution of NLP toward Transformers

Understanding distributional semantics

BoW implementation

Overcoming the dimensionality problem

Language modeling and generation

Leveraging DL

Learning word embeddings

A brief overview of RNNs

LSTMs and gated recurrent units

A brief overview of CNNs

Overview of the Transformer architecture

Attention mechanism

Multi-head attention mechanisms

Using TL with Transformers

Summary

References

Chapter 2: A Hands-On Introduction to the Subject

Technical requirements

Installing Transformers with Anaconda

Installation on Linux

Installation on Windows

Installation on macOS

Installing TensorFlow, PyTorch, and Transformers

Installing using Google Colab

Working with language models and tokenizers

Working with community-provided models

Working with benchmarks and datasets

Important benchmarks

Accessing the datasets with an Application Programming Interface

Benchmarking for speed and memory

Summary

Section 2: Transformer Models – From Autoencoding to Autoregressive Models

Chapter 3: Autoencoding Language Models

Technical requirements

BERT – one of the autoencoding language models

BERT language model pretraining tasks

A deeper look into the BERT language model

Autoencoding language model training for any language

Sharing models with the community

Understanding other autoencoding models

Introducing ALBERT

RoBERTa

ELECTRA

Working with tokenization algorithms

Byte pair encoding

WordPiece tokenization

SentencePiece tokenization

The tokenizers library

Summary

Chapter 4: Autoregressive and Other Language Models

Technical requirements

Working with AR language models

Introduction and training models with GPT

Transformer-XL

XLNet

Working with Seq2Seq models

T5

Introducing BART

AR language model training

NLG using AR models

Summarization and MT fine-tuning using simpletransformers

Summary

References

Chapter 5: Fine-Tuning Language Models for Text Classification

Technical requirements

Introduction to text classification

Fine-tuning a BERT model for single-sentence binary classification

Training a classification model with native PyTorch

Fine-tuning BERT for multi-class classification with custom datasets

Fine-tuning the BERT model for sentence-pair regression

Utilizing run_glue.py to fine-tune the models

Summary

Chapter 6: Fine-Tuning Language Models for Token Classification

Technical requirements

Introduction to token classification

Understanding NER

Understanding POS tagging

Understanding QA

Fine-tuning language models for NER

Question answering using token classification

Summary

Chapter 7: Text Representation

Technical requirements

Introduction to sentence embeddings

Cross-encoder versus bi-encoder

Benchmarking sentence similarity models

Using BART for zero-shot learning

Semantic similarity experiment with FLAIR

Average word embeddings

RNN-based document embeddings

Transformer-based BERT embeddings

Sentence-BERT embeddings

Text clustering with Sentence-BERT

Topic modeling with BERTopic

Semantic search with Sentence-BERT

Summary

Further reading

Section 3: Advanced Topics

Chapter 8: Working with Efficient Transformers

Technical requirements

Introduction to efficient, light, and fast transformers

Implementation for model size reduction

Working with DistilBERT for knowledge distillation

Pruning transformers

Quantization

Working with efficient self-attention

Sparse attention with fixed patterns

Learnable patterns

Low-rank factorization, kernel methods, and other approaches

Summary

References

Chapter 9: Cross-Lingual and Multilingual Language Modeling

Technical requirements

Translation language modeling and cross-lingual knowledge sharing

XLM and mBERT

mBERT

XLM

Cross-lingual similarity tasks

Cross-lingual text similarity

Visualizing cross-lingual textual similarity

Cross-lingual classification

Cross-lingual zero-shot learning

Fundamental limitations of multilingual models

Fine-tuning the performance of multilingual models

Summary

References

Chapter 10: Serving Transformer Models

Technical requirements

FastAPI Transformer model serving

Dockerizing APIs

Faster Transformer model serving using TFX

Load testing using Locust

Summary

References

Chapter 11: Attention Visualization and Experiment Tracking

Technical requirements

Interpreting attention heads

Visualizing attention heads with exBERT

Multiscale visualization of attention heads with BertViz

Understanding the inner parts of BERT with probing classifiers

Tracking model metrics

Tracking model training with TensorBoard

Tracking model training live with W&B

Summary

References

Why subscribe?

Other Books You May Enjoy

Section 1: Introduction – Recent Developments in the Field, Installations, and Hello World Applications

In this section, you will learn about all aspects of Transformers at an introductory level. You will write your first hello-world program with Transformers by loading community-provided pre-trained language models and running the related code, with or without a GPU. Installing and using the tensorflow, pytorch, transformers, and sentence-transformers libraries, along with the conda environment manager, will also be explained in detail in this section.
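As an illustration, such a hello-world program can be as short as the following sketch; it assumes the transformers and torch libraries are installed, and it lets pipeline() fall back to its default model for the task (the example input is our own):

import torch
from transformers import pipeline

# Use the first GPU if one is available; device=-1 runs on the CPU.
device = 0 if torch.cuda.is_available() else -1

# Without an explicit model argument, pipeline() downloads a
# community-provided default model for the sentiment-analysis task.
classifier = pipeline("sentiment-analysis", device=device)

print(classifier("Transformers make NLP applications easy to build."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]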

This section comprises the following chapters:

Chapter 1, From Bag-of-Words to the Transformer

Chapter 2, A Hands-On Introduction to the Subject