Getting Started with Google BERT E-Book

Sudharsan Ravichandiran

Description

BERT (Bidirectional Encoder Representations from Transformers) has revolutionized the world of natural language processing (NLP) with its promising results. This book is an introductory guide that will help you get to grips with Google's BERT architecture. With a detailed explanation of the transformer architecture, this book will help you understand how the transformer's encoder and decoder work.
You'll explore the BERT architecture by learning how the BERT model is pre-trained and how to fine-tune pre-trained BERT for downstream NLP tasks such as sentiment analysis and text summarization with the Hugging Face transformers library. As you advance, you'll learn about different variants of BERT such as ALBERT, RoBERTa, and ELECTRA, and look at SpanBERT, which is used for NLP tasks like question answering. You'll also cover simpler and faster BERT variants based on knowledge distillation, such as DistilBERT and TinyBERT. The book takes you through M-BERT, XLM, and XLM-R in detail and then introduces you to Sentence-BERT, which is used for obtaining sentence representations. Finally, you'll discover domain-specific BERT models such as BioBERT and ClinicalBERT, and explore an interesting variant called VideoBERT.
By the end of this BERT book, you'll be well-versed in using BERT and its variants to perform practical NLP tasks.
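As a small taste of the hands-on material, here is a minimal sketch of extracting a sentence representation from pre-trained BERT with the Hugging Face transformers library. This is a sketch only: the 'bert-base-uncased' checkpoint and the example sentence are illustrative choices, not the book's exact code.

# Minimal sketch: sentence representation from pre-trained BERT.
# Assumes the transformers and torch packages are installed;
# 'bert-base-uncased' is a standard public checkpoint.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# Tokenize the input sentence and run it through the encoder
inputs = tokenizer("I love Paris", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.last_hidden_state holds one contextual vector per token;
# the [CLS] vector at position 0 is commonly used as a sentence representation
cls_embedding = outputs.last_hidden_state[:, 0, :]
print(cls_embedding.shape)  # torch.Size([1, 768]) for BERT-base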

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Page count: 384

Publication year: 2021




Getting Started with Google BERT
Build and train state-of-the-art natural language processing models using BERT
Sudharsan Ravichandiran
BIRMINGHAM - MUMBAI

Getting Started with Google BERT

Copyright © 2021 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Group Product Manager: Kunal Parikh
Publishing Product Manager: Devika Battike
Content Development Editor: Sean Lobo
Senior Editor: Roshan Kumar
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Production Designer: Prashant Ghare

First published: January 2021

Production reference: 1210121

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-83882-159-3

www.packt.com

To my adorable mom, Kasthuri, and to my beloved dad, Ravichandiran.

- Sudharsan Ravichandiran

Packt.com

Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?

Spend less time learning and more time coding with practical eBooks and videos from over 4,000 industry professionals

Improve your learning with Skill Plans built especially for you

Get a free eBook or video every month

Fully searchable for easy access to vital information

Copy and paste, print, and bookmark content

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

About the author

Sudharsan Ravichandiran is a data scientist, researcher, and bestselling author. He completed his bachelor's in information technology at Anna University. His research focuses on practical implementations of deep learning and reinforcement learning, including natural language processing and computer vision. He is an open source contributor and loves answering questions on Stack Overflow. He also authored the bestseller Hands-On Reinforcement Learning with Python, published by Packt Publishing.

I would like to thank my most amazing parents and my brother, Karthikeyan, for inspiring and motivating me. I would like to thank the Packt team, Devika, Sean, and Kirti, for their great help. Without all of their support, it would have been impossible to complete this book.

About the reviewers

Dr. Armando Fandango creates AI-empowered products by leveraging reinforcement learning, deep learning, and distributed computing. Armando has provided thought leadership in diverse roles at small and large enterprises, including Accenture, Nike, Sonobi, and IBM, along with advising high-tech AI-based start-ups. He has authored several books, including Mastering TensorFlow, TensorFlow Machine Learning Projects, and Python Data Analysis, has published research in international journals, and has presented his research at conferences. His current research and product development interests lie in the areas of reinforcement learning, deep learning, edge AI, and AI in simulated and real environments (VR/XR/AR).

Ashwin Sreenivas is the cofounder and chief technology officer of Helia AI, a computer vision company that structures and understands the world's video. Prior to this, he was a deployment strategist at Palantir Technologies. Ashwin graduated Phi Beta Kappa from Stanford University with a master's degree in artificial intelligence and a bachelor's degree in computer science.

Gabriel Bianconi is the founder of Scalar Research, an artificial intelligence and data science consulting firm. Past clients include start-ups backed by Y Combinator and leading venture capital firms (for example, Scale AI and Fandom), investment firms and their portfolio companies (for example, the Two Sigma-backed insurance firm MGA), and large enterprises (for example, an industrial conglomerate in Asia and a leading strategy consulting firm). Beyond consulting, Gabriel is a frequent speaker at major technology conferences and a reviewer for top academic conferences (for example, ICML) and AI textbooks. He received B.S. and M.S. degrees in computer science from Stanford University, where he conducted award-winning research in computer vision and deep learning.

Mani Kanteswara has a bachelor's and a master's in finance (tech) from BITS Pilani and over 10 years of strong technical expertise and statistical knowledge of analytics. He is currently working as a lead strategist at Google and previously worked as a senior data scientist at WalmartLabs. He has worked in deep learning, computer vision, machine learning, and natural language processing, building solutions, frameworks, and algorithmic products that solve a variety of business problems. He has extensive expertise in solving problems in IoT, telematics, social media, the web, and e-commerce. He strongly believes that learning a concept through a practical implementation and exploring its application areas builds a great foundation.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.

Table of Contents

Title Page

Copyright and Credits

Getting Started with Google BERT

Dedication

About Packt

Why subscribe?

Contributors

About the author

About the reviewers

Packt is searching for authors like you

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Download the color images

Conventions used

Get in touch

Reviews

Section 1 - Starting Off with BERT

A Primer on Transformers

Introduction to the transformer 

Understanding the encoder of the transformer 

Self-attention mechanism 

Understanding the self-attention mechanism 

Step 1

Step 2

Step 3

Step 4

Multi-head attention mechanism 

Learning position with positional encoding 

Feedforward network

Add and norm component 

Putting all the encoder components together 

Understanding the decoder of a transformer

Masked multi-head attention 

Multi-head attention 

Feedforward network

Add and norm component 

Linear and softmax layers

Putting all the decoder components together 

Putting the encoder and decoder together 

Training the transformer

Summary

Questions

Further reading

Understanding the BERT Model

Basic idea of BERT 

Working of BERT 

Configurations of BERT 

BERT-base

BERT-large

Other configurations of BERT

Pre-training the BERT model

Input data representation

Token embedding 

Segment embedding

Position embedding 

Final representation 

WordPiece tokenizer 

Pre-training strategies 

Language modeling

Auto-regressive language modeling 

Auto-encoding language modeling

Masked language modeling

Whole word masking 

Next sentence prediction 

Pre-training procedure 

Subword tokenization algorithms 

Byte pair encoding 

Tokenizing with BPE 

Byte-level byte pair encoding 

WordPiece

Summary

Questions

Further reading

Getting Hands-On with BERT

Exploring the pre-trained BERT model

Extracting embeddings from pre-trained BERT 

Hugging Face transformers 

Generating BERT embeddings

Preprocessing the input 

Getting the embedding 

Extracting embeddings from all encoder layers of BERT

Extracting the embeddings 

Preprocessing the input

Getting the embeddings 

Fine-tuning BERT for downstream tasks

Text classification 

Fine-tuning BERT for sentiment analysis 

Importing the dependencies 

Loading the model and dataset

Preprocessing the dataset

Training the model 

Natural language inference 

Question-answering

Performing question-answering with fine-tuned BERT 

Preprocessing the input

Getting the answer

Named entity recognition 

Summary 

Questions

Further reading 

Section 2 - Exploring BERT Variants

BERT Variants I - ALBERT, RoBERTa, ELECTRA, and SpanBERT

A Lite version of BERT 

Cross-layer parameter sharing 

Factorized embedding parameterization

Training the ALBERT model

Sentence order prediction

Comparing ALBERT with BERT 

Extracting embeddings with ALBERT

Robustly Optimized BERT pre-training Approach

Using dynamic masking instead of static masking 

Removing the NSP task

Training with more data points

Training with a large batch size 

Using BBPE as a tokenizer 

Exploring the RoBERTa tokenizer 

Understanding ELECTRA 

Understanding the replaced token detection task 

Exploring the generator and discriminator of ELECTRA 

Training the ELECTRA model

Exploring efficient training methods

Predicting span with SpanBERT

Understanding the architecture of SpanBERT

Exploring SpanBERT 

Performing Q&As with pre-trained SpanBERT 

Summary

Questions

Further reading 

BERT Variants II - Based on Knowledge Distillation

Introducing knowledge distillation 

Training the student network 

DistilBERT – the distilled version of BERT 

Teacher-student architecture 

The teacher BERT

The student BERT

Training the student BERT (DistilBERT) 

Introducing TinyBERT 

Teacher-student architecture  

Understanding the teacher BERT  

Understanding the student BERT 

Distillation in TinyBERT 

Transformer layer distillation 

Attention-based distillation

Hidden state-based distillation 

Embedding layer distillation 

Prediction layer distillation

The final loss function 

Training the student BERT (TinyBERT)

General distillation 

Task-specific distillation 

The data augmentation method 

Transferring knowledge from BERT to neural networks

Teacher-student architecture 

The teacher BERT 

The student network 

Training the student network  

The data augmentation method

Understanding the masking method

Understanding the POS-guided word replacement method 

Understanding the n-gram sampling method

The data augmentation procedure

Summary

Questions

Further reading 

Section 3 - Applications of BERT

Exploring BERTSUM for Text Summarization

Text summarization 

Extractive summarization

Abstractive summarization 

Fine-tuning BERT for text summarization 

Extractive summarization using BERT 

BERTSUM with a classifier 

BERTSUM with a transformer and LSTM 

BERTSUM with an inter-sentence transformer 

BERTSUM with LSTM 

Abstractive summarization using BERT 

Understanding ROUGE evaluation metrics

Understanding the ROUGE-N metric 

ROUGE-1 

ROUGE-2 

Understanding ROUGE-L  

The performance of the BERTSUM model 

Training the BERTSUM model 

Summary 

Questions

Further reading

Applying BERT to Other Languages

Understanding multilingual BERT 

Evaluating M-BERT on the NLI task 

Zero-shot 

TRANSLATE-TEST 

TRANSLATE-TRAIN

TRANSLATE-TRAIN-ALL

How multilingual is multilingual BERT? 

Effect of vocabulary overlap

Generalization across scripts 

Generalization across typological features 

Effect of language similarity

Effect of code switching and transliteration

Code switching 

Transliteration 

M-BERT on code switching and transliteration 

The cross-lingual language model

Pre-training strategies 

Causal language modeling 

Masked language modeling 

Translation language modeling 

Pre-training the XLM model

Evaluation of XLM

Understanding XLM-R

Language-specific BERT 

FlauBERT for French 

Getting a representation of a French sentence with FlauBERT 

French Language Understanding Evaluation

BETO for Spanish 

Predicting masked words using BETO 

BERTje for Dutch

Next sentence prediction with BERTje

German BERT 

Chinese BERT 

Japanese BERT 

FinBERT for Finnish

UmBERTo for Italian 

BERTimbau for Portuguese 

RuBERT for Russian 

Summary

Questions

Further reading

Exploring Sentence and Domain-Specific BERT

Learning about sentence representation with Sentence-BERT  

Computing sentence representation 

Understanding Sentence-BERT 

Sentence-BERT with a Siamese network 

Sentence-BERT for a sentence pair classification task

Sentence-BERT for a sentence pair regression task

Sentence-BERT with a triplet network

Exploring the sentence-transformers library 

Computing sentence representation using Sentence-BERT 

Computing sentence similarity 

Loading custom models

Finding a similar sentence with Sentence-BERT 

Learning multilingual embeddings through knowledge distillation 

Teacher-student architecture

Using the multilingual model 

Domain-specific BERT 

ClinicalBERT 

Pre-training ClinicalBERT 

Fine-tuning ClinicalBERT 

Extracting clinical word similarity 

BioBERT 

Pre-training the BioBERT model

Fine-tuning the BioBERT model 

BioBERT for NER tasks 

BioBERT for question answering 

Summary 

Questions

Further reading

Working with VideoBERT, BART, and More

Learning language and video representations with VideoBERT 

Pre-training a VideoBERT model  

Cloze task 

Linguistic-visual alignment  

The final pre-training objective 

Data source and preprocessing 

Applications of VideoBERT 

Predicting the next visual tokens

Text-to-video generation 

Video captioning 

Understanding BART 

Architecture of BART 

Noising techniques 

Token masking

Token deletion

Token infilling 

Sentence shuffling 

Document rotation

Comparing different pre-training objectives 

Performing text summarization with BART 

Exploring BERT libraries 

Understanding ktrain

Sentiment analysis using ktrain

Building a document answering model 

Document summarization 

bert-as-service 

Installing the library 

Computing sentence representation

Computing contextual word representation 

Summary 

Questions 

Further reading

Assessments

Chapter 1, A Primer on Transformers

Chapter 2, Understanding the BERT Model

Chapter 3, Getting Hands-On with BERT

Chapter 4, BERT Variants I - ALBERT, RoBERTa, ELECTRA, and SpanBERT

Chapter 5, BERT Variants II - Based on Knowledge Distillation

Chapter 6, Exploring BERTSUM for Text Summarization

Chapter 7, Applying BERT to Other Languages

Chapter 8, Exploring Sentence and Domain-Specific BERT

Chapter 9, Working with VideoBERT, BART, and More

Other Books You May Enjoy

Leave a review - let other readers know what you think

Section 1 - Starting Off with BERT

In this section, we will familiarize ourselves with BERT. First, we will understand how the transformer works, and then we will explore BERT in detail. We will also get hands-on with BERT and learn how to use the pre-trained BERT model.
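As a quick preview of what using the pre-trained BERT model looks like in practice, here is a small sketch based on the Hugging Face transformers pipeline API. This is an illustrative assumption rather than the book's exact code; the chapters in this section walk through the details step by step.

# Preview sketch: masked word prediction with pre-trained BERT via the
# Hugging Face fill-mask pipeline. Assumes transformers is installed;
# 'bert-base-uncased' is a standard public checkpoint.
from transformers import pipeline

fill_mask = pipeline('fill-mask', model='bert-base-uncased')

# BERT is pre-trained with masked language modeling, so it can rank
# candidate words for the [MASK] position
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction['token_str'], round(prediction['score'], 3))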

The following chapters are included in this section:

Chapter 1, A Primer on Transformers

Chapter 2, Understanding the BERT Model

Chapter 3, Getting Hands-On with BERT