BERT (Bidirectional Encoder Representations from Transformers) has revolutionized the world of natural language processing (NLP) with its promising results. This book is an introductory guide that will help you get to grips with Google's BERT architecture. With a detailed explanation of the transformer architecture, this book will help you understand how the transformer's encoder and decoder work.
You'll explore the BERT architecture by learning how the BERT model is pre-trained and how to use pre-trained BERT for downstream tasks by fine-tuning it for NLP tasks such as sentiment analysis and text summarization with the Hugging Face transformers library. As you advance, you'll learn about different variants of BERT such as ALBERT, RoBERTa, and ELECTRA, and look at SpanBERT, which is used for NLP tasks like question answering. You'll also cover simpler and faster BERT variants based on knowledge distillation, such as DistilBERT and TinyBERT. The book takes you through M-BERT, XLM, and XLM-R in detail and then introduces you to Sentence-BERT, which is used for obtaining sentence representations. Finally, you'll discover domain-specific BERT models such as BioBERT and ClinicalBERT, and explore an interesting variant called VideoBERT.
By the end of this BERT book, you'll be well-versed in using BERT and its variants for performing practical NLP tasks.
Page count: 384
Year of publication: 2021
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Kunal Parikh
Publishing Product Manager: Devika Battike
Content Development Editor: Sean Lobo
Senior Editor: Roshan Kumar
Technical Editor: Manikandan Kurup
Copy Editor: Safis Editing
Project Coordinator: Aishwarya Mohan
Proofreader: Safis Editing
Indexer: Priyanka Dhadke
Production Designer: Prashant Ghare
First published: January 2021
Production reference: 1210121
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-83882-159-3
www.packt.com
To my adorable mom, Kasthuri, and to my beloved dad, Ravichandiran.
Packt.com
Subscribe to our online digital library for full access to over 7,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Fully searchable for easy access to vital information
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Sudharsan Ravichandiran is a data scientist, researcher, and bestselling author. He completed his bachelor's in information technology at Anna University. His area of research focuses on practical implementations of deep learning and reinforcement learning, including natural language processing and computer vision. He is an open source contributor and loves answering questions on Stack Overflow. He also authored a best seller, Hands-On Reinforcement Learning with Python, published by Packt Publishing.
Dr. Armando Fandango creates AI-empowered products by leveraging reinforcement learning, deep learning, and distributed computing. Armando has provided thought leadership in diverse roles at small and large enterprises, including Accenture, Nike, Sonobi, and IBM, along with advising high-tech AI-based start-ups. Armando has authored several books, including Mastering TensorFlow, TensorFlow Machine Learning Projects, and Python Data Analysis, and has published research in international journals and presented his research at conferences. Dr. Armando's current research and product development interests lie in the areas of reinforcement learning, deep learning, edge AI, and AI in simulated and real environments (VR/XR/AR).
Ashwin Sreenivas is the cofounder and chief technology officer of Helia AI, a computer vision company that structures and understands the world's video. Prior to this, he was a deployment strategist at Palantir Technologies. Ashwin graduated Phi Beta Kappa from Stanford University with a master's degree in artificial intelligence and a bachelor's degree in computer science.
Gabriel Bianconi is the founder of Scalar Research, an artificial intelligence and data science consulting firm. Past clients include start-ups backed by Y Combinator and leading venture capital firms (for example, Scale AI and Fandom), investment firms and their portfolio companies (for example, the Two Sigma-backed insurance firm MGA), and large enterprises (for example, an industrial conglomerate in Asia and a leading strategy consulting firm). Beyond consulting, Gabriel is a frequent speaker at major technology conferences and a reviewer for top academic conferences (for example, ICML) and AI textbooks. Previously, he received B.S. and M.S. degrees in computer science from Stanford University, where he conducted award-winning research in computer vision and deep learning.
Mani Kanteswara has a bachelor's and a master's in finance (tech) from BITS Pilani and over 10 years of strong technical expertise and statistical knowledge of analytics. He is currently working as a lead strategist with Google and has previously worked as a senior data scientist at WalmartLabs. He has worked in the deep learning, computer vision, machine learning, and natural language processing spaces, building solutions and frameworks capable of solving different business problems and building algorithmic products. He has extensive expertise in solving problems in the IoT, telematics, social media, web, and e-commerce spaces. He strongly believes that learning concepts through practical implementation and exploring their application areas leads to a great foundation.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright and Credits
Getting Started with Google BERT
Dedication
About Packt
Why subscribe?
Contributors
About the author
About the reviewers
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1 - Starting Off with BERT
A Primer on Transformers
Introduction to the transformer
Understanding the encoder of the transformer
Self-attention mechanism
Understanding the self-attention mechanism
Step 1
Step 2
Step 3
Step 4
Multi-head attention mechanism
Learning position with positional encoding
Feedforward network
Add and norm component
Putting all the encoder components together
Understanding the decoder of a transformer
Masked multi-head attention
Multi-head attention
Feedforward network
Add and norm component
Linear and softmax layers
Putting all the decoder components together
Putting the encoder and decoder together
Training the transformer
Summary
Questions
Further reading
Understanding the BERT Model
Basic idea of BERT
Working of BERT
Configurations of BERT
BERT-base
BERT-large
Other configurations of BERT
Pre-training the BERT model
Input data representation
Token embedding
Segment embedding
Position embedding
Final representation
WordPiece tokenizer
Pre-training strategies
Language modeling
Auto-regressive language modeling
Auto-encoding language modeling
Masked language modeling
Whole word masking
Next sentence prediction
Pre-training procedure
Subword tokenization algorithms
Byte pair encoding
Tokenizing with BPE
Byte-level byte pair encoding
WordPiece
Summary
Questions
Further reading
Getting Hands-On with BERT
Exploring the pre-trained BERT model
Extracting embeddings from pre-trained BERT
Hugging Face transformers
Generating BERT embeddings
Preprocessing the input
Getting the embedding
Extracting embeddings from all encoder layers of BERT
Extracting the embeddings
Preprocessing the input
Getting the embeddings
Fine-tuning BERT for downstream tasks
Text classification
Fine-tuning BERT for sentiment analysis
Importing the dependencies
Loading the model and dataset
Preprocessing the dataset
Training the model
Natural language inference
Question-answering
Performing question-answering with fine-tuned BERT
Preprocessing the input
Getting the answer
Named entity recognition
Summary
Questions
Further reading
Section 2 - Exploring BERT Variants
BERT Variants I - ALBERT, RoBERTa, ELECTRA, and SpanBERT
A Lite version of BERT
Cross-layer parameter sharing
Factorized embedding parameterization
Training the ALBERT model
Sentence order prediction
Comparing ALBERT with BERT
Extracting embeddings with ALBERT
Robustly Optimized BERT pre-training Approach
Using dynamic masking instead of static masking
Removing the NSP task
Training with more data points
Training with a large batch size
Using BBPE as a tokenizer
Exploring the RoBERTa tokenizer
Understanding ELECTRA
Understanding the replaced token detection task
Exploring the generator and discriminator of ELECTRA
Training the ELECTRA model
Exploring efficient training methods
Predicting span with SpanBERT
Understanding the architecture of SpanBERT
Exploring SpanBERT
Performing Q&As with pre-trained SpanBERT
Summary
Questions
Further reading
BERT Variants II - Based on Knowledge Distillation
Introducing knowledge distillation
Training the student network
DistilBERT – the distilled version of BERT
Teacher-student architecture
The teacher BERT
The student BERT
Training the student BERT (DistilBERT)
Introducing TinyBERT
Teacher-student architecture
Understanding the teacher BERT
Understanding the student BERT
Distillation in TinyBERT
Transformer layer distillation
Attention-based distillation
Hidden state-based distillation
Embedding layer distillation
Prediction layer distillation
The final loss function
Training the student BERT (TinyBERT)
General distillation
Task-specific distillation
The data augmentation method
Transferring knowledge from BERT to neural networks
Teacher-student architecture
The teacher BERT
The student network
Training the student network
The data augmentation method
Understanding the masking method
Understanding the POS-guided word replacement method
Understanding the n-gram sampling method
The data augmentation procedure
Summary
Questions
Further reading
Section 3 - Applications of BERT
Exploring BERTSUM for Text Summarization
Text summarization
Extractive summarization
Abstractive summarization
Fine-tuning BERT for text summarization
Extractive summarization using BERT
BERTSUM with a classifier
BERTSUM with a transformer and LSTM
BERTSUM with an inter-sentence transformer
BERTSUM with LSTM
Abstractive summarization using BERT
Understanding ROUGE evaluation metrics
Understanding the ROUGE-N metric
ROUGE-1
ROUGE-2
Understanding ROUGE-L
The performance of the BERTSUM model
Training the BERTSUM model
Summary
Questions
Further reading
Applying BERT to Other Languages
Understanding multilingual BERT
Evaluating M-BERT on the NLI task
Zero-shot
TRANSLATE-TEST
TRANSLATE-TRAIN
TRANSLATE-TRAIN-ALL
How multilingual is multilingual BERT?
Effect of vocabulary overlap
Generalization across scripts
Generalization across typological features
Effect of language similarity
Effect of code switching and transliteration
Code switching
Transliteration
M-BERT on code switching and transliteration
The cross-lingual language model
Pre-training strategies
Causal language modeling
Masked language modeling
Translation language modeling
Pre-training the XLM model
Evaluation of XLM
Understanding XLM-R
Language-specific BERT
FlauBERT for French
Getting a representation of a French sentence with FlauBERT
French Language Understanding Evaluation
BETO for Spanish
Predicting masked words using BETO
BERTje for Dutch
Next sentence prediction with BERTje
German BERT
Chinese BERT
Japanese BERT
FinBERT for Finnish
UmBERTo for Italian
BERTimbau for Portuguese
RuBERT for Russian
Summary
Questions
Further reading
Exploring Sentence and Domain-Specific BERT
Learning about sentence representation with Sentence-BERT
Computing sentence representation
Understanding Sentence-BERT
Sentence-BERT with a Siamese network
Sentence-BERT for a sentence pair classification task
Sentence-BERT for a sentence pair regression task
Sentence-BERT with a triplet network
Exploring the sentence-transformers library
Computing sentence representation using Sentence-BERT
Computing sentence similarity
Loading custom models
Finding a similar sentence with Sentence-BERT
Learning multilingual embeddings through knowledge distillation
Teacher-student architecture
Using the multilingual model
Domain-specific BERT
ClinicalBERT
Pre-training ClinicalBERT
Fine-tuning ClinicalBERT
Extracting clinical word similarity
BioBERT
Pre-training the BioBERT model
Fine-tuning the BioBERT model
BioBERT for NER tasks
BioBERT for question answering
Summary
Questions
Further reading
Working with VideoBERT, BART, and More
Learning language and video representations with VideoBERT
Pre-training a VideoBERT model
Cloze task
Linguistic-visual alignment
The final pre-training objective
Data source and preprocessing
Applications of VideoBERT
Predicting the next visual tokens
Text-to-video generation
Video captioning
Understanding BART
Architecture of BART
Noising techniques
Token masking
Token deletion
Token infilling
Sentence shuffling
Document rotation
Comparing different pre-training objectives
Performing text summarization with BART
Exploring BERT libraries
Understanding ktrain
Sentiment analysis using ktrain
Building a document answering model
Document summarization
bert-as-service
Installing the library
Computing sentence representation
Computing contextual word representation
Summary
Questions
Further reading
Assessments
Chapter 1, A Primer on Transformers
Chapter 2, Understanding the BERT Model
Chapter 3, Getting Hands-On with BERT
Chapter 4, BERT Variants I - ALBERT, RoBERTa, ELECTRA, and SpanBERT
Chapter 5, BERT Variants II – Based on Knowledge Distillation
Chapter 6, Exploring BERTSUM for Text Summarization
Chapter 7, Applying BERT to Other Languages
Chapter 8, Exploring Sentence and Domain-Specific BERT
Chapter 9, Working with VideoBERT, BART, and More
Other Books You May Enjoy
Leave a review - let other readers know what you think
Section 1 - Starting Off with BERT
In this section, we will familiarize ourselves with BERT. First, we will understand how the transformer works, and then we will explore BERT in detail. We will also get hands-on with BERT and learn how to use the pre-trained BERT model.
The following chapters are included in this section:
Chapter 1, A Primer on Transformers
Chapter 2, Understanding the BERT Model
Chapter 3, Getting Hands-On with BERT
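As a brief preview of the hands-on work in Chapter 3, here is a minimal sketch of loading a pre-trained BERT model and extracting contextual embeddings with the Hugging Face transformers library. It assumes a recent version of transformers with PyTorch installed; the example sentence and the printed shape check are purely illustrative:

from transformers import BertModel, BertTokenizer

# Download the pre-trained BERT-base model and its WordPiece tokenizer
model = BertModel.from_pretrained('bert-base-uncased')
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenize an example sentence and feed it through the encoder
inputs = tokenizer('I love Paris', return_tensors='pt')
outputs = model(**inputs)

# The final encoder layer yields one 768-dimensional vector per token
print(outputs.last_hidden_state.shape)

Chapter 3 builds on this workflow, covering how to extract embeddings from all encoder layers and how to fine-tune the model for downstream tasks such as sentiment analysis and question answering.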