Transformers for Natural Language Processing
Second Edition
Build, train, and fine-tune deep neural network architectures for NLP with Python, Hugging Face, and OpenAI’s GPT-3, ChatGPT, and GPT-4
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Producer: Tushar Gupta
Acquisition Editor – Peer Reviews: Saby Dsilva
Project Editor: Janice Gonsalves
Content Development Editor: Bhavesh Amin
Copy Editor: Safis Editing
Technical Editor: Karan Sonawane
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Presentation Designer: Pranit Padwal
First published: January 2021
Second edition: March 2022
Production reference: 5270423
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80324-733-5
In less than four years, Transformers took the NLP community by storm, breaking every record achieved in the previous 30 years. Models such as BERT, T5, and GPT now constitute the fundamental building blocks for new applications in everything from computer vision to speech recognition to translation to protein sequencing to writing code. For this reason, Stanford has recently introduced the term foundation models to define a set of large language models based on giant pre-trained transformers. All of this progress is thanks to a few simple ideas.
This book is a reference for everyone interested in understanding how transformers work, both from a theoretical and from a practical perspective. The author does a tremendous job of explaining how to use transformers step by step with a hands-on approach. After reading this book, you will be ready to use this state-of-the-art set of techniques to empower your deep learning applications. In particular, this book gives a solid background on the architecture of transformers before covering, in detail, popular models such as BERT, RoBERTa, T5, and GPT-3. It also explains many use cases that transformers can cover: text summarization, image labeling, question answering, sentiment analysis, and fake news analysis.
If these topics interest you, then this is definitely a worthwhile book. The first edition has a permanent place on my desk, and the second edition will join it.
Antonio Gulli
Engineering Director for the Office of the CTO, Google
Denis Rothman graduated from Sorbonne University and Paris Diderot University, designing one of the first patented encoding and embedding systems. He authored one of the first patented AI cognitive robots and bots. He began his career delivering Natural Language Processing (NLP) chatbots for Moët et Chandon and an AI tactical defense optimizer for Airbus (formerly Aerospatiale). Denis then authored an AI resource optimizer for IBM and luxury brands, leading to an Advanced Planning and Scheduling (APS) solution used worldwide.
I want to thank the corporations that trusted me from the start to deliver artificial intelligence solutions and shared the risks of continuous innovation. I also want to thank my family, who always believed I would make it.
George is a Ph.D. candidate at the University of North Texas in the Department of Computer Science, where he also earned his master's degree in computer science. He received his bachelor's degree in electrical engineering in his home country, Romania.
He worked for 10 months at TCF Bank, where he helped build the machine learning operations framework for automatic model deployment and monitoring. He completed three internships at State Farm as a data scientist and machine learning engineer, and worked as a data scientist and machine learning engineer at the University of North Texas' High-Performance Computing Center for 2 years. He has been working in the research field of natural language processing for 5 years, spending the last 3 years working with transformer models. His research interests are in dialogue generation with persona.
He was a technical reviewer for the first edition of Transformers for Natural Language Processing by Denis Rothman.
He is currently working toward his doctoral thesis in causal dialogue generation with persona.
In his free time, George likes to share his knowledge of state-of-the-art language models with tutorials and articles and help other researchers in the field of NLP.
Join the book’s Discord workspace:
https://www.packt.link/Transformers