Preface

Authors are a quirky lot, almost like the weather in London. The sky is overcast, you want to go for a walk in Trafalgar Square, you wear your raincoat, pick up your umbrella just in case, and you think you are ready for anything. But you are woefully unaware of the sinister plan nature has for you. You walk a mile or so, and suddenly, without warning, the sky clears, the sun pours its brightest song upon your face, and lo and behold, you are caught unawares (like a deer in headlights) with your raincoat and umbrella, and you are too far from home to go back and get rid of them. This is exactly what happens to the best of us when we set out to write a book. You set out with a clear objective, focus your thoughts, write a fantastic outline, get it approved, and start formulating your chapters, but unbeknown to you, the book has other plans on how it wants to write itself.

When this happens, as in life, there are always choices. You can let the creative stream express itself through your hands onto the pages of the book, or you can resist and follow the preconceived pattern you laid out. There is, of course, also a third choice, which is to follow the overall structure for what you want to convey, but allow creativity to take control when it wants to. This is what we did for this book. But it was not as easy as we thought at first, because creativity doesn't take no for an answer. The famous Sufi poet Jalaluddin Rumi said: "In silence, there is eloquence. Stop weaving and see how the pattern improves." The most difficult part was to stop "weaving" or to stop being inspired by the content that we had already published as AWS authors. This was also a hard requirement for the book, and so it was a strong motivation for us to be creative and come up with original, in-demand, and fresh content for the book.

So, we stopped "weaving." The next logical step was for the pattern to improve. But nothing happened. The deadline for the first chapter was looming, and our editors were very politely reminding us of the due date. Still nada. We used this "no weaving" time to storyboard and architect the technical chapters, but the glue that was to hold the book together, the main narrative, continued to elude us. And then suddenly, one day, without warning, it struck. We had totally missed the important first part of Rumi's saying: "In silence, there is eloquence." A walk in nature on a nearby trail took care of the daily quota of silence, during which a faint thought appeared: a memory of a story called Ali Baba and the Forty Thieves that my father (Shri T. Rangarajan) had narrated to me when I was a kid. It dawned on me that the famous sequence from the story was in fact my first recollection of using voice to perform a task (please refer to Chapter 1, NLP in the Business Context and Introduction to AWS AI Services, in the book). And from then on, the floodgates opened. They never closed until the book was written in its entirety. And that is how this book came about.

An interesting fact about life we all know is that change is the only constant. And this was true when writing this book as well. One of the best things about AWS is the pace of innovation with which new features are introduced. The AWS product roadmap is based on direct customer feedback; existing features are improved iteratively and new ones are launched continuously. So, as we were writing this book, Amazon Comprehend and Amazon Textract added new features and changed their console experience. For example, Amazon Comprehend modified its console experience, added support for custom entity recognition training directly from PDF documents, and improved its custom entity recognition model framework to support training with just 100 annotations per entity and 250 documents. Amazon Textract reduced pricing by 32% for the AnalyzeDocument and DetectDocumentText APIs in eight global AWS Regions and announced support for the automated processing of invoices. A full list of what's new in AWS in 2021 can be reviewed at this link: https://aws.amazon.com/about-aws/whats-new/2021/.

You will notice these changes as you build the solutions for the various NLP use cases in this book. Please note that since the Amazon Textract and Amazon Comprehend consoles have changed, the instructions in the book may not be a word-for-word match with your experience in the AWS Management Console; however, they are accurate and adequate for your needs.

For example, the Train Recognizer button in the Amazon Comprehend console for custom entity recognition has now changed to Create new model. Similarly, Train Classifier in the Amazon Comprehend console for custom classification has now also changed to Create new model. When you specify Training and test dataset for custom entity recognition, a new option will now appear in the console for selecting PDF and Word documents. The Amazon Textract console has also changed, and it now shows AnalyzeExpense as an option for viewing the results for your document.

In the majority of the book, however, we have used APIs to build the solutions, and the best thing about AWS is that the APIs do not change: you get consistent requests and responses. You just need to upgrade the Python Boto3 library if you want to use the latest version. Moreover, our goal is to make sure this book remains relevant and up to date.
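As a minimal sketch of what building on the APIs rather than the console can look like, the following Python snippet calls the Amazon Textract DetectDocumentText API through Boto3 and keeps only the detected lines of text. The helper names extract_lines and detect_lines are our own illustrations and do not come from the book's repository; the snippet also assumes that your AWS credentials and Region are already configured.

```python
def extract_lines(response):
    """Return the text of every LINE block in a DetectDocumentText-style response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_lines(image_bytes, client=None):
    """Send raw image bytes to Amazon Textract and return the detected lines.

    `client` is optional so that a stub can be passed in for offline testing;
    when omitted, a real Textract client is created with Boto3.
    """
    if client is None:
        import boto3  # imported lazily; only needed for a real AWS call

        client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return extract_lines(response)
```

With credentials in place, a call such as detect_lines(open("page.png", "rb").read()) would return the lines found on the page, where page.png is a placeholder for one of your own documents.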

Who this book is for

If you're an NLP developer or data scientist looking to get started with AWS AI services to implement various NLP scenarios quickly, this book is for you. It will show you how easy it is to integrate AI in applications with just a few lines of code. A basic understanding of machine learning concepts is necessary to understand the concepts covered. Experience with Jupyter Notebooks and Python will be helpful.
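To give a feel for the "few lines of code" claim, here is a hedged sketch that asks Amazon Comprehend for the sentiment of a piece of text using the Boto3 detect_sentiment call. The function name dominant_sentiment and the sample sentence are our own placeholders, and the snippet assumes that AWS credentials and a Region are already configured.

```python
def dominant_sentiment(text, client=None, language_code="en"):
    """Return (label, confidence) for the dominant sentiment of `text`.

    `client` is optional so that a stub can be supplied for offline testing;
    when omitted, a real Comprehend client is created with Boto3.
    """
    if client is None:
        import boto3  # imported lazily; only needed for a real AWS call

        client = boto3.client("comprehend")
    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    label = response["Sentiment"]  # e.g. "POSITIVE"
    # SentimentScore keys are capitalized: Positive, Negative, Neutral, Mixed
    score = response["SentimentScore"][label.capitalize()]
    return label, score


if __name__ == "__main__":
    # Placeholder input; running this makes a real (billable) AWS call.
    print(dominant_sentiment("I love how quickly this service returns results."))
```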

What this book covers

Chapter 1, NLP in the Business Context and Introduction to AWS AI Services, introduces the NLP construct and the business value of using NLP, leading to an overview of the AWS AI stack along with the key NLP services.

Chapter 2, Introducing Amazon Textract, provides a detailed introduction to Amazon Textract, what its functions are, what business challenges it was created to solve, what features it has, what types of user requirements it can be applied to, and how easy it is to integrate Textract with other AWS services, such as AWS Lambda for building business applications.

Chapter 3, Introducing Amazon Comprehend, provides a detailed introduction to Amazon Comprehend, what its functions are, what business challenges it was created to solve, what features it has, what types of user requirements it can be applied to, and how easy it is to integrate Comprehend with other AWS services, such as AWS Lambda for building business applications.

Chapter 4, Automating Document Processing Workflows, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 5, Creating NLP Search, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 6, Using NLP to Improve Customer Service Efficiency, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 7, Understanding the Voice of Your Customer Analytics, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 8, Leveraging NLP to Monetize Your Media Content, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 9, Extracting Metadata from Financial Documents, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 10, Reducing Localization Costs with Machine Translation, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 11, Using Chatbots for Querying Documents, dives deep into the several types of use cases prevalent across industries that can benefit from NLP based on our collective experience and the usage trends we have observed. We will provide detailed code samples, a design and development approach, and a step-by-step guide on how to set up and run these examples along with access to the GitHub repository.

Chapter 12, AI and NLP in Healthcare, dives deep into how AWS NLP solutions can help achieve operational efficiency in healthcare, using an automated claims adjudication use case.

Chapter 13, Improving the Accuracy of Document Processing Workflows, talks about why we need humans in the loop (HITLs) in document processing workflows, and how setting up HITL processes with Amazon Augmented AI (A2I) can help improve the accuracy of your existing document processing workflows with Amazon Textract.

Chapter 14, Auditing Named Entity Recognition Workflows, walks through an extension of the previous approach by including Amazon Comprehend for text-based insights, thereby demonstrating an end-to-end process for setting up an auditing workflow for your custom named entity recognition use cases.

Chapter 15, Classifying Documents and Setting up Human in the Loop for Active Learning, talks about how you can use Amazon Comprehend custom classification to classify documents and then how you can set up active learning feedback with your custom classification model using Amazon A2I.

Chapter 16, Improving the Accuracy of PDF Batch Processing, tackles PDF batch processing, an operational need that has been around for a while and is ubiquitous, yet one that organizations struggle to address efficiently.

Chapter 17, Visualizing Insights from Handwritten Content, is all about how to visualize insights from handwritten text and make use of them to drive decision-making.

Chapter 18, Building Secure, Reliable, and Efficient NLP Solutions, reviews the best practices, techniques, and guidance on what makes a good NLP solution great.

To get the most out of this book

You will need access to an AWS account, so before getting started, we recommend that you create one.

If you are using the digital version of this book, we advise you to type the code yourself. Doing so will help you avoid any potential errors related to the copying and pasting of code.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Natural-Language-Processing-with-AWS-AI-Services. If there's an update to the code, it will be made in the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801812535_ColorImages.pdf.

Code in Action

The Code in Action videos for this book can be viewed at https://bit.ly/3vPvDkj.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Copy the created bucket name, open Chapter 05/Ch05-Kendra Search.ipynb, and paste it in the following cell in place of '<your s3 bucket name>' to get started."

A block of code is set as follows:

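# Import the SageMaker SDK and the helper that looks up the execution role
import sagemaker
from sagemaker import get_execution_role

# Define IAM role
role = get_execution_role()
print("RoleArn: {}".format(role))
sess = sagemaker.Session()
s3BucketName = '<your s3 bucket name>'
prefix = 'chapter5'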

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

    <body>
        <h1>Family Bank Holdings</h1>
        <h3>Date: <span id="date"></span></h3>
        <div id="home">
          <div id="hometext">
            <h2>Who we are and what we do</h2>

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "You will see that the page has a few headings and then a paragraph talking about Family Bank, a subsidiary of LiveRight Holdings."

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you've read Natural Language Processing with AWS AI Services, we'd love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
