8.1 Introduction

We’ve introduced strings, basic string formatting and several string operators and methods. You saw that strings support many of the same sequence operations as lists and tuples, and that strings, like tuples, are immutable. Now, we take a deeper look at strings and introduce regular expressions and the re module, which we’ll use to match patterns in text. Regular expressions are particularly important in today’s data-rich applications. The capabilities you learn here will help you prepare for the “Natural Language Processing (NLP)” chapter and other key data science chapters. In the NLP chapter, we’ll look at other ways to have computers manipulate and even “understand” text. The table below shows many string-processing and NLP-related applications. In the Intro to Data Science section, we briefly introduce data cleaning/munging/wrangling with Pandas Series and DataFrames.

String and NLP applications
Anagrams
Automated grading of written homework
Automated teaching systems
Categorizing articles
Chatbots
Compilers and interpreters
Creative writing
Cryptography
Document classification
Document similarity
Document summarization
Electronic book readers
Fraud detection
Grammar checkers
Inter-language translation
Legal document preparation
Monitoring social media posts
Natural language understanding
Opinion analysis
Page-composition software
Palindromes
Parts-of-speech tagging
Project Gutenberg free books
Reading books, articles, documentation and absorbing knowledge
Search engines
Sentiment analysis
Spam classification
Speech-to-text engines
Spell checkers
Steganography
Text editors
Text-to-speech engines
Web scraping
Who authored Shakespeare’s works?
Word clouds
Word games
Writing medical diagnoses from x-rays, scans, blood tests
and many more…
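
As a small preview of the ideas recalled in the introduction, the sketch below shows a string behaving like other sequences, a string refusing modification because it is immutable, and the re module matching a pattern in text. The sample text, the variable names and the pattern are illustrative choices made for this sketch, not examples taken from the chapter.

```python
import re

# Sample text chosen for illustration only.
text = 'Sentiment analysis and spam classification'

# Strings support the same slicing operations as lists and tuples.
first_word = text[:9]

# Strings are immutable, so assigning to an element raises a TypeError.
try:
    text[0] = 's'
    mutated = True
except TypeError:
    mutated = False

# re.findall returns a list of all non-overlapping matches of a pattern.
# This pattern (an assumption for the sketch) matches words ending in 'tion'.
tion_words = re.findall(r'\b\w+tion\b', text)

print(first_word)
print(mutated)
print(tion_words)
```

The try/except is there only to make the immutability visible without stopping the script. In an interactive session you would simply see the TypeError.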