Tokenization

We can tokenize the text at whatever level we need. The first step is to split the text into sentences, which this library makes easy:

text.sentences

This should return a list of Sentence objects, such as the following:

[Sentence("Twitter is one of the most important social media used in today's world."),
 Sentence("It provides the platform to share people's opinions, facts and information regarding person, place, animals or things."),
 Sentence("These tweets are used by several private, governmental and non-governmental organizations to mine different types of information including business intelligence.")]
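To make the idea of sentence tokenization concrete, here is a minimal sketch that uses only the Python standard library. It is a stand-in, not the library's own tokenizer: the helper name naive_sentences and the splitting rule (break after ".", "!" or "?" followed by whitespace) are assumptions made for illustration, and the sample string is assembled from the sentences shown in the output above.

```python
import re

# Sample text assembled from the sentences shown in the output above.
sample = (
    "Twitter is one of the most important social media used in today's world. "
    "It provides the platform to share people's opinions, facts and information "
    "regarding person, place, animals or things. "
    "These tweets are used by several private, governmental and "
    "non-governmental organizations to mine different types of information "
    "including business intelligence."
)


def naive_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (hypothetical helper)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop empty strings that could result from an empty input.
    return [part for part in parts if part]


sentences = naive_sentences(sample)
for sentence in sentences:
    print(sentence)
```

A rule this simple breaks on abbreviations such as "Dr. Smith", which is one reason to prefer a library tokenizer for real text.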

We can also list the individual words in any text:

text.words
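As a rough illustration of what word tokenization involves, the following standard-library sketch extracts word tokens and drops punctuation. The helper naive_words and its regular expression are assumptions for illustration only; the library's own tokenizer may treat possessives and contractions differently (this sketch keeps "people's" as a single token).

```python
import re


def naive_words(text):
    """Return word tokens, dropping punctuation (hypothetical helper)."""
    # Runs of letters or digits, optionally joined by internal apostrophes or hyphens.
    return re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)


words = naive_words(
    "It provides the platform to share people's opinions, facts and information."
)
print(words)
```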

Run the preceding snippet and examine its output. To see how often a particular word occurs in the text, look it up in word_counts:

text.word_counts['twitter']
1

This displays the number of times the given word occurs in the text.
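The lookup above uses the lowercase key 'twitter' even though the text contains "Twitter", which suggests that counting ignores case. A minimal sketch of that idea with the standard library's Counter follows. It is an approximation under that assumption, not the library's implementation, and the token pattern is chosen for illustration.

```python
import re
from collections import Counter

sample = "Twitter is one of the most important social media used in today's world."

# Lowercase before counting so that 'Twitter' and 'twitter' share one entry.
tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sample.lower())
counts = Counter(tokens)

print(counts["twitter"])  # 1
print(counts["missing"])  # 0, because a Counter returns 0 for absent keys
```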
