We can tokenize the text in whatever way our analysis requires. The first step is to split the text into individual sentences, which this library makes easy:
text.sentences
The output should be a list of sentences, such as the following:
[Sentence("Twitter is one of the most important social media used in today's world."), Sentence("It provides the platform to share people's opinions, facts and information regarding person, place, animals or things."), Sentence("These tweets are used by several private, governmental and non-governmental organizations to mine different types of information including business intelligence.")]
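To build some intuition for what sentence tokenization involves, here is a minimal sketch that uses only the Python standard library. It is not the library's own tokenizer: the `split_sentences` helper is a hypothetical name introduced for illustration, and the sample text is reconstructed from the output shown above. The rule it applies (break after `.`, `!`, or `?` followed by whitespace) is deliberately naive and would mishandle abbreviations such as "e.g.".

```python
import re

# Sample text reconstructed from the sentences shown in the output above.
text = (
    "Twitter is one of the most important social media used in today's world. "
    "It provides the platform to share people's opinions, facts and information "
    "regarding person, place, animals or things. "
    "These tweets are used by several private, governmental and non-governmental "
    "organizations to mine different types of information including business intelligence."
)


def split_sentences(raw):
    """Naively split raw text into sentences (hypothetical helper).

    Breaks after '.', '!' or '?' when followed by whitespace. A real
    tokenizer handles abbreviations, quotes, and other edge cases.
    """
    parts = re.split(r"(?<=[.!?])\s+", raw)
    return [part.strip() for part in parts if part.strip()]


sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
```

On this sample the sketch yields the same three sentences, but only because the text contains no abbreviations or other tricky punctuation.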
We can also list the individual words in the text:
text.words
Run the preceding snippet and study its output. To see how often a particular word occurs, you can do the following:
text.word_counts['twitter']
1
This displays how many times the given word occurs in the text. Here the lowercase key twitter matches the single occurrence of Twitter in the text.
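The word-frequency lookup can be approximated with the standard library alone, which may help clarify what such a count represents. The sketch below is an assumption-laden stand-in, not the library's implementation: it lowercases the text, extracts word-like tokens with a simple regular expression, and tallies them with `collections.Counter`. The tokenization rule is a simplification, so counts for hyphenated or punctuated words may differ from what the library reports.

```python
import re
from collections import Counter

# Sample text reconstructed from the sentences shown earlier in this section.
text = (
    "Twitter is one of the most important social media used in today's world. "
    "It provides the platform to share people's opinions, facts and information "
    "regarding person, place, animals or things. "
    "These tweets are used by several private, governmental and non-governmental "
    "organizations to mine different types of information including business intelligence."
)

# Lowercase first so that "Twitter" and "twitter" are counted together,
# then pull out runs of letters, digits, and apostrophes as words.
words = re.findall(r"[a-z0-9']+", text.lower())
word_counts = Counter(words)

print(word_counts["twitter"])      # 1
print(word_counts["information"])  # 2
print(word_counts["missing"])      # 0, Counter returns 0 for unseen words
```

Using a `Counter` means that looking up a word that never appears returns 0 rather than raising a `KeyError`, which is convenient when checking arbitrary words.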