Using pandas vectorized string functions

String formatting is best demonstrated on a dataset that is a little messier. We will use a dataset that I collected for a review paper during my Ph.D. research. It can be found here: https://raw.githubusercontent.com/sureshHARDIYA/phd-resources/master/Data/Review%20Paper/preprocessed.csv

  1. Let's start by importing the required libraries, as follows:
import numpy as np
import pandas as pd
import os

  2. Next, let's read the CSV file, keep only the TITLE column, and check its structure by printing its shape and the last 10 items, as follows:
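# Load the CSV file directly from GitHub
text = pd.read_csv("https://raw.githubusercontent.com/sureshHARDIYA/phd-resources/master/Data/Review%20Paper/preprocessed.csv")
# Keep only the TITLE column, which gives a Series of strings
text = text["TITLE"]
print(text.shape)
print(text.tail(10))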
  3. The output of the preceding code can be seen in the following screenshot:

Figure 1: This is the output of the preceding code

pandas provides vectorized versions of Python's built-in string functions, available through the Series.str accessor, that operate on an entire series of strings at once. In the next section, we are going to use the same dataset with pandas string functions.
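To get a feel for what these functions look like before moving on, here is a minimal sketch. It does not use the review paper dataset; the titles Series below is a small, made-up stand-in for the TITLE column, and the variable names are only illustrative. It shows three of the vectorized methods on the str accessor: strip, lower, len, and contains.

```python
import pandas as pd

# Hypothetical messy titles (not taken from the real dataset),
# including stray whitespace, mixed case, and a missing value
titles = pd.Series(["  Mobile Health apps ", "a REVIEW of e-health", None])

# Each .str method is applied to every element of the Series;
# missing values are passed through instead of raising an error
cleaned = titles.str.strip().str.lower()

# Length of each original string, whitespace included
lengths = titles.str.len()

# Case-insensitive substring test; na=False treats missing values as no match
has_health = titles.str.contains("health", case=False, na=False)

print(cleaned)
print(lengths)
print(has_health)
```

Because the methods return a new Series, they can be chained, as with strip followed by lower above. The same calls should work on the text Series loaded earlier, assuming the TITLE column holds strings.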
