Use pandas to read text files in Jupyter

The most common type of text file that holds analysis data is a CSV file. A large variety of datasets is available on the internet in this format. We will look at the Titanic survivor data found at https://vincentarelbundock.github.io/Rdatasets/csv/datasets/Titanic.csv.

Like most of pandas, the function call is very easy to use:

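import pandas as pd
# Read the CSV straight from the URL into a data frame
df = pd.read_csv('https://vincentarelbundock.github.io/Rdatasets/csv/datasets/Titanic.csv')
# head is a method, so it needs parentheses; it returns the first five rows by default
print(df.head())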

However, again like much of pandas, the read_csv function accepts an extensive set of optional parameters. They default to the most commonly used settings, so short code like the preceding example is enough to get our work done. Some of the additional parameters allow us to:

  • Skip rows
  • Skip/define column headings
  • Change the index field(s) (pandas always keeps an index on a data frame to speed access)
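The options in the list above can be sketched as follows. This is a minimal illustration, not the only way to use them: the CSV text, its column names, and the replacement headings are all made up for the example, and the text is read from an in-memory string rather than from the Titanic URL so that it runs without a network connection. The parameters themselves (skiprows, header, names, and index_col) are standard read_csv arguments.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a downloaded file; the first line
# is a junk row that we want to skip.
raw = """# made-up sample for illustration
id,Name,Age
1,Ann,29
2,Bob,41
3,Cy,35
"""

# skiprows=1 drops the leading junk line, so the next line becomes the header.
df_skip = pd.read_csv(io.StringIO(raw), skiprows=1)

# header=0 together with names= replaces the file's own headings with ours.
df_named = pd.read_csv(
    io.StringIO(raw),
    skiprows=1,
    header=0,
    names=["passenger_id", "name", "age"],
)

# index_col uses the "id" column as the index instead of the default 0..n-1 range.
df_indexed = pd.read_csv(io.StringIO(raw), skiprows=1, index_col="id")

print(df_skip.columns.tolist())
print(df_named.columns.tolist())
print(df_indexed.index.tolist())
```

With an index set this way, a row can be looked up by its label, for example df_indexed.loc[2].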

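The resulting script execution under Jupyter is shown in the following screenshot. (Note: I am limiting the printed rows with head. Called as df.head(), it returns the first five rows by default; printed without parentheses, as df.head, it instead shows the bound method along with pandas' truncated view of the table, the first and last 30 rows in my run.):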
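To control how many rows are shown explicitly, head and tail both take a row count. The following is a small sketch under stated assumptions: the data frame is made up for illustration and stands in for the Titanic table.

```python
import pandas as pd

# Made-up stand-in for the Titanic table: 100 rows, one column.
df = pd.DataFrame({"value": range(100)})

first_rows = df.head(30)    # first 30 rows
last_rows = df.tail(30)     # last 30 rows
default_rows = df.head()    # no argument: first 5 rows

print(first_rows.shape, last_rows.shape, default_rows.shape)
```

In a Jupyter cell, leaving a data frame as the last expression (without print) renders it as a formatted table instead of plain text.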
