How to do it...

  1. Read in the Texas cities dataset, and identify the variables:
>>> import pandas as pd
>>> cities = pd.read_csv('data/texas_cities.csv')
>>> cities
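  2. The City column looks good: each cell holds exactly one value. The Geolocation column, on the other hand, packs four variables into each cell: latitude, latitude direction, longitude, and longitude direction. Let's split it into four separate columns. Because the pattern '. ' is longer than one character, pandas treats it as a regular expression (any single character followed by a space):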
>>> geolocations = cities.Geolocation.str.split(pat='. ',
...                                             expand=True)
>>> geolocations.columns = ['latitude', 'latitude direction',
...                         'longitude', 'longitude direction']
>>> geolocations
  3. Because the Geolocation column originally had the object data type, all the new columns are objects as well. Let's change latitude and longitude into floats:
>>> geolocations = geolocations.astype({'latitude': 'float',
...                                     'longitude': 'float'})
>>> geolocations.dtypes
latitude               float64
latitude direction      object
longitude              float64
longitude direction     object
dtype: object
  4. Concatenate these new columns with the City column from the original DataFrame:
>>> cities_tidy = pd.concat([cities['City'], geolocations],
...                         axis='columns')
>>> cities_tidy
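
The four steps above can also be run as one plain script. The following is a minimal sketch rather than the recipe's own code: the three sample rows and the '<lat>° <N/S>, <lon>° <E/W>' layout of the Geolocation strings are assumptions standing in for data/texas_cities.csv, whose contents are not reproduced here.

```python
import pandas as pd

# Hypothetical sample rows standing in for data/texas_cities.csv.
# The "<lat>° <N/S>, <lon>° <E/W>" layout is an assumption.
cities = pd.DataFrame({
    'City': ['Houston', 'Dallas', 'Austin'],
    'Geolocation': ['29.7604° N, 95.3698° W',
                    '32.7767° N, 96.7970° W',
                    '30.2672° N, 97.7431° W'],
})

# '. ' is longer than one character, so pandas treats it as a regular
# expression: any single character followed by a space. With the assumed
# layout it matches both '° ' and ', '.
geolocations = cities['Geolocation'].str.split(pat='. ', expand=True)
geolocations.columns = ['latitude', 'latitude direction',
                        'longitude', 'longitude direction']

# The split pieces are strings; convert only the numeric ones.
geolocations = geolocations.astype({'latitude': 'float',
                                    'longitude': 'float'})

# Put the untouched City column back next to the four new columns.
cities_tidy = pd.concat([cities['City'], geolocations], axis='columns')
print(cities_tidy)
```

If the real file separates the pieces differently, only the pat argument should need to change; the renaming, type conversion, and concatenation steps stay the same.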