There's more...

Did you think I'd let you leave this chapter without understanding the data itself? The following page gives some extra details about the dataset we just analyzed:

https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names

This file describes the basic details of the data: its attributes, the values they can take, and the class labels. As you'll see when you look through it, the mapping we produce in code matches the representation shown in this file. And that's the point! If we have done our job correctly, we should be able to create a mapping in code that lets us freely manipulate the data to produce models while still preserving details such as the class labels.
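To make the idea of a reversible mapping concrete, here is a minimal sketch using pandas category codes. It is not the recipe's own code: the tiny DataFrame is hand-written for illustration (the column names and values only mimic entries listed in adult.names), and the `mappings`, `encoded`, and `decoded` names are hypothetical.

```python
import pandas as pd

# Hand-written sample rows; illustrative only, not the real adult dataset.
df = pd.DataFrame({
    "workclass": ["Private", "State-gov", "Private", "Self-emp-inc"],
    "income": ["<=50K", ">50K", "<=50K", ">50K"],
})

mappings = {}          # column name -> {integer code: original label}
encoded = df.copy()

for col in df.columns:
    cat = df[col].astype("category")
    # Remember which integer code stands for which label.
    mappings[col] = dict(enumerate(cat.cat.categories))
    # Replace the strings with their integer codes.
    encoded[col] = cat.cat.codes

# Because the mapping was kept, the original labels can be recovered.
decoded = encoded.copy()
for col, mapping in mappings.items():
    decoded[col] = encoded[col].map(mapping)

print(mappings["income"])
print(encoded)
```

Keeping the `mappings` dictionary alongside the encoded data is what allows the class labels to be restored after the data has been manipulated or passed through a model.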

You've also learned about a new library, pandas, which is commonly used in deep learning and data science. More details on the pandas library can be found here:

http://pandas.pydata.org/pandas-docs/stable/

Along with pandas, we also learned about categorical variables and one-hot encoding. Some algorithms can handle categorical variables out of the box, but the majority of algorithms you will be exposed to need an encoding applied to the data first. Here are a few more details on a different method for one-hot encoding:

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html
