Loading an example dataset

The scikit-learn project comes with a number of datasets and sample images with which we can experiment. In this recipe, we will load an example dataset that is included with the scikit-learn distribution. Each dataset holds its data as a two-dimensional NumPy array, together with metadata linked to the data.

How to do it...

We will load a sample dataset of Boston house prices. It is a tiny dataset, so if you are looking for a house in Boston, don't get too excited. More datasets are described at http://scikit-learn.org/dev/modules/classes.html#module-sklearn.datasets.

We will look at the shape of the raw data, and its maximum and minimum values. The shape is a tuple representing the dimensions of the NumPy array. We will do the same for the target array, which contains the values that are the learning objectives. The following code accomplishes our goals:

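from sklearn import datasets

# Note: load_boston was removed in scikit-learn 1.2, so this needs an earlier release.
boston_prices = datasets.load_boston()

# Shape and value range of the raw data (one row per sample)
print("Data shape", boston_prices.data.shape)
print("Data max=%s min=%s" % (boston_prices.data.max(), boston_prices.data.min()))

# Shape and value range of the target array (the learning objectives)
print("Target shape", boston_prices.target.shape)
print("Target max=%s min=%s" % (boston_prices.target.max(), boston_prices.target.min()))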

And the outcome of our program is as follows:

Data shape (506, 13)
Data max=711.0 min=0.0
Target shape (506,)
Target max=50.0 min=5.0
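
To make the two shape tuples in this output easier to read, here is a minimal sketch that uses a small, made-up NumPy array instead of the real dataset. The names data and target and all the values in them are assumptions chosen for illustration only; they are not part of scikit-learn. The sketch shows that a two-dimensional array reports a shape of (rows, columns), that a one-dimensional array reports a one-element tuple, and that max() and min() without arguments reduce over the whole array.

```python
import numpy as np

# Hypothetical stand-in for a dataset: 4 samples (rows) by 3 features (columns).
data = np.array([
    [0.0, 18.0, 2.3],
    [0.1, 0.0, 7.1],
    [0.2, 12.5, 711.0],
    [0.3, 0.0, 4.0],
])

# One target value per sample, so the target array is one-dimensional.
target = np.array([24.0, 21.6, 50.0, 5.0])

# A 2-D array has a two-element shape tuple: (rows, columns).
print("Data shape", data.shape)

# A 1-D array has a one-element shape tuple, written with a trailing comma.
print("Target shape", target.shape)

# Without an axis argument, max() and min() look at every element.
print("Data max=%s min=%s" % (data.max(), data.min()))

# With axis=0, the reduction runs down the rows and gives one value per column.
print("Per-feature max", data.max(axis=0))
```

Because the single overall maximum and minimum mix features measured on very different scales, the per-column form with axis=0 is often the more informative one when inspecting a real dataset.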