Using a linear model on extracted RBM components

Even with the optimal number of PCA components, the PCA pipeline barely improved on the accuracy of logistic regression alone. Let's see how our RBM does. To build the following pipeline, we will keep the same parameters for the logistic regression model and search for the best number of components among 10, 100, and 200 (as we did for the PCA pipeline). Note that we could try to expand the number of features past the number of raw pixels (784), but we will not attempt that here.

We begin the same way by setting up our variables:

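from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

# Use the RBM to learn new features
rbm = BernoulliRBM(random_state=0)

# set up the params for our pipeline
# (the 'step__parameter' keys address the named pipeline steps below)
params = {'clf__C': [1e-1, 1e0, 1e1],
          'rbm__n_components': [10, 100, 200]
          }

# create our pipeline; lr is the logistic regression model defined earlier
pipeline = Pipeline([('rbm', rbm), ('clf', lr)])

# instantiate a gridsearch class
grid = GridSearchCV(pipeline, params)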
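
To make the size of this search concrete, here is a minimal sketch of the candidate settings that an exhaustive grid search evaluates. It uses only the standard library and the same params dictionary as above; the names and candidates variables are introduced purely for illustration and are not part of scikit-learn.

```python
from itertools import product

# Same grid as in the text: three values of C and three component counts.
params = {
    'clf__C': [1e-1, 1e0, 1e1],
    'rbm__n_components': [10, 100, 200],
}

# An exhaustive grid search tries every combination of the listed values.
names = sorted(params)
candidates = [
    dict(zip(names, values))
    for values in product(*(params[name] for name in names))
]

for candidate in candidates:
    print(candidate)

print(len(candidates))  # 3 values x 3 values = 9 candidate settings
```

Each candidate is then scored with cross-validation, so the total number of model fits is the number of candidates multiplied by the number of folds.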

Fitting this grid search to our raw pixels will reveal the optimal number of components:

# fit to our data
grid.fit(images_X, images_y)

# check the best params
grid.best_params_, grid.best_score_

({'clf__C': 1.0, 'rbm__n_components': 200}, 0.91766666666666663)

Our RBM pipeline, with a cross-validated accuracy of about 91.77%, extracted 200 new features from our digits and gave us a boost of roughly three percentage points in accuracy (which is a lot!), simply by adding the BernoulliRBM module to our pipeline.

Because 200 was the largest value in our grid, the fact that it was selected suggests that we might obtain even higher performance by extracting more than 200 components. We leave this as an exercise for the reader.
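
One possible starting point for that exercise is a wider grid. The sketch below is only a suggestion: the step size of 100 and the decision to stop at the raw pixel count (784) are assumptions made for illustration, and wider_params is a hypothetical name. The resulting dictionary could be passed to GridSearchCV in place of params.

```python
# Number of raw pixels per image, as stated in the text.
N_PIXELS = 784

# Assumed candidates: start at the previous best (200) and go up in
# steps of 100 without exceeding the number of raw pixels.
component_candidates = list(range(200, N_PIXELS + 1, 100))

# Hypothetical wider grid; C stays fixed at the value the first search chose.
wider_params = {
    'clf__C': [1e0],
    'rbm__n_components': component_candidates,
}

print(wider_params['rbm__n_components'])
```

Keeping 200 in the list makes the new scores directly comparable with the earlier result. Expect the search to take noticeably longer, since larger RBMs are slower to train.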

This is evidence that feature learning algorithms work very well on complex tasks such as image recognition, audio processing, and natural language processing. These large and interesting datasets have hidden components that are difficult for linear transformations like PCA or LDA to extract, but that non-parametric algorithms like the RBM can uncover.
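
To see why the extracted features are not a purely linear function of the pixels, consider how an RBM turns an input into features: each new feature is the probability that a hidden unit is active, computed by passing a weighted sum of the inputs through a sigmoid. The following is a minimal pure-Python sketch of that computation. The weights, biases, and input are made-up toy values, and rbm_hidden_probabilities is a hypothetical helper, not a scikit-learn function.

```python
import math


def sigmoid(x):
    """Logistic function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rbm_hidden_probabilities(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) for each hidden unit j.

    weights[j][i] connects visible unit i to hidden unit j.
    """
    return [
        sigmoid(sum(w * v for w, v in zip(row, visible)) + bias)
        for row, bias in zip(weights, hidden_bias)
    ]


# Toy example: 4 binary "pixels" mapped to 2 extracted features.
visible = [1, 0, 1, 1]
weights = [
    [0.5, -0.2, 0.1, 0.4],
    [-0.3, 0.8, -0.5, 0.2],
]
hidden_bias = [0.0, 0.1]

features = rbm_hidden_probabilities(visible, weights, hidden_bias)
print(features)
```

A PCA projection would stop at the weighted sum; the sigmoid applied on top of it is what makes the RBM's features a nonlinear function of the input.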
