Naive Bayes (nbBag) bagging implementation

We will now implement nbBag by running the following code:

# setting up parameters to build the naive Bayes bagging model
# (assumes caret is loaded and that mydata and cvcontrol were created earlier)
bagctrl <- bagControl(fit = nbBag$fit,
                      predict = nbBag$pred,
                      aggregate = nbBag$aggregate)
# fit the bagged nb model
set.seed(300)
nbbag <- train(Attrition ~ ., data = mydata, method = "bag",
               trControl = cvcontrol, bagControl = bagctrl)
# printing the model results
nbbag

This will result in the following output:

Bagged Model  

1470 samples
30 predictors
2 classes: 'No', 'Yes'

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 1324, 1324, 1323, 1323, 1323, 1323, ...
Resampling results:

  Accuracy   Kappa
  0.8389878  0.00206872

Tuning parameter 'vars' was held constant at a value of 44

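In this case, we achieved an accuracy of only 83.90%, which is slightly below the KNN model's 84%. Note also that the Kappa value of about 0.002 is close to zero, which suggests that, despite the high accuracy, the bagged model agrees with the true labels little more than chance would.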
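To see how a high accuracy can sit alongside a Kappa near zero, the following minimal sketch computes both measures from a confusion matrix in base R. The counts are hypothetical and are not taken from our attrition data, and cohen_kappa is a helper name introduced only for this illustration:

```r
# Cohen's Kappa from a confusion matrix (rows = predicted, columns = actual)
cohen_kappa <- function(cm) {
  n <- sum(cm)
  observed <- sum(diag(cm)) / n                     # plain accuracy
  expected <- sum(rowSums(cm) * colSums(cm)) / n^2  # agreement expected by chance
  (observed - expected) / (1 - expected)
}

# Hypothetical counts: a model that predicts "No" for almost every sample
cm <- matrix(c(1232, 1,
               236, 1),
             nrow = 2,
             dimnames = list(predicted = c("No", "Yes"),
                             actual = c("No", "Yes")))

accuracy <- sum(diag(cm)) / sum(cm)
kappa_value <- cohen_kappa(cm)

print(round(accuracy, 4))
print(round(kappa_value, 4))
```

When one class dominates, a model that almost always predicts that class scores well on accuracy, while Kappa, which discounts the agreement expected by chance, stays close to zero.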

Although we have shown only three examples of the caret methods for bagging, the same code pattern applies to the other methods. The only change needed is to replace the fit, predict, and aggregate arguments in bagControl. For example, to implement bagging with a neural network algorithm, we need to define bagControl as follows:

bagControl(fit = nnetBag$fit, predict = nnetBag$pred, aggregate = nnetBag$aggregate)
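To make the three roles of fit, predict, and aggregate more concrete, here is a rough base R sketch that bags a logistic regression by hand. It is not caret's internal code: fit_fun, pred_fun, and aggregate_fun are hypothetical stand-ins, the data is synthetic, and averaging probabilities is just one possible way to aggregate:

```r
# Hypothetical stand-ins for the three roles; not caret's implementation
fit_fun <- function(x, y) {
  # fit one base learner (logistic regression) on a bootstrap sample
  glm(y ~ ., data = cbind(x, y = y), family = binomial)
}
pred_fun <- function(object, x) {
  # probability of the second factor level ("Yes") for each row
  predict(object, newdata = x, type = "response")
}
aggregate_fun <- function(prob_list, lev) {
  # average the per-model probabilities, then threshold at 0.5
  avg <- Reduce(`+`, prob_list) / length(prob_list)
  factor(ifelse(avg > 0.5, lev[2], lev[1]), levels = lev)
}

# Synthetic two-class data, used only for illustration
set.seed(300)
n <- 200
x <- data.frame(a = rnorm(n), b = rnorm(n))
y <- factor(ifelse(x$a + x$b + rnorm(n) > 0, "Yes", "No"),
            levels = c("No", "Yes"))

# fit 10 base learners, each on its own bootstrap sample
models <- lapply(seq_len(10), function(i) {
  idx <- sample(n, replace = TRUE)
  fit_fun(x[idx, , drop = FALSE], y[idx])
})

# predict with every learner and combine the predictions
probs <- lapply(models, pred_fun, x = x)
bagged <- aggregate_fun(probs, levels(y))
print(mean(bagged == y))  # accuracy on the data used for fitting
```

Swapping the base learner then amounts to changing only these three functions, which mirrors what replacing the bagControl arguments does in the caret code.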

Note that the package behind each method must be available in R for caret to run it; otherwise, the call results in an error. For example, nbBag requires the klaR package to be installed on the system prior to executing the code. Similarly, ctreeBag needs the party package to be installed. Users need to check that the appropriate package is available on the system before using a method with caret bagging.
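One way to perform this check up front is sketched below using base R's requireNamespace. The helper name missing_packages is our own and is not part of caret:

```r
# Return the names of the packages in pkgs that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

# klaR backs nbBag and party backs ctreeBag
needed <- c("klaR", "party")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Install before running the bagging code: ",
          paste(to_install, collapse = ", "))
}
```

Any package reported as missing can then be installed with install.packages before the bagging code is run.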

We now have an understanding of how to implement a project using the bagging technique. The next subsection covers the underlying working mechanism of bagging. This will clarify what bagging did internally with our dataset to produce better performance measurements than a stand-alone model.
