Execute the following steps to train the advanced classifiers.
- Import the libraries:
from sklearn.ensemble import (RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.pipeline import Pipeline
from xgboost.sklearn import XGBClassifier
from lightgbm import LGBMClassifier
- Define and fit the Random Forest pipeline:
rf = RandomForestClassifier(random_state=42)
rf_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('classifier', rf)
                              ])
rf_pipeline.fit(X_train, y_train)
rf_perf = performance_evaluation_report(rf_pipeline, X_test,
y_test, labels=LABELS,
show_plot=True,
show_pr_curve=True)
The performance of the Random Forest can be summarized by the following plot:
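The steps above assume that `preprocessor` and the train/test split were created earlier in the chapter. As a self-contained illustration, the following sketch builds a minimal stand-in preprocessor (a `StandardScaler` inside a `ColumnTransformer` — an assumption, not necessarily the one defined earlier) and fits the Random Forest pipeline on synthetic data:

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the recipe's dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Minimal stand-in preprocessor: scale every (numeric) column
preprocessor = ColumnTransformer(
    transformers=[('num', StandardScaler(), list(range(X.shape[1])))])

rf_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('classifier',
                               RandomForestClassifier(random_state=42))])
rf_pipeline.fit(X_train, y_train)
print(f"Test accuracy: {rf_pipeline.score(X_test, y_test):.3f}")
```

Bundling the preprocessor with the classifier in one `Pipeline` guarantees that exactly the same transformations are applied at fit and predict time.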
- Define and fit the Gradient Boosted Trees pipeline:
gbt = GradientBoostingClassifier(random_state=42)
gbt_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                               ('classifier', gbt)
                               ])
gbt_pipeline.fit(X_train, y_train)
gbt_perf = performance_evaluation_report(gbt_pipeline, X_test,
y_test, labels=LABELS,
show_plot=True,
show_pr_curve=True)
The performance of the Gradient Boosted Trees can be summarized by the following plot:
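`performance_evaluation_report` is a helper defined earlier in the chapter. If it is not at hand, the scalar summaries of the two curves it draws can be reproduced directly with scikit-learn's metrics; a minimal sketch on synthetic data (the choice of metrics is an assumption about what the helper reports):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (average_precision_score,
                             classification_report, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

gbt = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)
y_score = gbt.predict_proba(X_test)[:, 1]

# ROC AUC summarizes the ROC curve; average precision the PR curve
roc_auc = roc_auc_score(y_test, y_score)
avg_prec = average_precision_score(y_test, y_score)
print(f"ROC AUC: {roc_auc:.3f} | Average precision: {avg_prec:.3f}")
print(classification_report(y_test, gbt.predict(X_test)))
```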
- Define and fit an XGBoost pipeline:
xgb = XGBClassifier(random_state=42)
xgb_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                               ('classifier', xgb)
                               ])
xgb_pipeline.fit(X_train, y_train)
xgb_perf = performance_evaluation_report(xgb_pipeline, X_test,
y_test, labels=LABELS,
show_plot=True,
show_pr_curve=True)
The performance of XGBoost can be summarized by the following plot:
- Define and fit the LightGBM pipeline:
lgbm = LGBMClassifier(random_state=42)
lgbm_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
('classifier', lgbm)
])
lgbm_pipeline.fit(X_train, y_train)
lgbm_perf = performance_evaluation_report(lgbm_pipeline, X_test,
y_test, labels=LABELS,
show_plot=True,
show_pr_curve=True)
The performance of LightGBM can be summarized by the following plot:
From the reports, the shapes of the ROC and Precision-Recall curves are very similar across all the models.