The GridSearchCV result stores the average cross-validation scores so that we can analyze how different hyperparameter settings affect the outcome.
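As a minimal sketch of how these stored scores can be accessed, the following example runs a small grid search and loads the `cv_results_` attribute into a pandas DataFrame. The synthetic dataset, the tiny parameter grid, and the variable names are assumptions chosen so the example runs quickly; they are not the settings used in the experiment described here.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption); the experiment uses its own dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=42)

# Deliberately tiny, hypothetical grid to keep the runtime short.
param_grid = {
    "learning_rate": [0.01, 0.1],
    "max_depth": [2, 4],
    "max_features": ["sqrt", 1.0],
}

gs = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=42),
    param_grid=param_grid,
    scoring="roc_auc",  # AUC, averaged over the cross-validation folds
    cv=3,
)
gs.fit(X, y)

# cv_results_ is a dict of arrays with one entry per parameter combination.
results = pd.DataFrame(gs.cv_results_)
param_cols = [c for c in results.columns if c.startswith("param_")]
summary = (
    results[param_cols + ["mean_test_score", "std_test_score"]]
    .sort_values("mean_test_score", ascending=False)
)
print(summary.head())
```

Each row of `summary` corresponds to one hyperparameter combination, so the table can be grouped or plotted by any of the `param_` columns.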
The six seaborn swarm plots in the left-hand panel of the chart below show the distribution of AUC test scores for each value of each parameter. In this case, the highest AUC test scores required a low learning_rate and a large value for max_features. Some parameter settings, such as a low learning_rate, produce a wide range of outcomes, depending on the complementary settings of the other parameters. Other parameters are compatible with high scores for all settings used in the experiment:
We will now explore how hyperparameter settings jointly affect the mean cross-validation score. To gain insight into how parameter settings interact, we can train a DecisionTreeRegressor with the mean test score as the outcome and the parameter settings as features, encoded as categorical variables in one-hot or dummy format (see the notebook for details). The tree structure highlights that using all features (max_features_1), a low learning_rate, and a max_depth greater than three led to the best results, as shown in the following diagram:
The bar chart in the right-hand panel of the first chart in this section displays the influence of the hyperparameter settings in producing different outcomes, measured by their feature importance for a decision tree that is grown to its maximum depth. Naturally, the features that appear near the top of the tree also accumulate the highest importance scores.
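The tree-based analysis described in the two preceding paragraphs can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: the parameter grid is hypothetical, and the scores are synthetic values generated from a made-up formula so that the example runs without repeating a grid search. In practice, both would come from `cv_results_`.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import ParameterGrid
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical grid (assumption); one row per parameter combination.
grid = ParameterGrid({
    "learning_rate": [0.01, 0.1, 0.2],
    "max_depth": [3, 6, 9],
    "max_features": [0.5, 0.75, 1.0],
})
results = pd.DataFrame(list(grid))

# SYNTHETIC scores from an invented formula plus noise, for illustration only.
rng = np.random.default_rng(42)
results["mean_test_score"] = (
    0.65
    + 0.03 * (results["learning_rate"] == 0.01)
    + 0.02 * (results["max_features"] == 1.0)
    + 0.01 * (results["max_depth"] > 3)
    + rng.normal(scale=0.002, size=len(results))
)

# One-hot encode the settings so each parameter value becomes a dummy feature.
param_names = ["learning_rate", "max_depth", "max_features"]
X = pd.get_dummies(results[param_names].astype(str), dtype=float)
y = results["mean_test_score"]

# A shallow tree keeps the structure readable.
shallow = DecisionTreeRegressor(max_depth=3, random_state=42).fit(X, y)
print(export_text(shallow, feature_names=list(X.columns)))

# A tree grown to its maximum depth provides the feature importances.
full = DecisionTreeRegressor(random_state=42).fit(X, y)
importance = (
    pd.Series(full.feature_importances_, index=X.columns)
    .sort_values(ascending=False)
)
print(importance)
```

Casting the settings to strings before calling `pd.get_dummies` treats every value as a category, which matches the one-hot encoding described above at the cost of discarding the ordering of numeric parameters.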