How it works...

In Step 2, we defined a set of parameters used in this recipe (the number of folds for cross-validation and the maximum number of iterations of the optimization procedure). Then, we imported the dataset and created the training and test sets. We used the same dataset as in the previous recipe, so please refer to it for a description.
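A minimal sketch of that setup could look as follows; the parameter names, the file path, and the target column are assumptions made for illustration, not the recipe's exact code:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# assumed names, path, and target column -- for illustration only
N_FOLDS = 5      # number of folds for cross-validation
MAX_EVALS = 100  # maximum number of iterations of the optimization

df = pd.read_csv("fraud_dataset.csv")
X = df.drop(columns=["is_fraud"])
y = df["is_fraud"]

# stratified split to preserve the (highly imbalanced) fraud ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```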

In Step 4, we defined the true objective function (the one for which the Bayesian optimization will create a surrogate). The function takes the set of hyperparameters as inputs and uses stratified 5-fold cross-validation to calculate the loss value to be minimized. In the case of fraud detection, we want to detect as much fraud as possible, even if it means creating more false positives. That is why, in this case, we selected recall as the metric we optimized for. As the optimizer will minimize the function, we multiplied it by -1 to create a maximization problem. The function must return either a single value (the loss) or a dictionary, with at least two key-value pairs:

  • 'loss': The value of the true objective function.
  • 'status': An indicator of whether the loss value was calculated correctly. It can be either STATUS_OK or STATUS_FAIL.

Additionally, we returned the set of hyperparameters used for evaluating the objective function.
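A minimal sketch of such an objective function, assuming a LightGBM classifier and the X_train, y_train, and N_FOLDS names introduced above (all of which are illustrative assumptions):

```python
from hyperopt import STATUS_OK
from lightgbm import LGBMClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def objective(params, n_folds=N_FOLDS):
    # evaluate one set of hyperparameters with stratified k-fold CV
    model = LGBMClassifier(**params)
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=42)
    recall = cross_val_score(model, X_train, y_train,
                             cv=cv, scoring="recall").mean()
    # hyperopt minimizes, so return the negative recall as the loss
    return {"loss": -1 * recall, "params": params, "status": STATUS_OK}
```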

In Step 5, we defined the hyperparameter grid over which we wanted to conduct the search. The search space is defined as a dictionary, but in comparison to the spaces defined for GridSearchCV, we used hyperopt's built-in functions, such as the following:

  • hp.choice(label, list): Returns one of the indicated options
  • hp.uniform(label, lower_value, upper_value): The uniform distribution between two values
  • hp.randint(label, upper_value): Returns a random integer in the range [0, upper_value)
  • hp.normal(label, mu, sigma): Returns a normally distributed value with mean mu and standard deviation sigma 

Bear in mind that in this setup we need to define the name (label) of each hyperparameter twice: once as the dictionary key and once as the label argument passed to the hp function.
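For instance, a search space along these lines (the hyperparameter names and ranges are illustrative) shows how each label is repeated as both the dictionary key and the first argument of the hp function:

```python
from hyperopt import hp

search_space = {
    # one of the listed options (returned as an index during optimization)
    "boosting_type": hp.choice("boosting_type", ["gbdt", "dart"]),
    # a float drawn uniformly from [0.05, 0.3]
    "learning_rate": hp.uniform("learning_rate", 0.05, 0.3),
    # an integer from [0, 15); max_depth <= 0 means "no limit" in LightGBM
    "max_depth": hp.randint("max_depth", 15),
    # an integer picked from a fixed list of candidates
    "n_estimators": hp.choice("n_estimators", [100, 250, 500]),
}
```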

We can make use of hp.choice to define conditional nested spaces, where the values of some hyperparameters depend on others. A possible use case is defining a space that considers multiple classifiers and their respective hyperparameters.
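A sketch of such a conditional space, with made-up classifiers and ranges (note that every label must be unique across the whole space):

```python
from hyperopt import hp

# the inner hyperparameters only exist for the classifier that was chosen
model_space = hp.choice("classifier", [
    {
        "model": "logistic_regression",
        "C": hp.uniform("lr_C", 0.01, 10.0),
    },
    {
        "model": "lightgbm",
        "learning_rate": hp.uniform("lgbm_learning_rate", 0.05, 0.3),
        "n_estimators": hp.choice("lgbm_n_estimators", [100, 250, 500]),
    },
])
```

The objective function would then inspect the sampled dictionary (for example, its "model" key) to decide which classifier to instantiate.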

In Step 6, we ran the Bayesian optimization. First, we defined the Trials object, which is used for storing the history of the search. We can even use it to resume a search or expand an already finished one, that is, increase the number of iterations using the already stored history. Second, we ran the optimization by passing the objective function, the search space, the algorithm (for more details on tuning the TPE algorithm, please refer to hyperopt's documentation), the maximum number of iterations, and the trials for storing the history.
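A sketch of that call, reusing the objective, search_space, and MAX_EVALS names assumed above:

```python
from hyperopt import Trials, fmin, tpe

trials = Trials()  # keeps the full history of evaluated points

best = fmin(
    fn=objective,         # the true objective function from Step 4
    space=search_space,   # the search space from Step 5
    algo=tpe.suggest,     # the Tree-structured Parzen Estimator
    max_evals=MAX_EVALS,  # iteration budget
    trials=trials,        # history of the search
)
```

To resume or extend a finished search, we can call fmin again with the same trials object and a larger max_evals; the optimizer continues from the stored history.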

In Step 7, we created multiple dictionaries mapping the integer codes back to the corresponding values from the search space we defined in Step 5. We did this because, for hyperparameters defined using the hp.choice function, the results of the optimization are returned as integer-encoded values (encoded in the order defined in the search space). Then, we used these dictionaries to train a LightGBM classifier with the best set of hyperparameters on the entire training set.
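A sketch of that decoding step, using the illustrative search space from above; the manual mapping mirrors the dictionaries described here, and hyperopt's space_eval helper is shown as a convenient alternative:

```python
from hyperopt import space_eval
from lightgbm import LGBMClassifier

# fmin returns indices for hp.choice parameters, e.g. {"boosting_type": 1, ...}
boosting_type_map = {0: "gbdt", 1: "dart"}
n_estimators_map = {0: 100, 1: 250, 2: 500}

best_params = {
    **best,
    "boosting_type": boosting_type_map[best["boosting_type"]],
    "n_estimators": n_estimators_map[best["n_estimators"]],
}

# equivalently, let hyperopt decode the indices for us
best_params = space_eval(search_space, best)

# retrain on the entire training set with the best hyperparameters
clf = LGBMClassifier(**best_params)
clf.fit(X_train, y_train)
```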

In the last step, we evaluated the results of the model using the custom performance_evaluation_report function.
