Hyperband

Hyperband is a hyperparameter optimization technique that was developed at Berkeley in 2016 by Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. You can read their original paper at https://arxiv.org/pdf/1603.06560.pdf.

Imagine randomly sampling many potential sets of hyperparameters, as we did above with RandomizedSearchCV. When RandomizedSearchCV is done, it will have chosen one single hyperparameter configuration as the best among those it sampled. Hyperband exploits the idea that the best hyperparameter configuration is likely to outperform the other configurations after even a small number of training iterations. The band in Hyperband comes from bandit, referring to exploration versus exploitation in multi-armed bandit problems (techniques for allocating a limited resource among competing choices in order to maximize performance).

Using Hyperband, we might try some set of possible configurations (n), training each for only one iteration. The authors deliberately leave the definition of an iteration open; however, I'll be using epochs as iterations. Once this first round of training is complete, the resulting configurations are sorted by performance. The top half of this list is then trained for a larger number of iterations. This process of halving and culling is then repeated until we arrive at some very small set of configurations, which we train for the full number of iterations we've defined in our search. This process gets us to a best set of hyperparameters in a shorter time than searching every possible configuration for the maximum number of epochs.
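To make that schedule concrete, the following is a minimal sketch of the halving loop just described (the full Hyperband algorithm additionally runs several such brackets, trading off the number of configurations against the starting budget). The sample_config and train_for helpers here are hypothetical stand-ins for the get_params and try_params functions introduced below:

def successive_halving(sample_config, train_for, n=16, eta=2):
    # sample_config() -> a hyperparameter dict (hypothetical helper)
    # train_for(config, iters) -> validation loss (hypothetical helper)
    configs = [sample_config() for _ in range(n)]
    iters = 1  # every configuration starts with a single epoch
    while len(configs) > 1:
        # Score each surviving configuration at the current budget...
        scored = sorted(((train_for(c, iters), c) for c in configs),
                        key=lambda pair: pair[0])
        # ...keep the best half, and double the training budget
        configs = [c for _, c in scored[:len(configs) // eta]]
        iters *= eta
    return configs[0]

With n=16, this evaluates 16 configurations for 1 epoch, 8 for 2, 4 for 4, and 2 for 8, leaving a single survivor to be trained for the full budget.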

In the GitHub repository for this chapter, I've included an implementation of the Hyperband algorithm, in hyperband.py. This implementation is mostly derived from an implementation by FastML, which you can find at http://fastml.com/tuning-hyperparams-fast-with-hyperband/. To use it, you need to start by instantiating a Hyperband object, as shown in the following code:

from hyperband import Hyperband
hb = Hyperband(data, get_params, try_params)

The Hyperband constructor requires three arguments:

  • data: The data dictionary that I've been using thus far in the examples
  • get_params: The name of a function that is used to sample from the hyperparameter space we are searching
  • try_params: The name of a function that can be used to evaluate a hyperparameter configuration for num_iters iterations and return the loss

In the following example, I implement get_params to sample in a uniform way across the parameter space:

import numpy as np

def get_params():
    # Sample each hyperparameter uniformly from its search space
    batches = np.random.choice([5, 10, 100])
    optimizers = np.random.choice(['rmsprop', 'adam', 'adadelta'])
    dropout = np.random.choice(np.linspace(0.1, 0.5, 10))
    return {"batch_size": batches, "optimizer": optimizers,
            "keep_prob": dropout}

As you can see, the selected hyperparameter configuration is returned as a dictionary.
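Calling get_params a few times is an easy way to sanity-check the sampler, since each call draws an independent configuration:

# Draw a few independent configurations from the search space
for _ in range(3):
    print(get_params())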

Next, try_params can be implemented to fit a model for a specified number of iterations on a hyperparameter configuration, as follows:

def try_params(data, num_iters, hyperparameters):
    # Build a fresh model with the sampled hyperparameters
    model = build_network(keep_prob=hyperparameters["keep_prob"],
                          optimizer=hyperparameters["optimizer"])
    # Train for only the number of epochs this Hyperband round allows
    model.fit(x=data["train_X"], y=data["train_y"],
              batch_size=hyperparameters["batch_size"],
              epochs=int(num_iters))
    # Score the configuration on the validation set
    loss = model.evaluate(x=data["val_X"], y=data["val_y"], verbose=0)
    return {"loss": loss}

The try_params function returns a dictionary that can be used to keep track of any number of metrics; however, the loss key is required, as it's what Hyperband uses to compare runs.
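Before handing these functions to Hyperband, it can be worth smoke-testing try_params directly on a single sampled configuration; the three-epoch budget here is an arbitrary choice for illustration:

# Hypothetical smoke test: one random configuration, three epochs
params = get_params()
result = try_params(data, num_iters=3, hyperparameters=params)
print(params, "->", result["loss"])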

Calling the .run() method on the Hyperband object runs the algorithm we described above:

results = hb.run()

In this case, results will contain, for each run, the loss, the runtime, and the hyperparameters tested. Because even this highly optimized search is time-intensive, and because GPU time is expensive, I've included results from the MNIST search in hyperband-output-mnist.txt in the GitHub repository for this chapter, which can be found here: https://github.com/mbernico/deep_learning_quick_reference/tree/master/chapter_6.
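With the results in hand, picking the winner is just a matter of sorting on loss. The snippet below assumes each entry in results carries "loss" and "params" keys, which matches the FastML-derived implementation; adjust the key names if your copy differs:

# Sort all evaluated runs by validation loss, best first
best = sorted(results, key=lambda run: run["loss"])[0]
print("best loss:", best["loss"])
print("best hyperparameters:", best["params"])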
