You use grid search for hyperparameter optimization and state that at some point you would like to add a Bayesian approach. One simple change that could boost performance would be to use random search in place of grid search. Grid search is known to perform worse than random search when not all hyperparameters are of similar importance [1]. Intuitively, grid search spends many evaluations on the same setting of an important hyperparameter while only varying the unimportant ones.
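For illustration, here is a minimal sketch of that swap using scikit-learn (the SVM estimator, the log-uniform priors, and the budget of 50 evaluations are placeholder assumptions, not taken from your setup):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Instead of a fixed grid, define prior distributions to sample from.
# Random search tries many distinct values of each hyperparameter, so the
# important one gets probed more densely than it would on a grid.
param_distributions = {
    "C": loguniform(1e-3, 1e3),
    "gamma": loguniform(1e-4, 1e1),
}

search = RandomizedSearchCV(
    SVC(),
    param_distributions=param_distributions,
    n_iter=50,  # same evaluation budget you would give the grid
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```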
It depends on the number of evaluations: the more evaluations there are, the stronger the model built by the TPE algorithm becomes. With very few evaluations, we would expect TPE to match random search. This effect can be seen, for example, in the plots of the "Bayesian Optimization and Hyperband" paper [1, 2], where the plotted "Bayesian Optimization" approach is TPE.
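For reference, this is roughly what running TPE looks like with the hyperopt library (a minimal sketch; the toy quadratic objective and the search space are illustrative assumptions). Note that hyperopt's TPE itself uses purely random suggestions for the first handful of evaluations, which matches the intuition above:

```python
import math
from hyperopt import fmin, hp, tpe, Trials

# Toy objective: pretend the best learning rate is 1e-2.
def objective(params):
    return (math.log10(params["lr"]) + 2.0) ** 2

space = {"lr": hp.loguniform("lr", math.log(1e-5), math.log(1e-1))}

trials = Trials()
best = fmin(
    fn=objective,
    space=space,
    algo=tpe.suggest,  # TPE; with few evaluations it behaves like random search
    max_evals=100,
    trials=trials,
)
print(best)
```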
Also, there might be model bias: for example, if the objective function is stochastic (e.g., a reinforcement learning algorithm that only converges sometimes) or not very smooth, TPE might exploit areas that are not actually good, based on a single lucky evaluation. In those cases TPE might perform worse than random search! To alleviate model bias in model-based hyperparameter optimization (e.g., TPE) and to obtain convergence guarantees, people often sample every k-th hyperparameter setting from a prior distribution instead, i.e., plain random search (this is also the case for the plots in [1, 2]).
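A minimal sketch of this interleaving idea (suggest_from_model is a hypothetical stand-in for a real model-based proposal such as TPE, and the prior and perturbation rule are made-up placeholders):

```python
import random

def sample_from_prior():
    # Draw a hyperparameter setting from the prior, e.g. a log-uniform lr.
    return {"lr": 10 ** random.uniform(-5, -1)}

def suggest_from_model(history):
    # Hypothetical stand-in for a model-based proposal (e.g. TPE).
    # Here: crudely perturb the best setting seen so far.
    best = min(history, key=lambda h: h[1])[0]
    return {"lr": best["lr"] * 10 ** random.uniform(-0.2, 0.2)}

def optimize(objective, n_evals=100, k=3):
    history = []
    for i in range(n_evals):
        if i % k == 0 or not history:  # every k-th setting: random search
            params = sample_from_prior()
        else:                          # otherwise: trust the model
            params = suggest_from_model(history)
        history.append((params, objective(params)))
    return min(history, key=lambda h: h[1])
```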
If you are wondering which HPO algorithm you should use (or about HPO in general), I would highly recommend the first part of the AutoML tutorial at NeurIPS 2018 [3], given by my advisor.
[1] Hyperparameter Optimization. In: Automated Machine Learning: Methods, Systems, Challenges, Section 1.3.1. https://www.automl.org/wp-content/uploads/2018/11/hpo.pdf