
Manual searching is time-consuming since you need to wait for the results of each experiment. It becomes impractical once you have more than 8-10 hyperparameters, and you will probably end up tuning only the few you think are relevant. You'd also need a lot of experience tuning hyperparameters, otherwise your tuning is no better than random.

Given these disadvantages of manual tuning, "Bayesian optimization" seems like the most promising technique: it needs far fewer "choose->train->evaluate" loops because it uses the information from previous runs to select the next set of hyperparameters (similar to what a human would do). A minimal sketch with scikit-optimize is below.
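A minimal sketch of that loop using scikit-optimize's gp_minimize. The search space and the objective here are placeholders, not a real model; in practice the objective would train a model with the given hyperparameters and return the validation loss.

    # Bayesian optimization sketch with scikit-optimize (skopt).
    # The objective is a stand-in for "train a model, return validation loss".
    from skopt import gp_minimize
    from skopt.space import Real, Integer

    space = [
        Real(1e-5, 1e-1, prior="log-uniform", name="learning_rate"),
        Integer(16, 256, name="hidden_units"),
    ]

    def objective(params):
        learning_rate, hidden_units = params
        # Replace this with a real train-and-evaluate run.
        return (learning_rate - 1e-3) ** 2 + ((hidden_units - 128) ** 2) * 1e-6

    # The Gaussian-process surrogate uses results from earlier evaluations
    # to pick the next hyperparameters to try.
    result = gp_minimize(objective, space, n_calls=20, random_state=0)
    print("best params:", result.x, "best loss:", result.fun)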



It depends on how well the problem is understood. If the problem is your standard MNIST dataset, then sure, it could very well be a waste of time to sit around running a serial manual hyperparameter search. For a new dataset that may or may not be clean, there's much to be learned from iterating on a very small subset of the data; at that small scale it's much easier to get a handle on the major failure modes, such as encoding the wrong things or exploding weights.


Does it work in parallel though?


Sure it does, though it's not trivial and is tedious to implement yourself. You could use a Python library such as "scikit-optimize", which has an implementation of parallel Bayesian optimization (based on a Gaussian process). Have a look at this: https://scikit-optimize.github.io/notebooks/bayesian-optimiz... A rough sketch of the parallel version follows.
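A rough sketch of the parallel case using skopt's ask/tell Optimizer: ask for a batch of candidate points, evaluate them in parallel (here with joblib), then tell the optimizer the results. The one-dimensional objective is a placeholder for a real training job, and the batch/iteration counts are arbitrary.

    # Parallel Bayesian optimization sketch with skopt's ask/tell interface.
    from joblib import Parallel, delayed
    from skopt import Optimizer
    from skopt.space import Real

    def objective(params):
        # Stand-in for an expensive training run; returns the value to minimize.
        (x,) = params
        return (x - 0.3) ** 2

    opt = Optimizer([Real(-1.0, 1.0)], base_estimator="GP")

    for _ in range(5):                        # 5 sequential batches
        candidates = opt.ask(n_points=4)      # propose 4 points per batch
        losses = Parallel(n_jobs=4)(
            delayed(objective)(c) for c in candidates
        )                                     # evaluate the batch in parallel
        opt.tell(candidates, losses)          # update the surrogate with results

    res = opt.get_result()
    print("best params:", res.x, "best loss:", res.fun)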



