You use grid search for hyperparameter optimization and state that at some point you would like to add a Bayesian approach. One simple change that could boost performance would be to use random search in place of grid search. Grid search is known to perform worse than random search when not all hyperparameters are of similar importance [1]. Intuitively, grid search spends many evaluations on the same setting of an important hyperparameter while only varying the unimportant ones.
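For illustration, here is a minimal sketch of that swap using scikit-learn (the SVM estimator, the log-uniform priors, and the budget of 50 evaluations are placeholder assumptions, not taken from your setup):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Instead of a fixed grid, define prior distributions to sample from.
# Random search tries many distinct values of each hyperparameter, so the
# important one gets probed more densely than it would on a grid.
param_distributions = {
    "C": loguniform(1e-3, 1e3),
    "gamma": loguniform(1e-4, 1e1),
}

search = RandomizedSearchCV(
    SVC(),
    param_distributions=param_distributions,
    n_iter=50,  # same evaluation budget you would give the grid
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```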
It depends on the number of evaluations: the more evaluations there are, the stronger the model built by the TPE algorithm becomes. With very few evaluations, we would expect TPE to match random search. This effect can be seen, for example, in the plots of the "Bayesian Optimization and Hyperband" paper [1, 2], where the plotted "Bayesian Optimization" approach is TPE.
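For reference, this is roughly what running TPE looks like with the hyperopt library (a minimal sketch; the toy quadratic objective and the search space are illustrative assumptions). Note that hyperopt's TPE itself uses purely random suggestions for the first handful of evaluations, which matches the intuition above:

```python
import math
from hyperopt import fmin, hp, tpe, Trials

# Toy objective: pretend the best learning rate is 1e-2.
def objective(params):
    return (math.log10(params["lr"]) + 2.0) ** 2

space = {"lr": hp.loguniform("lr", math.log(1e-5), math.log(1e-1))}

trials = Trials()
best = fmin(
    fn=objective,
    space=space,
    algo=tpe.suggest,  # TPE; with few evaluations it behaves like random search
    max_evals=100,
    trials=trials,
)
print(best)
```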
Also, there might be model bias: for example, if the objective function is stochastic (e.g., a reinforcement learning algorithm that only converges sometimes) or not very smooth, TPE might exploit areas that are not actually good, based on a single lucky evaluation. In those cases TPE might perform worse than random search! To alleviate model bias in model-based hyperparameter optimization (e.g., TPE) and to obtain convergence guarantees, people often sample every k-th hyperparameter setting from a prior distribution instead, i.e., plain random search (this is also the case for the plots in [1, 2]).
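A minimal sketch of this interleaving idea (suggest_from_model is a hypothetical stand-in for a real model-based proposal such as TPE, and the prior and perturbation rule are made-up placeholders):

```python
import random

def sample_from_prior():
    # Draw a hyperparameter setting from the prior, e.g. a log-uniform lr.
    return {"lr": 10 ** random.uniform(-5, -1)}

def suggest_from_model(history):
    # Hypothetical stand-in for a model-based proposal (e.g. TPE).
    # Here: crudely perturb the best setting seen so far.
    best = min(history, key=lambda h: h[1])[0]
    return {"lr": best["lr"] * 10 ** random.uniform(-0.2, 0.2)}

def optimize(objective, n_evals=100, k=3):
    history = []
    for i in range(n_evals):
        if i % k == 0 or not history:  # every k-th setting: random search
            params = sample_from_prior()
        else:                          # otherwise: trust the model
            params = suggest_from_model(history)
        history.append((params, objective(params)))
    return min(history, key=lambda h: h[1])
```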
If you are wondering which HPO algorithm you should use (or about HPO in general), I would highly recommend the first part of the AutoML tutorial at NeurIPS 2018 [3], given by my advisor.
[1] Hyperparameter Optimization. In: Automated Machine Learning: Methods, Systems, Challenges, Section 1.3.1. https://www.automl.org/wp-content/uploads/2018/11/hpo.pdf