Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling

Gustavo Sutter Pessurno de Carvalho, Mohammed Abdulrahman, Hao Wang, Sriram Ganapathi Subramanian, Marc St-Aubin, Sharon O'Sullivan, Lawrence Wan, Luis Ricardez-Sandoval, Pascal Poupart, Agustinus Kristiadi

arXiv.org Machine Learning 

The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which generally comprises two components: (i) a surrogate model and (ii) an acquisition function; the former typically requires expensive re-training and the latter an inner optimization step at each iteration. Although recent work has enabled in-context surrogate models that do not require re-training, virtually all existing BO methods still require acquisition function maximization to select the next observation, which introduces many knobs to tune, such as Monte Carlo samplers and multi-start optimizers. In this work, we propose a completely in-context, zero-shot solution for BO that requires neither surrogate fitting nor acquisition function optimization. This is done by using a pre-trained deep generative model to directly sample from the posterior over the optimum point. We show that this process is equivalent to Thompson sampling, and we demonstrate the capabilities and cost-effectiveness of our foundation model on a suite of real-world benchmarks. We achieve an efficiency gain of more than 35x in terms of wall-clock time compared with Gaussian process-based BO, enabling efficient parallel and distributed BO, e.g., for high-throughput optimization.
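To make the baseline concrete, the classical pipeline the abstract contrasts against can be sketched as Gaussian-process Thompson sampling: fit a GP surrogate to the observations, draw one sample from its posterior over a candidate set, and evaluate the sample's optimizer next. The sketch below is a minimal NumPy implementation of that baseline (not the paper's method); the kernel length-scale, jitter values, and the toy objective are illustrative assumptions.

```python
import numpy as np

def rbf(A, B, ls=0.2):
    """Squared-exponential kernel between row-vector point sets A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

def thompson_step(X, y, candidates, rng):
    """One Thompson-sampling step: sample a GP posterior function over the
    candidate grid and return the candidate that minimizes that sample."""
    n, m = len(X), len(candidates)
    K = rbf(X, X) + 1e-6 * np.eye(n)            # train covariance + jitter
    Ks = rbf(X, candidates)                      # train/candidate cross-cov
    Kss = rbf(candidates, candidates) + 1e-6 * np.eye(m)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha                            # posterior mean
    v = np.linalg.solve(L, Ks)
    cov = Kss - v.T @ v                          # posterior covariance
    # Draw one function sample via a Cholesky factor of the posterior cov.
    Lc = np.linalg.cholesky(cov + 1e-6 * np.eye(m))
    f_sample = mu + Lc @ rng.standard_normal(m)
    return candidates[np.argmin(f_sample)]       # minimize the sampled path

# Toy run: minimize f(x) = (x - 0.3)^2 on [0, 1].
rng = np.random.default_rng(0)
f = lambda x: (x - 0.3) ** 2
X = rng.uniform(0, 1, (3, 1))                    # initial random design
y = f(X).ravel()
cands = np.linspace(0, 1, 200)[:, None]
for _ in range(15):
    x_next = thompson_step(X, y, cands, rng)
    X = np.vstack([X, x_next[None, :]])
    y = np.append(y, f(x_next)[0])
best = X[np.argmin(y)][0]
```

Note that every iteration re-fits the GP (a Cholesky factorization) and optimizes over the candidate grid; the paper's in-context approach replaces both steps with a single forward pass of a pre-trained generative model that samples the optimum directly.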