Latent Bayesian Optimization via Autoregressive Normalizing Flows

Lee, Seunghun, Park, Jinyoung, Chu, Jaewon, Yoon, Minseo, Kim, Hyunwoo J.

arXiv.org Artificial Intelligence 

Bayesian Optimization (BO) has been recognized for its effectiveness in optimizing expensive and complex objective functions. Recent advancements in Latent Bayesian Optimization (LBO) have shown promise by integrating generative models such as variational autoencoders (V AEs) to manage the complexity of high-dimensional and structured data spaces. However, existing LBO approaches often suffer from the value discrepancy problem, which arises from the reconstruction gap between input and latent spaces. To address this issue, we propose a Normalizing Flow-based Bayesian Optimization (NF-BO), which utilizes normalizing flow as a generative model to establish one-to-one encoding function from the input space to the latent space, along with its left-inverse decoding function, eliminating the reconstruction gap. Specifically, we introduce SeqFlow, an autoregressive normalizing flow for sequence data. In addition, we develop a new candidate sampling strategy that dynamically adjusts the exploration probability for each token based on its importance. Through extensive experiments, our NF-BO method demonstrates superior performance in molecule generation tasks, significantly outperforming both traditional and recent LBO approaches. Bayesian optimization (BO) (Kushner, 1962; 1964) has been broadly applied across various areas such as chemical design (Wang & Dowling, 2022), material science (Ament et al., 2021), and hy-perparameter optimization (Wu et al., 2019). BO aims to probabilistically optimize an expensive and black-box objective function using a surrogate model to find an optimal solution with minimal cost. Although BO is effective in continuous spaces, its application to a discrete input space still remains challenging (Oh et al., 2019; Deshwal & Doppa, 2021). Latent Bayesian Optimization (LBO) (G omez-Bombarelli et al., 2018; Tripp et al., 2020) addresses this challenge by performing BO in a lower-dimensional latent space learned by a generative model such as V ariational AutoEncoders (V AEs) (Kingma & Welling, 2014). LBO performs optimization in a continuous space by mapping the discrete input into a continuous latent space with the V AEs (Kusner et al., 2017; Jin et al., 2018; Samanta et al., 2019). However, the reconstruction of V AE is not always perfect, leading to value discrepancy problem, which indicates that given a sample encoded as an embedding in the latent space, its decoding may not result in the same sample in the input space.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found