Multi-step Planning for Automated Hyperparameter Optimization with OptFormer
Unlike myopic HPO methods, planning based approaches fundamentally require building models of the future to assess the impact of a current decision on later timesteps. Though these methods also rely on a GP as a surrogate model, each point in multi-step planning involves fantasizing/imagining an updated GP posterior ( ft 1 xt),…,( ft h xt, xt 1,…, xt h 1) based on simulated choices from lookaheads {( xt, yt),…,( xt h 1, yt h 1)} (Lam et al., 2016; Jiang et al., 2020). Note that we use xt to represent a fantasized decision, while xt is the actual choice made at timestep t. Whilst multi-step planning is promising, constructing the posterior of a GP model requires matrix inversion which is a compute-intensive operation (Cormen et al., 2022). Even outside of this limitation, traditional planning based approaches are compute intensive due to (i) poor scaling behavior of the search tree--O(qh) where q is the number of choices at each decision point for each lookahead step (Lam et al., 2016; Lam and Willcox, 2017)--which forces most methods to explore short horizons, typically h {1,2}, and (ii) nested expectation and maximization: marginalizing future observation yt j,j h and global search on the acquisition function to obtain query xt j at every lookahead step.
Oct-13-2022, 02:06:20 GMT
- Technology: