AITopics | gaussian process bandit optimization


gaussian process bandit optimization


Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective

Neural Information Processing Systems | Dec-23-2025, 23:32:17 GMT

Achieving the full promise of the Thermodynamic Variational Objective (TVO), a recently proposed variational inference objective that lower-bounds the log evidence via one-dimensional Riemann integration, requires choosing a "schedule" of sorted discretization points. This paper introduces a bespoke Gaussian process bandit optimization method for automatically choosing these points. Our approach not only automates their one-time selection, but also dynamically adapts their positions over the course of optimization, leading to improved model learning and inference. We provide theoretical guarantees that our bandit optimization converges to the regret-minimizing choice of integration points. Empirical validation of our algorithm is provided in terms of improved learning and inference in Variational Autoencoders and sigmoid belief networks.
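To illustrate why the choice of schedule matters, here is a minimal sketch of a left Riemann sum over sorted discretization points. It assumes a nondecreasing integrand on [0, 1], for which the left sum cannot exceed the true integral. The integrand `toy` and the function name `left_riemann_lower_bound` are hypothetical stand-ins, not the TVO integrand or the paper's code.

```python
import numpy as np

def left_riemann_lower_bound(integrand, schedule):
    """Left Riemann sum on [0, 1] over the sorted points in `schedule`.

    Assumes `integrand` is nondecreasing, so the sum lower-bounds the integral.
    """
    betas = np.asarray(schedule, dtype=float)
    if betas[0] != 0.0 or betas[-1] >= 1.0 or np.any(np.diff(betas) <= 0):
        raise ValueError("schedule must start at 0, be strictly sorted, and stay below 1")
    edges = np.append(betas, 1.0)   # close the last sub-interval at 1
    widths = np.diff(edges)         # width of each sub-interval
    return float(np.sum(widths * integrand(betas)))

# Toy nondecreasing integrand (an assumption); its exact integral on [0, 1] is 1/3.
toy = lambda b: b ** 2

uniform = left_riemann_lower_bound(toy, [0.0, 0.25, 0.5, 0.75])
skewed = left_riemann_lower_bound(toy, [0.0, 0.5, 0.75, 0.9])
```

With the same number of points, both schedules stay below 1/3, but the skewed one gives the tighter bound for this toy integrand. Choosing point positions automatically is the problem the abstract describes.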

gaussian process bandit optimization, name change, thermodynamic variational objective, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)


Batched Gaussian Process Bandit Optimization via Determinantal Point Processes

Tarun Kathuria, Amit Deshpande, Pushmeet Kohli

Neural Information Processing Systems | Nov-21-2025, 08:31:30 GMT

Most methods for this so-called "Bayesian optimization" only allow sequential exploration of the parameter space.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)


Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective

Neural Information Processing Systems | Oct-10-2024, 00:47:31 GMT

Achieving the full promise of the Thermodynamic Variational Objective (TVO), a recently proposed variational inference objective that lower-bounds the log evidence via one-dimensional Riemann integration, requires choosing a "schedule" of sorted discretization points. This paper introduces a bespoke Gaussian process bandit optimization method for automatically choosing these points. Our approach not only automates their one-time selection, but also dynamically adapts their positions over the course of optimization, leading to improved model learning and inference. We provide theoretical guarantees that our bandit optimization converges to the regret-minimizing choice of integration points. Empirical validation of our algorithm is provided in terms of improved learning and inference in Variational Autoencoders and sigmoid belief networks.

gaussian process bandit optimization, inference, thermodynamic variational objective

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)


Batched Gaussian Process Bandit Optimization via Determinantal Point Processes

Neural Information Processing Systems | Mar-12-2024, 15:30:31 GMT

Gaussian Process bandit optimization has emerged as a powerful tool for optimizing noisy black box functions. One example in machine learning is hyper-parameter optimization where each evaluation of the target function may require training a model which may involve days or even weeks of computation. Most methods for this so-called "Bayesian optimization" only allow sequential exploration of the parameter space. However, it is often desirable to propose batches or sets of parameter values to explore simultaneously, especially when there are large parallel processing facilities at our disposal. Batch methods require modeling the interaction between the different evaluations in the batch, which can be expensive in complex scenarios. In this paper, we propose a new approach for parallelizing Bayesian optimization by modeling the diversity of a batch via Determinantal point processes (DPPs) whose kernels are learned automatically. This allows us to generalize a previous result as well as prove better regret bounds based on DPP sampling. Our experiments on a variety of synthetic and real-world robotics and hyper-parameter optimization tasks indicate that our DPP-based methods, especially those based on DPP sampling, outperform state-of-the-art methods.
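As a rough illustration of scoring batch diversity with a kernel determinant, the sketch below greedily builds a batch that maximizes the log-determinant of the kernel submatrix. This is a simple stand-in for DPP mode-finding, not the paper's DPP sampling procedure or its learned kernels. The RBF kernel, the jitter value, and the names `rbf_kernel` and `greedy_diverse_batch` are assumptions.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    """Squared-exponential kernel matrix for the rows of X."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

def greedy_diverse_batch(K, batch_size):
    """Greedily add the index that maximizes log det of the chosen submatrix."""
    chosen = []
    for _ in range(batch_size):
        best, best_val = None, -np.inf
        for i in range(K.shape[0]):
            if i in chosen:
                continue
            idx = chosen + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        chosen.append(best)
    return chosen

# Three near-duplicate candidates (0.0, 0.05, 0.1) and two distant ones.
X = np.array([[0.0], [0.05], [0.1], [1.0], [2.0]])
K = rbf_kernel(X) + 1e-8 * np.eye(len(X))   # small jitter for numerical stability
batch = greedy_diverse_batch(K, 3)
```

Because near-duplicate points make the submatrix close to singular, the determinant criterion keeps only one of the three clustered candidates and fills the rest of the batch with the distant ones.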

algorithm, dpp-sample, optimization, (12 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)


Open Problem: Tight Online Confidence Intervals for RKHS Elements

Vakili, Sattar, Scarlett, Jonathan, Javidi, Tara

arXiv.org Machine Learning | Oct-28-2021

Confidence intervals are a crucial building block in the analysis of various online learning problems. The analysis of kernel-based bandit and reinforcement learning problems utilizes confidence intervals applicable to the elements of a reproducing kernel Hilbert space (RKHS). However, the existing confidence bounds do not appear to be tight, resulting in suboptimal regret bounds. In fact, the existing regret bounds for several kernelized bandit algorithms (e.g., GP-UCB, GP-TS, and their variants) may fail to even be sublinear. It is unclear whether the suboptimal regret bound is a fundamental shortcoming of these algorithms or an artifact of the proof, and the main challenge seems to stem from the online (sequential) nature of the observation points.
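For readers unfamiliar with the object under discussion, the sketch below computes a kernel-based confidence interval of the common form mean ± beta × standard deviation from a Gaussian process posterior. The fixed `beta`, the RBF kernel, and the noise level are all assumptions chosen for illustration. How large the width multiplier must be when points are chosen sequentially is the question the abstract poses, and this sketch does not answer it.

```python
import numpy as np

def rbf(a, b, lengthscale=0.5):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_confidence_interval(x_obs, y_obs, x_query, beta=2.0, noise=1e-2):
    """Return (lower, upper) = posterior mean -/+ beta * posterior std."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf(x_obs, x_query)
    mean = k_star.T @ np.linalg.solve(K, y_obs)
    # Prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    std = np.sqrt(np.maximum(var, 0.0))
    return mean - beta * std, mean + beta * std

x_obs = np.array([0.0, 1.0])
y_obs = np.array([0.3, -0.1])
lower, upper = gp_confidence_interval(x_obs, y_obs, np.array([0.0, 5.0]))
widths = upper - lower
```

The interval is narrow at the observed input 0.0 and returns to the prior width of 2 × beta far from the data, at 5.0.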

confidence interval, international conference, optimization, (12 more...)

arXiv.org Machine Learning

2110.15458

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.51)

Industry: Education > Focused Education > Special Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.49)


Lower Bounds on Regret for Noisy Gaussian Process Bandit Optimization

Scarlett, Jonathan, Bogunovic, Ilija, Cevher, Volkan

arXiv.org Machine Learning | Jun-16-2017

In this paper, we consider the problem of sequentially optimizing a black-box function $f$ based on noisy samples and bandit feedback. We assume that $f$ is smooth in the sense of having a bounded norm in some reproducing kernel Hilbert space (RKHS), yielding a commonly-considered non-Bayesian form of Gaussian process bandit optimization. We provide algorithm-independent lower bounds on the simple regret, measuring the suboptimality of a single point reported after $T$ rounds, and on the cumulative regret, measuring the sum of regrets over the $T$ chosen points. For the isotropic squared-exponential kernel in $d$ dimensions, we find that an average simple regret of $\epsilon$ requires $T = \Omega\big(\frac{1}{\epsilon^2} (\log\frac{1}{\epsilon})^{d/2}\big)$, and the average cumulative regret is at least $\Omega\big( \sqrt{T(\log T)^d} \big)$, thus matching existing upper bounds up to the replacement of $d/2$ by $d+O(1)$ in both cases. For the Matérn-$\nu$ kernel, we give analogous bounds of the form $\Omega\big( (\frac{1}{\epsilon})^{2+d/\nu}\big)$ and $\Omega\big( T^{\frac{\nu + d}{2\nu + d}} \big)$, and discuss the resulting gaps to the existing upper bounds.
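The two regret notions in the abstract can be written out directly. The sketch below is illustrative only: the function values are made up, and the helper names are hypothetical. The last function evaluates the exponent of $T$ in the Matérn cumulative-regret lower bound as stated above.

```python
import numpy as np

def simple_regret(f_opt, f_reported):
    """Suboptimality of the single point reported after T rounds."""
    return float(f_opt - f_reported)

def cumulative_regret(f_opt, f_chosen):
    """Sum of per-round regrets over the T chosen points."""
    return float(np.sum(f_opt - np.asarray(f_chosen, dtype=float)))

def matern_cumulative_exponent(nu, d):
    """Exponent (nu + d) / (2 nu + d) of T in the stated Matern lower bound."""
    return (nu + d) / (2.0 * nu + d)

# Made-up values: the optimum is 1.0 and three points were queried.
f_chosen = [0.2, 0.7, 0.9]
total = cumulative_regret(1.0, f_chosen)
final = simple_regret(1.0, f_chosen[-1])
exponent = matern_cumulative_exponent(nu=2.5, d=2)
```

The exponent lies strictly between 1/2 and 1 for any positive $\nu$ and $d$, and it approaches 1/2 as $\nu$ grows, so smoother functions admit slower regret growth under this bound.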

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

1706.0009

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Data Science > Data Mining (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)


Batched Gaussian Process Bandit Optimization via Determinantal Point Processes

Kathuria, Tarun, Deshpande, Amit, Kohli, Pushmeet

Neural Information Processing Systems | Dec-31-2016

Gaussian Process bandit optimization has emerged as a powerful tool for optimizing noisy black box functions. One example in machine learning is hyper-parameter optimization where each evaluation of the target function may require training a model which may involve days or even weeks of computation. Most methods for this so-called “Bayesian optimization” only allow sequential exploration of the parameter space. However, it is often desirable to propose batches or sets of parameter values to explore simultaneously, especially when there are large parallel processing facilities at our disposal. Batch methods require modeling the interaction between the different evaluations in the batch, which can be expensive in complex scenarios. In this paper, we propose a new approach for parallelizing Bayesian optimization by modeling the diversity of a batch via Determinantal point processes (DPPs) whose kernels are learned automatically. This allows us to generalize a previous result as well as prove better regret bounds based on DPP sampling. Our experiments on a variety of synthetic and real-world robotics and hyper-parameter optimization tasks indicate that our DPP-based methods, especially those based on DPP sampling, outperform state-of-the-art methods.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Alberta (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)


Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization

Desautels, Thomas, Krause, Andreas, Burdick, Joel

arXiv.org Machine Learning | Jun-27-2012

Can one parallelize complex exploration-exploitation tradeoffs? As an example, consider the problem of optimal high-throughput experimental design, where we wish to sequentially design batches of experiments in order to simultaneously learn a surrogate function mapping stimulus to response and identify the maximum of the function. We formalize the task as a multi-armed bandit problem, where the unknown payoff function is sampled from a Gaussian process (GP), and instead of a single arm, in each round we pull a batch of several arms in parallel. We develop GP-BUCB, a principled algorithm for choosing batches, based on the GP-UCB algorithm for sequential GP optimization. We prove a surprising result: compared to the sequential approach, the cumulative regret of the parallel algorithm only increases by a constant factor independent of the batch size B. Our results provide rigorous theoretical support for exploiting parallelism in Bayesian global optimization. We demonstrate the effectiveness of our approach on two real-world applications.
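A minimal sketch of batch selection in the spirit of GP-BUCB follows. It relies on the fact that a GP's posterior variance depends only on where observations were taken, not on their values. Within a batch the posterior mean is therefore held fixed while the variance is updated as if the pending points had already been observed. The constant `beta`, the RBF kernel, the noise level, and the function names are assumptions; the paper's actual confidence schedule is not reproduced here.

```python
import numpy as np

def rbf(a, b, lengthscale=0.3):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-2):
    """Posterior mean and variance at x_cand for a unit-variance RBF prior."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_star = rbf(x_obs, x_cand)
    mean = k_star.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star), axis=0)
    return mean, np.maximum(var, 0.0)

def select_batch(x_obs, y_obs, x_cand, batch_size, beta=4.0):
    """Pick candidate indices by UCB, shrinking variance at pending points."""
    mean, _ = gp_posterior(x_obs, y_obs, x_cand)   # mean frozen for the batch
    x_seen = np.array(x_obs, dtype=float)
    batch = []
    for _ in range(batch_size):
        # Variance ignores the y-values, so placeholder zeros are sufficient.
        _, var = gp_posterior(x_seen, np.zeros(len(x_seen)), x_cand)
        j = int(np.argmax(mean + np.sqrt(beta * var)))
        batch.append(j)
        x_seen = np.append(x_seen, x_cand[j])
    return batch

x_cand = np.linspace(0.0, 1.0, 11)
batch = select_batch(np.array([0.5]), np.array([0.0]), x_cand, batch_size=3)
```

Because each pending point lowers the variance around itself, later picks in the same batch are pushed toward unexplored regions without waiting for any outcomes.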

algorithm, optimization problem, upstream oil & gas, (16 more...)

arXiv.org Machine Learning

1206.6402

Country:

North America > United States > California (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Oil & Gas > Upstream (0.72)
Health & Medicine > Pharmaceuticals & Biotechnology (0.58)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.89)
Information Technology > Modeling & Simulation (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
