AITopics | length scale

Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal

Neural Information Processing Systems, Mar-21-2026, 19:18:06 GMT

Bayesian Optimization (BO) is widely used for optimising black-box functions but requires us to specify the length scale hyperparameter, which defines the smoothness of the functions the optimizer will consider. Most current BO algorithms choose this hyperparameter by maximizing the marginal likelihood of the observed data, albeit risking misspecification if the objective function is less smooth in regions we have not yet explored. The only prior solution addressing this problem with theoretical guarantees was A-GP-UCB, proposed by Berkenkamp et al. (2019). This algorithm progressively decreases the length scale, expanding the class of functions considered by the optimizer. However, A-GP-UCB lacks a stopping mechanism, leading to over-exploration and slow convergence.
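To make the role of the length scale concrete, here is a minimal sketch, assuming a squared-exponential (RBF) kernel on scalar inputs. The geometric shrinking schedule and all function names are hypothetical illustrations; this is not the A-GP-UCB update rule from Berkenkamp et al. (2019). It only shows that a smaller length scale lowers the correlation the model assumes between two fixed points, which is what admits less smooth functions.

```python
import math


def rbf_kernel(x, y, length_scale):
    """Squared-exponential kernel for scalar inputs (unit signal variance)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def shrinking_length_scales(initial, factor, steps):
    """Hypothetical geometric schedule; NOT the rule used by A-GP-UCB."""
    return [initial * factor ** t for t in range(steps)]


# Length scale halves at every step: 1.0, 0.5, 0.25, 0.125.
scales = shrinking_length_scales(initial=1.0, factor=0.5, steps=4)

# Assumed correlation between the fixed points x = 0.0 and x = 0.5.
# It drops as the length scale shrinks, so rougher functions become plausible.
correlations = [rbf_kernel(0.0, 0.5, ls) for ls in scales]
```

Without a stopping rule, a schedule like this keeps shrinking the length scale indefinitely, which matches the over-exploration concern raised in the abstract.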

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.62)

bf499a12e998d178afd964adf64a60cb-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-13-2026, 22:20:06 GMT

covariance, kernel, weak stationarity, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.51)

6a0480190bbe6b622c7f1d3aa9be9c0f-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-13-2026, 08:00:24 GMT

artificial intelligence, dataset, machine learning, (18 more...)

Neural Information Processing Systems

Country: Atlantic Ocean (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Appendix

Neural Information Processing Systems, Feb-11-2026, 07:11:46 GMT

Fitting T1-mGPLVM to the binned spike data, we found that the inferred latent state was highly correlated with the true head direction (Figure 5b). Here we make this connection more explicit. As described in the main text, the Lie algebra g of a group G is a vector space tangent to G at its identity element. However, because the Lie algebra is isomorphic to R^n, we have found it convenient in both our exposition and our implementation to work directly with the pair (R^n, Exp_G), instead of (g, exp_G). We begin by noting that S^n is not a Lie group unless n = 1 or n = 3; thus we can only apply the ReLie framework to S^1 and S^3.
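As a small illustration of working with a vector space and an exponential map rather than with the abstract Lie algebra, the sketch below covers only the circle group (the n = 1 case). It assumes the standard parametrisation of the circle by unit vectors in the plane. The function names are hypothetical and not taken from the paper's implementation.

```python
import math


def exp_s1(v):
    """Map a tangent coordinate v in R to a point on the unit circle."""
    return (math.cos(v), math.sin(v))


def compose_s1(a, b):
    """Group operation on the circle: multiply the points as unit complex numbers."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


# Adding tangent coordinates corresponds to composing group elements.
theta_sum = compose_s1(exp_s1(0.3), exp_s1(0.4))
direct = exp_s1(0.7)
```

Here `theta_sum` and `direct` agree up to floating-point error, which reflects that on the circle the exponential map turns addition in R into the group operation.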

artificial intelligence, arxivpreprintarxiv, neuron, (17 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Israel > Southern District > Eilat (0.04)

Technology: Information Technology > Artificial Intelligence (0.49)

7 Supplementary Material

Neural Information Processing Systems, Oct-8-2025, 20:29:03 GMT

The sample explanatory features were fed into a multi-layer perceptron, then the learned latent features and sample spatial locations were fed into a Gaussian process model. GP variance is used as the uncertainty measure. We first constructed a spatial graph based on each sample's k-nearest-neighbor by spatial distance. The model contains two GCN layers. It contains a multi-level graph neural network to capture the long-range interactions among particles with linear complexity.
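The k-nearest-neighbor spatial graph mentioned in this excerpt can be sketched as follows. This is a brute-force illustration under assumed Euclidean distance, with a hypothetical function name and toy coordinates. The excerpt does not say how the authors built the graph or which value of k they used.

```python
import math


def knn_graph(points, k):
    """Return {node index: indices of its k nearest neighbours by Euclidean distance}."""
    edges = {}
    for i, p in enumerate(points):
        # Distances from node i to every other node, sorted nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges[i] = [j for _, j in dists[:k]]
    return edges


# Toy spatial locations (assumed for illustration only).
locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbours = knn_graph(locations, k=2)
```

The resulting adjacency lists are directed: node 3 lists nodes 2 and 1 as neighbours, but neither of them lists node 3. A graph neural network layer would typically symmetrise or otherwise normalise such a graph before use.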

artificial intelligence, dataset, machine learning, (18 more...)

Neural Information Processing Systems

Country: Atlantic Ocean (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

We would like to thank the reviewers for taking the time to provide us with helpful feedback and will definitely

Neural Information Processing Systems, Aug-20-2025, 01:01:57 GMT

Below are our clarifications for the questions raised. The stability of the factorization under other kernels will be important to study in future work. Therefore, it may not make sense to compare LFP channels of different rats. In Section 3.3, the horseshoe prior is used on the loadings as an illustration of the methodology when [...] The effective sample size is 276 (median) for the loadings but falls short for the length scales. Bayesian hypothesis testing in general and the null hypothesis formulation in this case are not well-defined.

kernel, reviewer, weak stationarity, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.51)

NeurIPS_wappendix

Christoffer Riis

Neural Information Processing Systems, Aug-14-2025, 18:44:09 GMT

For the GP, the bias-variance trade-off is made by optimizing two hyperparameters: the length scale and the noise term.
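One common way to set these two hyperparameters is to maximise the log marginal likelihood of the data. The sketch below assumes a zero-mean GP with a unit-variance RBF kernel on scalar inputs. It uses a coarse grid search in place of a gradient-based optimiser. All names, the toy data, and the grid values are assumptions for illustration, not the setup used in the paper.

```python
import math


def rbf(x, y, length_scale):
    """Squared-exponential kernel with unit signal variance."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_marginal_likelihood(xs, ys, length_scale, noise_var):
    """log p(y | x) for a zero-mean GP with RBF kernel plus Gaussian noise."""
    n = len(xs)
    k = [[rbf(xs[i], xs[j], length_scale) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    # Forward substitution: solve low @ z = y, so that z.z = y^T K^{-1} y.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][j] * z[j] for j in range(i))) / low[i][i])
    data_fit = -0.5 * sum(v * v for v in z)
    complexity = -sum(math.log(low[i][i]) for i in range(n))  # -0.5 * log|K|
    return data_fit + complexity - 0.5 * n * math.log(2 * math.pi)


# Toy data and a coarse grid over (length scale, noise variance).
xs, ys = [0.0, 1.0, 2.0, 3.0], [0.0, 0.8, 0.9, 0.1]
grid = [(ls, nv) for ls in (0.3, 1.0, 3.0) for nv in (0.01, 0.1, 1.0)]
best = max(grid, key=lambda p: log_marginal_likelihood(xs, ys, *p))
```

The data-fit term rewards hyperparameters that explain the observations. The complexity term penalises overly flexible models. This is how the marginal likelihood expresses the bias-variance trade-off through the length scale and the noise variance.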

acquisition function, hyperparameter, posterior, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

An Empirical Bernstein Inequality for Dependent Data in Hilbert Spaces and Applications

Mirzaei, Erfan, Maurer, Andreas, Kostic, Vladimir R., Pontil, Massimiliano

arXiv.org Machine Learning, Jul-11-2025

Learning from non-independent and non-identically distributed data poses a persistent challenge in statistical learning. In this study, we introduce data-dependent Bernstein inequalities tailored for vector-valued processes in Hilbert space. Our inequalities apply to both stationary and non-stationary processes and exploit the potential rapid decay of correlations between temporally separated variables to improve estimation. We demonstrate the utility of these bounds by applying them to covariance operator estimation in the Hilbert-Schmidt norm and to operator learning in dynamical systems, achieving novel risk bounds. Finally, we perform numerical experiments to illustrate the practical implications of these bounds in both contexts.

artificial intelligence, inequality, machine learning, (19 more...)

arXiv.org Machine Learning

2507.07826

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Bayesian Optimisation with Unknown Hyperparameters: Regret Bounds Logarithmically Closer to Optimal

Neural Information Processing Systems, May-27-2025, 10:33:23 GMT

Bayesian Optimization (BO) is widely used for optimising black-box functions but requires us to specify the length scale hyperparameter, which defines the smoothness of the functions the optimizer will consider. Most current BO algorithms choose this hyperparameter by maximizing the marginal likelihood of the observed data, albeit risking misspecification if the objective function is less smooth in regions we have not yet explored. The only prior solution addressing this problem with theoretical guarantees was A-GP-UCB, proposed by Berkenkamp et al. (2019). This algorithm progressively decreases the length scale, expanding the class of functions considered by the optimizer. However, A-GP-UCB lacks a stopping mechanism, leading to over-exploration and slow convergence.

bayesian optimisation, regret bound logarithmically, unknown hyperparameter, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Data-driven Approach for Interpolation of Sparse Data

Ferguson, R. F., Ireland, D. G., McKinnon, B.

arXiv.org Machine Learning, May-6-2025

Extracting information about hadron resonances requires fitting theoretical models to experimental data. However, this data often comes from different experiments measuring different physics quantities in varying kinematic regions; studying coupled channels with different kinematic coverages and binning can make direct comparison challenging. The consistency of these datasets directly impacts the quality of the fit, making it difficult to accurately constrain the theoretical models. Sparse datasets in key kinematic regions further complicate the quantification of uncertainties, often requiring arbitrary weighting that may introduce bias. A robust approach to solving these problems involves utilising Gaussian Processes (GPs), a Bayesian inference machine learning technique that provides probabilistic predictions for unknown datapoints. Unlike traditional machine learning methods, GPs do not require any training; instead, they operate on three fundamental assumptions: 1. Some kernel function can be defined to measure the covariance between known datapoints; 2. This same kernel function can be used to predict the covariance between unknown datapoints; 3. Some idea of the form of the posterior distribution is known (e.g.

artificial intelligence, datapoint, machine learning, (18 more...)

arXiv.org Machine Learning

2505.01473

Country:

Europe > United Kingdom (0.14)
North America > United States > Virginia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
