AITopics | student-t process

We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations.

artificial intelligence, machine learning, sequence, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Student-t processes as infinite-width limits of posterior Bayesian neural networks

Caporali, Francesco, Favaro, Stefano, Trevisan, Dario

arXiv.org Machine Learning, Feb-6-2025

The asymptotic properties of Bayesian Neural Networks (BNNs) have been extensively studied, particularly regarding their approximations by Gaussian processes in the infinite-width limit. We extend these results by showing that posterior BNNs can be approximated by Student-t processes, which offer greater flexibility in modeling uncertainty. Specifically, we show that, if the parameters of a BNN follow a Gaussian prior distribution, and the variance of both the last hidden layer and the Gaussian likelihood function follows an Inverse-Gamma prior distribution, then the resulting posterior BNN converges to a Student-t process in the infinite-width limit. Our proof leverages the Wasserstein metric to establish control over the convergence rate of the Student-t process approximation.
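
The role of the Inverse-Gamma prior can be illustrated with the standard finite-dimensional scale-mixture identity. This is a textbook fact rather than the paper's infinite-width theorem, and the symbols $a$, $b$, $K$, and $n$ are introduced here purely for illustration:

```latex
\sigma^2 \sim \mathrm{Inv\text{-}Gamma}(a, b), \qquad
f \mid \sigma^2 \sim \mathcal{N}(0, \sigma^2 K), \quad f \in \mathbb{R}^n
\;\Longrightarrow\;
p(f) = \int_0^\infty \mathcal{N}(f;\, 0, \sigma^2 K)\,
       \mathrm{Inv\text{-}Gamma}(\sigma^2;\, a, b)\, d\sigma^2
\;\propto\; \left(1 + \frac{f^\top K^{-1} f}{2b}\right)^{-(a + n/2)}
```

The right-hand side is the kernel of a multivariate Student-t density with $\nu = 2a$ degrees of freedom, zero location, and scale matrix $(b/a)K$, which is why mixing a Gaussian over an Inverse-Gamma variance yields heavier tails.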

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2502.04247

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Information Geometry and Beta Link for Optimizing Sparse Variational Student-t Processes

Xu, Jian, Zeng, Delu, Paisley, John

arXiv.org Artificial Intelligence, Aug-13-2024

Recently, a sparse version of Student-t Processes, termed sparse variational Student-t Processes, has been proposed to enhance computational efficiency and flexibility for real-world datasets using stochastic gradient descent. However, traditional gradient descent methods like Adam may not fully exploit the parameter space geometry, potentially leading to slower convergence and suboptimal performance. To mitigate these issues, we adopt natural gradient methods from information geometry for variational parameter optimization of Student-t Processes. This approach leverages the curvature and structure of the parameter space, utilizing tools such as the Fisher information matrix which is linked to the Beta function in our model. This method provides robust mathematical support for the natural gradient algorithm when using Student's t-distribution as the variational distribution. Additionally, we present a mini-batch algorithm for efficiently computing natural gradients. Experimental results across four benchmark datasets demonstrate that our method consistently accelerates convergence speed.
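
For orientation, the generic natural gradient update can be written as below. This is the general textbook form only; the paper's specific Fisher information matrix and its link to the Beta function are not reproduced here. The step size $\eta$, the loss $\mathcal{L}$, and the variational distribution $q_\theta(u)$ are notation assumed for this sketch:

```latex
\theta_{t+1} = \theta_t - \eta\, F(\theta_t)^{-1}\, \nabla_\theta \mathcal{L}(\theta_t),
\qquad
F(\theta) = \mathbb{E}_{q_\theta(u)}\!\left[
  \nabla_\theta \log q_\theta(u)\; \nabla_\theta \log q_\theta(u)^\top
\right]
```

Preconditioning by $F(\theta)^{-1}$ makes the step invariant to reparameterization of $q_\theta$, which is the sense in which the method exploits the geometry of the parameter space.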

algorithm, dataset, matrix, (12 more...)

arXiv.org Artificial Intelligence

2408.06699

Country: Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.77)

Sparse Variational Student-t Processes

Xu, Jian, Zeng, Delu

arXiv.org Artificial Intelligence, Dec-9-2023

The theory of Bayesian learning incorporates the use of Student-t Processes to model heavy-tailed distributions and datasets with outliers. However, despite Student-t Processes having computational complexity similar to that of Gaussian Processes, there has been limited emphasis on the sparse representation of this model. This is mainly due to the increased difficulty in modeling and computation compared to previous sparse Gaussian Processes. Our motivation is to address the need for a sparse representation framework that reduces computational complexity, allowing Student-t Processes to be more flexible for real-world datasets. To achieve this, we leverage the conditional distribution of Student-t Processes to introduce sparse inducing points. Bayesian methods and variational inference are then utilized to derive a well-defined lower bound, facilitating more efficient optimization of our model through stochastic gradient descent. We propose two methods for computing the variational lower bound, one utilizing Monte Carlo sampling and the other employing Jensen's inequality to compute the KL regularization term in the loss function. We propose adopting these approaches as viable alternatives to Gaussian processes when the data might contain outliers or exhibit heavy-tailed behavior, and we provide specific recommendations for their applicability. We evaluate the two proposed approaches on various synthetic and real-world datasets from UCI and Kaggle, demonstrating their effectiveness compared to baseline methods in terms of computational complexity and accuracy, as well as their robustness to outliers.
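
The heavy-tail argument can be made concrete with a small univariate comparison. The sketch below is not taken from the paper: the function names and the choice of `nu=3.0` are hypothetical, and it only contrasts how a Gaussian and a Student-t density score a point far from the mean.

```python
import math


def gaussian_logpdf(x, loc=0.0, scale=1.0):
    """Log density of a univariate Gaussian."""
    z = (x - loc) / scale
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * z * z


def student_t_logpdf(x, nu, loc=0.0, scale=1.0):
    """Log density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - loc) / scale
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(scale)
    )
    # The tail decays polynomially in z, not exponentially as in the Gaussian.
    return log_norm - (nu + 1.0) / 2.0 * math.log1p(z * z / nu)


if __name__ == "__main__":
    # nu=3.0 is an illustrative choice, not a value from the paper.
    for x in (0.0, 2.0, 6.0):
        print(x, gaussian_logpdf(x), student_t_logpdf(x, nu=3.0))
```

At a point six standard units from the mean, the Student-t log density is far less negative than the Gaussian one, so a single outlier pulls a Student-t fit much less strongly. As `nu` grows, the two densities coincide.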

dataset, gaussian process, student-t process, (13 more...)

arXiv.org Artificial Intelligence

2312.05568

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Asia > China > Guangdong Province (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

One-parameter family of acquisition functions for efficient global optimization

Kanazawa, Takuya

arXiv.org Artificial Intelligence, Apr-26-2021

In diverse fields of science and engineering, one frequently faces the need to know the optimum of a black-box function that is expensive to evaluate. In materials science, in order to determine an optimal composition of alloys one has to repeat manual experiments that cost time and money. In machine learning model building, one has to tune a number of hyperparameters of a model but testing the performance of a model on big data via cross validation takes hours or even days. Thus, a framework is needed that provides a systematic means to minimize the number of queries needed to reach the optimal solution. Bayesian optimization (BO) [1-3] is a powerful methodology to seek an optimum of a black-box function without knowledge of its analytical properties, such as its gradient.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IJCNN55064.2022.9892219

2104.12363

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

BRUNO: A Deep Recurrent Model for Exchangeable Data

Korshunova, Iryna, Degrave, Jonas, Huszar, Ferenc, Gal, Yarin, Gretton, Arthur, Dambre, Joni

Neural Information Processing Systems, Dec-31-2018

We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at the heart of Bayesian inference. The model does not require variational approximations to train, and new samples can be generated conditional on previous samples, with cost linear in the size of the conditioning set. The advantages of our architecture are demonstrated on learning tasks that require generalisation from short observed sequences while modelling sequence variability, such as conditional image generation, few-shot learning, and anomaly detection.
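
The exchangeability property named in the abstract can be stated formally as follows. This is the standard definition, with the index set and the permutation symbol $\pi$ chosen here for illustration:

```latex
p(x_1, x_2, \dots, x_n) = p\big(x_{\pi(1)}, x_{\pi(2)}, \dots, x_{\pi(n)}\big)
\quad \text{for every permutation } \pi \text{ of } \{1, \dots, n\}
```

In words, reordering the observations leaves their joint probability unchanged, so predictions conditioned on a set do not depend on the order in which its elements were seen.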

A Generative Deep Recurrent Model for Exchangeable Data

Korshunova, Iryna, Degrave, Jonas, Huszár, Ferenc, Gal, Yarin, Gretton, Arthur, Dambre, Joni

arXiv.org Machine Learning, Feb-21-2018

We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at the heart of Bayesian inference. The model does not require variational approximations to train, and new samples can be generated conditional on previous samples, with cost linear in the size of the conditioning set. The advantages of our architecture are demonstrated on learning tasks requiring generalisation from short observed sequences while modelling sequence variability, such as conditional image generation, few-shot learning, set completion, and anomaly detection.

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Machine Learning

1802.07535

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Practical Bayesian optimization in the presence of outliers

Martinez-Cantin, Ruben, Tee, Kevin, McCourt, Michael

arXiv.org Machine Learning, Dec-12-2017

Inference in the presence of outliers is an important field of research as outliers are ubiquitous and may arise across a variety of problems and domains. Bayesian optimization is a method that relies heavily on probabilistic inference. This allows outstanding sample efficiency because the probabilistic machinery provides a memory of the whole optimization process. However, that virtue becomes a disadvantage when the memory is populated with outliers, inducing bias in the estimation. In this paper, we present an empirical evaluation of Bayesian optimization methods in the presence of outliers. The empirical evidence shows that Bayesian optimization with robust regression often produces suboptimal results. We then propose a new algorithm which combines robust regression (a Gaussian process with Student-t likelihood) with outlier diagnostics to classify data points as outliers or inliers. By using a scheduler for the classification of outliers, our method is more efficient and converges better than standard robust regression. Furthermore, we show that even in controlled situations with no expected outliers, our method is able to produce better results.

artificial intelligence, machine learning, outlier, (19 more...)

arXiv.org Machine Learning

1712.04567

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(2 more...)

Hypervolume-based Multi-objective Bayesian Optimization with Student-t Processes

van der Herten, Joachim, Couckuyt, Ivo, Dhaene, Tom

arXiv.org Machine Learning, Dec-1-2016

Student-$t$ processes have recently been proposed as an appealing alternative nonparametric function prior. They feature enhanced flexibility and predictive variance. In this work the use of Student-$t$ processes is explored for multi-objective Bayesian optimization. In particular, an analytical expression for the hypervolume-based probability of improvement is developed for independent Student-$t$ process priors of the objectives. Its effectiveness is shown on a multi-objective optimization problem which is known to be difficult with traditional Gaussian processes.

artificial intelligence, machine learning, optimization, (13 more...)

arXiv.org Machine Learning

1612.00393

Country: Europe > Belgium > Flanders (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
