Matsubara, Takuo
Wasserstein Gradient Boosting: A General Framework with Applications to Posterior Regression
Matsubara, Takuo
Gradient boosting is a sequential ensemble method that fits a new base learner to the gradient of the remaining loss at each step. We propose a novel family of gradient boosting methods, Wasserstein gradient boosting, which fits a new base learner to an exactly or approximately available Wasserstein gradient of a loss functional on the space of probability distributions. Wasserstein gradient boosting returns a set of particles that approximates a target probability distribution assigned to each input. In probabilistic prediction, a parametric probability distribution is often specified on the space of output variables, and a point estimate of the output-distribution parameter is produced for each input by a model. Our main application of Wasserstein gradient boosting is a novel distributional estimate of the output-distribution parameter, which approximates the posterior distribution over the output-distribution parameter determined pointwise at each data point. We empirically demonstrate the superior performance of probabilistic prediction by Wasserstein gradient boosting in comparison with various existing methods.
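To make the mechanism concrete, here is a heavily simplified toy sketch of the boosting loop described above: particles attached to each input are moved by base learners fitted to an approximate Wasserstein gradient of a KL-type loss towards a per-input Gaussian target. The kernel-density score estimate, the per-particle regression trees, and all tuning constants are assumptions made for illustration, not the construction used in the paper.

```python
# Toy sketch (not the paper's algorithm): per-input particles are moved by regression trees
# fitted to an approximate Wasserstein gradient of KL(current || target), where each input x
# has a Gaussian "target posterior" N(sin(x), 0.5^2) over a scalar parameter.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.linspace(-2, 2, 50).reshape(-1, 1)
target_mean, target_std = np.sin(X).ravel(), 0.5

n_particles, learning_rate = 20, 0.1
particles = rng.normal(0.0, 1.0, size=(X.shape[0], n_particles))  # initial particles per input

def target_score(z):
    """Score (gradient of the log density) of the per-input target N(target_mean, target_std^2)."""
    return -(z - target_mean[:, None]) / target_std**2

def current_score(z, bandwidth=0.3):
    """Crude kernel-density estimate of the score of the current particle distribution per input."""
    diffs = z[:, :, None] - z[:, None, :]                 # shape (n_inputs, P, P)
    w = np.exp(-0.5 * (diffs / bandwidth) ** 2)
    return (-(diffs / bandwidth**2) * w).sum(-1) / w.sum(-1)

ensemble = []
for _ in range(100):
    # Wasserstein gradient of KL(q || p) evaluated at each particle: score_q - score_p.
    grad = current_score(particles) - target_score(particles)
    # Fit one small tree per particle index to the negative gradient, as in gradient boosting.
    round_learners = []
    for p in range(n_particles):
        tree = DecisionTreeRegressor(max_depth=3).fit(X, -grad[:, p])
        particles[:, p] += learning_rate * tree.predict(X)
        round_learners.append(tree)
    ensemble.append(round_learners)

print("first input: particle mean %.2f vs target mean %.2f" % (particles[0].mean(), target_mean[0]))
```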
Generalised Bayesian Inference for Discrete Intractable Likelihood
Matsubara, Takuo, Knoblauch, Jeremias, Briol, François-Xavier, Oates, Chris J.
Discrete state spaces represent a major computational challenge to statistical inference, since the computation of normalisation constants requires summation over large or possibly infinite sets, which can be impractical. This paper addresses this computational challenge through the development of a novel generalised Bayesian inference procedure suitable for discrete intractable likelihood. Inspired by recent methodological advances for continuous data, the main idea is to update beliefs about model parameters using a discrete Fisher divergence, in lieu of the problematic intractable likelihood. The result is a generalised posterior that can be sampled from using standard computational tools, such as Markov chain Monte Carlo, circumventing the intractable normalising constant. The statistical properties of the generalised posterior are analysed, with sufficient conditions for posterior consistency and asymptotic normality established. In addition, a novel and general approach to calibration of generalised posteriors is proposed. Applications are presented on lattice models for discrete spatial data and on multivariate models for count data, where in each case the methodology facilitates generalised Bayesian inference at low computational cost.
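As a rough illustration of the recipe, not the paper's exact divergence or models, the sketch below builds a generalised posterior for a Poisson-type count model from a discrete-score-matching-style loss that uses only ratios of the unnormalised pmf, so the normalising constant never appears, and samples it with random-walk Metropolis. The specific loss form, the Exp(1) prior, and the weight `beta` are assumptions for this toy.

```python
# A minimal sketch, assuming a Poisson-type count model with unnormalised pmf p(x) ∝ theta^x / x!.
# The loss below uses only the pmf ratios x/theta and (x+1)/theta, so no normalising constant
# is needed; it is a stand-in for the paper's discrete Fisher divergence, not its exact form.
import numpy as np

rng = np.random.default_rng(1)
data = rng.poisson(lam=4.0, size=200)           # toy count data

def ratio_loss(theta, x):
    """Average of (p(x-1)/p(x))^2 - 2 * p(x)/p(x+1); for this model the population minimiser
    is the true rate, so the generalised posterior concentrates around it."""
    return np.mean((x / theta) ** 2 - 2.0 * (x + 1.0) / theta)

beta, n = 1.0, len(data)

def log_gen_posterior(theta):
    if theta <= 0:
        return -np.inf
    log_prior = -theta                          # Exp(1) prior on theta > 0 (an arbitrary choice)
    return log_prior - beta * n * ratio_loss(theta, data)

# Random-walk Metropolis on the generalised posterior: no intractable constant is ever evaluated.
theta, samples = 1.0, []
for _ in range(5000):
    proposal = theta + 0.2 * rng.standard_normal()
    if np.log(rng.uniform()) < log_gen_posterior(proposal) - log_gen_posterior(theta):
        theta = proposal
    samples.append(theta)

print("generalised-posterior mean:", np.mean(samples[1000:]))
```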
TCE: A Test-Based Approach to Measuring Calibration Error
Matsubara, Takuo, Tax, Niek, Mudd, Richard, Guy, Ido
This paper proposes a new metric to measure the calibration error of probabilistic binary classifiers, called test-based calibration error (TCE). TCE incorporates a novel loss function based on a statistical test to examine the extent to which model predictions differ from probabilities estimated from data. It offers (i) a clear interpretation, (ii) a consistent scale that is unaffected by class imbalance, and (iii) an enhanced visual representation with respect to the standard reliability diagram. In addition, we introduce an optimality criterion for the binning procedure of calibration error metrics.
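The following is a hypothetical sketch of the "test-based" idea rather than the paper's exact TCE: predictions are grouped into equal-mass bins, a binomial test per bin asks whether the observed label frequency is consistent with the mean predicted probability, and the reported score is the share of predictions falling into bins where the test rejects. The function name, binning rule, and significance level are all illustrative assumptions.

```python
# Hypothetical test-based calibration check (not the paper's exact TCE definition).
import numpy as np
from scipy.stats import binomtest

def test_based_calibration_score(probs, labels, n_bins=10, alpha=0.05):
    probs, labels = np.asarray(probs, dtype=float), np.asarray(labels)
    edges = np.quantile(probs, np.linspace(0, 1, n_bins + 1))        # equal-mass bins
    bin_ids = np.clip(np.searchsorted(edges, probs, side="right") - 1, 0, n_bins - 1)
    rejected = 0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.sum() == 0:
            continue
        k, n = int(labels[mask].sum()), int(mask.sum())
        p = float(np.clip(probs[mask].mean(), 1e-6, 1 - 1e-6))
        # Binomial test: do the observed positives in this bin contradict the mean prediction?
        if binomtest(k, n, p).pvalue < alpha:
            rejected += mask.sum()
    return rejected / len(probs)      # share of predictions in bins flagged as miscalibrated

# Toy usage: an over-confident predictor on imbalanced data.
rng = np.random.default_rng(0)
true_p = rng.beta(1, 5, size=5000)
labels = rng.binomial(1, true_p)
overconfident = np.clip(true_p ** 0.5, 0, 1)
print(test_based_calibration_score(overconfident, labels))
```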
Robust Generalised Bayesian Inference for Intractable Likelihoods
Matsubara, Takuo, Knoblauch, Jeremias, Briol, François-Xavier, Oates, Chris J.
Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible misspecification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models.
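To show the shape of such a loss-based update, here is a minimal one-dimensional sketch of a generalised posterior built from a kernel Stein discrepancy. The Gaussian location model is used only because its score function is simple (the paper's setting targets unnormalised models), and the kernel, bandwidth, prior, and weight `beta` are illustrative choices.

```python
# 1-D sketch of a kernel-Stein-discrepancy-based generalised posterior (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(1.5, 1.0, size=100)          # observations (outliers could be mixed in here)

def ksd_squared(theta, x, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy between N(theta, 1)
    and the empirical distribution of x, using a Gaussian kernel of bandwidth h."""
    d = x[:, None] - x[None, :]
    k = np.exp(-0.5 * d**2 / h**2)
    dkx = -d / h**2 * k                      # derivative of k in its first argument
    dky = d / h**2 * k                       # derivative of k in its second argument
    dkxy = (1.0 / h**2 - d**2 / h**4) * k    # mixed second derivative
    s = -(x - theta)                         # score of N(theta, 1)
    u = s[:, None] * s[None, :] * k + s[:, None] * dky + s[None, :] * dkx + dkxy
    return u.mean()

# Generalised posterior on a grid: N(0, 10^2) prior, weight beta (a tuning choice).
thetas = np.linspace(-2, 5, 400)
beta = 1.0
log_post = -0.5 * thetas**2 / 100 - beta * len(x) * np.array([ksd_squared(t, x) for t in thetas])
post = np.exp(log_post - log_post.max())
post /= np.trapz(post, thetas)
print("generalised-posterior mean:", np.trapz(thetas * post, thetas))
```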
The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks
Matsubara, Takuo, Oates, Chris J., Briol, François-Xavier
Bayesian neural networks attempt to combine the strong predictive performance of neural networks with formal quantification of uncertainty associated with the predictive output in the Bayesian framework. However, it remains unclear how to endow the parameters of the network with a prior distribution that is meaningful when lifted into the output space of the network. A possible solution is proposed that enables the user to posit an appropriate covariance function for the task at hand. Our approach constructs a prior distribution for the parameters of the network, called a ridgelet prior, that approximates the posited covariance structure in the output space of the network. The approach is rooted in the ridgelet transform and we establish both finite-sample-size error bounds and the consistency of the approximation of the covariance function in a limit where the number of hidden units is increased. Our experimental assessment is limited to a proof-of-concept, where we demonstrate that the ridgelet prior can outperform an unstructured prior on regression problems for which an informative covariance function can be provided a priori.
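Not the ridgelet construction itself, but a small sketch of the gap it is designed to close: function-space draws from a one-hidden-layer network with an unstructured standard normal prior on its parameters, compared with draws from a Gaussian process carrying the squared-exponential covariance a user might actually wish to encode. The network width, activation, and length-scale are arbitrary illustrative choices.

```python
# Illustration of the mismatch between a weight-space prior and a desired function-space covariance.
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 200)

def bnn_prior_draw(width=500):
    """One function draw from a width-500 tanh network with i.i.d. standard normal parameters."""
    w1, b1 = rng.normal(size=(width, 1)), rng.normal(size=width)
    w2, b2 = rng.normal(size=width) / np.sqrt(width), rng.normal()
    return np.tanh(x[:, None] * w1.T + b1) @ w2 + b2

def gp_prior_draw(lengthscale=1.0):
    """One draw from a GP with squared-exponential covariance (unit variance)."""
    K = np.exp(-0.5 * (x[:, None] - x[None, :])**2 / lengthscale**2) + 1e-8 * np.eye(len(x))
    return np.linalg.cholesky(K) @ rng.normal(size=len(x))

bnn_samples = np.stack([bnn_prior_draw() for _ in range(100)])
gp_samples = np.stack([gp_prior_draw() for _ in range(100)])

# Compare empirical prior covariances at a pair of inputs: the unstructured weight prior does
# not reproduce the covariance structure the GP encodes, which is the gap the ridgelet prior targets.
i, j = 50, 150
print("BNN prior cov:", np.cov(bnn_samples[:, i], bnn_samples[:, j])[0, 1])
print("GP prior cov: ", np.cov(gp_samples[:, i], gp_samples[:, j])[0, 1])
```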
Integral representation of the global minimizer
Sonoda, Sho, Ishikawa, Isao, Ikeda, Masahiro, Hagihara, Kei, Sawano, Yoshihiro, Matsubara, Takuo, Murata, Noboru
We have obtained an integral representation of the shallow neural network that attains the global minimum of its backpropagation (BP) training problem. In unpublished numerical simulations conducted several years before this study, we had noticed that such an integral representation might exist, but it had not been proven until now. First, we introduced a Hilbert space of coefficient functions, and a reproducing kernel Hilbert space (RKHS) of hypotheses, associated with the integral representation. The RKHS reflects the approximation ability of neural networks. Second, we established ridgelet analysis on the RKHS; the analytic properties of the integral representation are remarkably clear. Third, we reformulated BP training as an optimization problem in the space of coefficient functions and obtained a formal expression for the unique global minimizer, following Tikhonov regularization theory. Finally, we demonstrated that the global minimizer is the shrink ridgelet transform. Since the relation between an integral representation and an ordinary finite network is not clear, even though BP is convex in the integral representation, we cannot immediately answer questions such as "Is a local minimum a global minimum?" However, the obtained integral representation provides an explicit expression of the global minimizer, without linearity-like assumptions such as partial linearity and monotonicity. Furthermore, it indicates that the ordinary ridgelet transform provides the minimum-norm solution to the original training equation.
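For reference, the generic form of the integral representation and the ridgelet transform referred to above can be written as follows; these are standard conventions, which may differ in detail from those adopted in the paper.

```latex
% Integral representation of a shallow network with activation \eta and coefficient function T:
\[
  S[T](x) \;=\; \int_{\mathbb{R}^{m}\times\mathbb{R}} T(a,b)\, \eta(a\cdot x - b)\, \mathrm{d}a\,\mathrm{d}b .
\]
% Ridgelet transform of a function f with respect to a ridgelet function \psi:
\[
  R[f](a,b) \;=\; \int_{\mathbb{R}^{m}} f(x)\, \overline{\psi(a\cdot x - b)}\, \mathrm{d}x .
\]
% When the pair (\psi, \eta) satisfies an admissibility condition, reconstruction holds up to a
% constant, S[R[f]] = f, so R[f] is a coefficient function that represents f exactly; the abstract's
% claim is that the BP global minimizer is a shrunk version of this transform.
```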