AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Convergence Rates for Gaussian Mixtures of Experts

Ho, Nhat, Yang, Chiao-Yu, Jordan, Michael I.

arXiv.org Machine LearningJul-9-2019

We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish the convergence rates of the maximum likelihood estimation (MLE) for these models. Our proof technique is based on a novel notion of \emph{algebraic independence} of the expert functions. Drawing on optimal transport theory, we establish a connection between the algebraic independence and a certain class of partial differential equations (PDEs). Exploiting this connection allows us to derive convergence rates and minimax lower bounds for parameter estimation.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1907.04377

Country: North America (0.27)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Add feedback

Bayesian deep learning with hierarchical prior: Predictions from limited and noisy data

Luo, Xihaier, Kareem, Ahsan

arXiv.org Machine LearningJul-8-2019

Datasets in engineering applications are often limited and contaminated, mainly due to unavoidable measurement noise and signal distortion. Thus, using conventional data-driven approaches to build a reliable discriminative model, and further applying this identified surrogate to uncertainty analysis remains to be very challenging. A deep learning approach is presented to provide predictions based on limited and noisy data. To address noise perturbation, the Bayesian learning method that naturally facilitates an automatic updating mechanism is considered to quantify and propagate model uncertainties into predictive quantities. Specifically, hierarchical Bayesian modeling (HBM) is first adopted to describe model uncertainties, which allows the prior assumption to be less subjective, while also makes the proposed surrogate more robust. Next, the Bayesian inference is seamlessly integrated into the DL framework, which in turn supports probabilistic programming by yielding a probability distribution of the quantities of interest rather than their point estimates. Variational inference (VI) is implemented for the posterior distribution analysis where the intractable marginalization of the likelihood function over parameter space is framed in an optimization format, and stochastic gradient descent method is applied to solve this optimization problem. Finally, Monte Carlo simulation is used to obtain an unbiased estimator in the predictive phase of Bayesian inference, where the proposed Bayesian deep learning (BDL) scheme is able to offer confidence bounds for the output estimation by analyzing propagated uncertainties. The effectiveness of Bayesian shrinkage is demonstrated in improving predictive performance using contaminated data, and various examples are provided to illustrate concepts, methodologies, and algorithms of this proposed BDL modeling technique.

deep learning, posterior distribution, upstream oil & gas, (18 more...)

arXiv.org Machine Learning

1907.0424

Country:

North America > United States (0.67)
Asia (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Asymptotic Bayes risk for Gaussian mixture in a semi-supervised setting

Lelarge, Marc, Miolane, Leo

arXiv.org Machine LearningJul-8-2019

Semi-supervised learning (SSL) uses unlabeled data for training and has been shown to greatly improve performances when compared to a supervised approach on the labeled data available. This claim depends both on the amount of labeled data available and on the algorithm used. In this paper, we compute analytically the gap between the best fully-supervised approach on labeled data and the best semi-supervised approach using both labeled and unlabeled data. We quantify the best possible increase in performance obtained thanks to the unlabeled data, i.e. we compute the accuracy increase due to the information contained in the unlabeled data. Our work deals with a simple high-dimensional Gaussian mixture model for the data in a Bayesian setting. Our rigorous analysis builds on recent theoretical breakthroughs in high-dimensional inference and a large body of mathematical tools from statistical physics initially developed for spin glasses.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1907.03792

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Variational Inference MPC for Bayesian Model-based Reinforcement Learning

Okada, Masashi, Taniguchi, Tadahiro

arXiv.org Machine LearningJul-7-2019

In recent studies on model-based reinforcement learning (MBRL), incorporating uncertainty in forward dynamics is a state-of-the-art strategy to enhance learning performance, making MBRLs competitive to cutting-edge model free methods, especially in simulated robotics tasks. Probabilistic ensembles with trajectory sampling (PETS) is a leading type of MBRL, which employs Bayesian inference to dynamics modeling and model predictive control (MPC) with stochastic optimization via the cross entropy method (CEM). In this paper, we propose a novel extension to the uncertainty-aware MBRL. Our main contributions are twofold: Firstly, we introduce a variational inference MPC, which reformulates various stochastic methods, including CEM, in a Bayesian fashion. Secondly, we propose a novel instance of the framework, called probabilistic action ensembles with trajectory sampling (PaETS). As a result, our Bayesian MBRL can involve multimodal uncertainties both in dynamics and optimal trajectories. In comparison to PETS, our method consistently improves asymptotic performance on several challenging locomotion tasks.

downstream oil & gas, neural network, trajectory, (19 more...)

arXiv.org Machine Learning

1907.04202

Genre:

Research Report > New Finding (0.94)
Research Report > Promising Solution (0.66)

Industry: Energy > Oil & Gas > Downstream (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Add feedback

Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss

Jehl, Laura, Lawrence, Carolin, Riezler, Stefan

arXiv.org Machine LearningJul-6-2019

In many machine learning scenarios, supervision by gold labels is not available and consequently neural models cannot be trained directly by maximum likelihood estimation (MLE). In a weak supervision scenario, metric-augmented objectives can be employed to assign feedback to model outputs, which can be used to extract a supervision signal for training. We present several objectives for two separate weakly supervised tasks, machine translation and semantic parsing. We show that objectives should actively discourage negative outputs in addition to promoting a surrogate gold structure. This notion of bipolarity is naturally present in ramp loss objectives, which we adapt to neural models. We show that bipolar ramp loss objectives outperform other non-bipolar ramp loss objectives and minimum risk training (MRT) on both weakly supervised tasks, as well as on a supervised machine translation task. Additionally, we introduce a novel token-level ramp loss objective, which is able to outperform even the best sequence-level ramp loss on both weakly supervised tasks.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

1907.03748

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(20 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.89)
(3 more...)

Add feedback

An Approximate Bayesian Approach to Surprise-Based Learning

Liakoni, Vasiliki, Modirshanechi, Alireza, Gerstner, Wulfram, Brea, Johanni

arXiv.org Machine LearningJul-5-2019

Surprise-based learning allows agents to adapt quickly in non-stationary stochastic environments. Most existing approaches to surprise-based learning and change point detection assume either implicitly or explicitly a simple, hierarchical generative model of observation sequences that are characterized by stationary periods separated by sudden changes. In this work we show that exact Bayesian inference gives naturally rise to a surprise-modulated trade-off between forgetting and integrating the new observations with the current belief. We demonstrate that many existing approximate Bayesian approaches also show surprise-based modulation of learning rates, and we derive novel particle filters and variational filters with update rules that exhibit surprise-based modulation. Our derived filters have a constant scaling in observation sequence length and particularly simple update dynamics for any distribution in the exponential family. Empirical results show that these filters estimate parameters better than alternative approximate approaches and reach comparative levels of performance to computationally more expensive algorithms. The theoretical insight of casting various approaches under the same interpretation of surprise-based learning, as well as the proposed filters, may find useful applications in reinforcement learning in non-stationary environments and in the analysis of animal and human behavior.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1907.02936

Country: Europe (0.46)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Geodesic Learning via Unsupervised Decision Forests

Madhyastha, Meghana, Li, Percy, Browne, James, Strnadova-Neeley, Veronika, Priebe, Carey E., Burns, Randal, Vogelstein, Joshua T.

arXiv.org Machine LearningJul-5-2019

Geodesic distance is the shortest path between two points in a Riemannian manifold. Manifold learning algorithms, such as Isomap, seek to learn a manifold that preserves geodesic distances. However, such methods operate on the ambient dimensionality, and are therefore fragile to noise dimensions. We developed an unsupervised random forest method (URerF) to approximately learn geodesic distances in linear and nonlinear manifolds with noise. URerF operates on low-dimensional sparse linear combinations of features, rather than the full observed dimensionality. To choose the optimal split in a computationally efficient fashion, we developed a fast Bayesian Information Criterion statistic for Gaussian mixture models. We introduce geodesic precision-recall curves which quantify performance relative to the true latent manifold. Empirical results on simulated and real data demonstrate that URerF is robust to high-dimensional noise, where as other methods, such as Isomap, UMAP, and FLANN, quickly deteriorate in such settings. In particular, URerF is able to estimate geodesic distances on a real connectome dataset better than other approaches.

artificial intelligence, dimension, machine learning, (16 more...)

arXiv.org Machine Learning

1907.02844

Country: North America > United States > New York (0.28)

Genre: Research Report (0.50)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)
(2 more...)

Add feedback

Data-Centric Mixed-Variable Bayesian Optimization For Materials Design

Iyer, Akshay, Zhang, Yichi, Prasad, Aditya, Tao, Siyu, Wang, Yixing, Schadler, Linda, Brinson, L Catherine, Chen, Wei

arXiv.org Machine LearningJul-4-2019

Materials design can be cast as an optimization problem with the goal of achieving desired properties, by varying material composition, microstructure morphology, and processing conditions. Existence of both qualitative and quantitative material design variables leads to disjointed regions in property space, making the search for optimal design challenging. Limited availability of experimental data and the high cost of simulations magnify the challenge. This situation calls for design methodologies that can extract useful information from existing data and guide the search for optimal designs efficiently. To this end, we present a data-centric, mixed-variable Bayesian Optimization framework that integrates data from literature, experiments, and simulations for knowledge discovery and computational materials design. Our framework pivots around the Latent Variable Gaussian Process (LVGP), a novel Gaussian Process technique which projects qualitative variables on a continuous latent space for covariance formulation, as the surrogate model to quantify "lack of data" uncertainty. Expected improvement, an acquisition criterion that balances exploration and exploitation, helps navigate a complex, nonlinear design space to locate the optimum design. The proposed framework is tested through a case study which seeks to concurrently identify the optimal composition and morphology for insulating polymer nanocomposites. We also present an extension of mixed-variable Bayesian Optimization for multiple objectives to identify the Pareto Frontier within tens of iterations. These findings project Bayesian Optimization as a powerful tool for design of engineered material systems.

optimization, optimization problem, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

1907.02577

Country:

North America > United States (0.28)
Europe > Germany (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Materials (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Multimodal Uncertainty Reduction for Intention Recognition in Human-Robot Interaction

Trick, Susanne, Koert, Dorothea, Peters, Jan, Rothkopf, Constantin

arXiv.org Machine LearningJul-4-2019

Assistive robots can potentially improve the quality of life and personal independence of elderly people by supporting everyday life activities. To guarantee a safe and intuitive interaction between human and robot, human intentions need to be recognized automatically. As humans communicate their intentions multimodally, the use of multiple modalities for intention recognition may not just increase the robustness against failure of individual modalities but especially reduce the uncertainty about the intention to be predicted. This is desirable as particularly in direct interaction between robots and potentially vulnerable humans a minimal uncertainty about the situation as well as knowledge about this actual uncertainty is necessary. Thus, in contrast to existing methods, in this work a new approach for multimodal intention recognition is introduced that focuses on uncertainty reduction through classifier fusion. For the four considered modalities speech, gestures, gaze directions and scene objects individual intention classifiers are trained, all of which output a probability distribution over all possible intentions. By combining these output distributions using the Bayesian method Independent Opinion Pool the uncertainty about the intention to be recognized can be decreased. The approach is evaluated in a collaborative human-robot interaction task with a 7-DoF robot arm. The results show that fused classifiers which combine multiple modalities outperform the respective individual base classifiers with respect to increased accuracy, robustness, and reduced uncertainty.

artificial intelligence, classifier, machine learning, (19 more...)

arXiv.org Machine Learning

1907.02426

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.62)

Add feedback

Probabilistic CCA with Implicit Distributions

Shi, Yaxin, Pan, Yuangang, Xu, Donna, Tsang, Ivor

arXiv.org Machine LearningJul-4-2019

Canonical Correlation Analysis (CCA) is a classic technique for multi-view data analysis. To overcome the deficiency of linear correlation in practical multi-view learning tasks, various CCA variants were proposed to capture nonlinear dependency. However, it is non-trivial to have an in-principle understanding of these variants due to their inherent restrictive assumption on the data and latent code distributions. Although some works have studied probabilistic interpretation for CCA, these models still require the explicit form of the distributions to achieve a tractable solution for the inference. In this work, we study probabilistic interpretation for CCA based on implicit distributions. We present Conditional Mutual Information (CMI) as a new criterion for CCA to consider both linear and nonlinear dependency for arbitrarily distributed data. To eliminate direct estimation for CMI, in which explicit form of the distributions is still required, we derive an objective which can provide an estimation for CMI with efficient inference methods. To facilitate Bayesian inference of multi-view analysis, we propose Adversarial CCA (ACCA), which achieves consistent encoding for multi-view data with the consistent constraint imposed on the marginalization of the implicit posteriors. Such a model would achieve superiority in the alignment of the multi-view data with implicit distributions. It is interesting to note that most of the existing CCA variants can be connected with our proposed CCA model by assigning specific form for the posterior and likelihood distributions. Extensive experiments on nonlinear correlation analysis and cross-view generation on benchmark and real-world datasets demonstrate the superiority of our model.

artificial intelligence, cca, machine learning, (18 more...)

arXiv.org Machine Learning

1907.02345

Country: Oceania > Australia (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback