AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Online Structural Change-point Detection of High-dimensional Streaming Data via Dynamic Sparse Subspace Learning

Xu, Ruiyu, Wu, Jianguo, Yue, Xiaowei, Li, Yongxiang

arXiv.org Machine LearningSep-29-2020

High-dimensional streaming data are becoming increasingly ubiquitous in many fields. They often lie in multiple low-dimensional subspaces, and the manifold structures may change abruptly on the time scale due to pattern shift or occurrence of anomalies. However, the problem of detecting the structural changes in a real-time manner has not been well studied. To fill this gap, we propose a dynamic sparse subspace learning (DSSL) approach for online structural change-point detection of high-dimensional streaming data. A novel multiple structural change-point model is proposed and it is shown to be equivalent to maximizing a posterior under certain conditions. The asymptotic properties of the estimators are investigated. The penalty coefficients in our model can be selected by AMDL criterion based on some historical data. An efficient Pruned Exact Linear Time (PELT) based method is proposed for online optimization and change-point detection. The effectiveness of the proposed method is demonstrated through a simulation study and a real case study using gesture data for motion tracking.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2009.11713

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Communications > Networks (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Quantile Surfaces -- Generalizing Quantile Regression to Multivariate Targets

Bieshaar, Maarten, Schreiber, Jens, Vogt, Stephan, Gensler, André, Sick, Bernhard

arXiv.org Artificial IntelligenceSep-29-2020

In this article, we present a novel approach to multivariate probabilistic forecasting. Our approach is based on an extension of single-output quantile regression (QR) to multivariate-targets, called quantile surfaces (QS). QS uses a simple yet compelling idea of indexing observations of a probabilistic forecast through direction and vector length to estimate a central tendency. We extend the single-output QR technique to multivariate probabilistic targets. QS efficiently models dependencies in multivariate target variables and represents probability distributions through discrete quantile levels. Therefore, we present a novel two-stage process. In the first stage, we perform a deterministic point forecast (i.e., central tendency estimation). Subsequently, we model the prediction uncertainty using QS involving neural networks called quantile surface regression neural networks (QSNN). Additionally, we introduce new methods for efficient and straightforward evaluation of the reliability and sharpness of the issued probabilistic QS predictions. We complement this by the directional extension of the Continuous Ranked Probability Score (CRPS) score. Finally, we evaluate our novel approach on synthetic data and two currently researched real-world challenges in two different domains: First, probabilistic forecasting for renewable energy power generation, second, short-term cyclists trajectory forecasting for autonomously driving vehicles. Especially for the latter, our empirical results show that even a simple one-layer QSNN outperforms traditional parametric multivariate forecasting techniques, thus improving the state-of-the-art performance.

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2010.05898

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(10 more...)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Energy > Power Industry (0.88)
Energy > Renewable > Wind (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(4 more...)

Add feedback

A Unifying Review of Deep and Shallow Anomaly Detection

Ruff, Lukas, Kauffmann, Jacob R., Vandermeulen, Robert A., Montavon, Grégoire, Samek, Wojciech, Kloft, Marius, Dietterich, Thomas G., Müller, Klaus-Robert

arXiv.org Artificial IntelligenceSep-28-2020

Deep learning approaches to anomaly detection have recently improved the state of the art in detection performance on complex datasets such as large collections of images or text. These results have sparked a renewed interest in the anomaly detection problem and led to the introduction of a great variety of new methods. With the emergence of numerous such methods, including approaches based on generative models, one-class classification, and reconstruction, there is a growing need to bring methods of this field into a systematic and unified perspective. In this review we aim to identify the common underlying principles as well as the assumptions that are often made implicitly by various methods. In particular, we draw connections between classic 'shallow' and novel deep approaches and show how this relation might cross-fertilize or extend both directions. We further provide an empirical assessment of major existing methods that is enriched by the use of recent explainability techniques, and present specific worked-through examples together with practical advice. Finally, we outline critical open challenges and identify specific paths for future research in anomaly detection.

anomaly detection, deep learning, law enforcement, (21 more...)

arXiv.org Artificial Intelligence

2009.11732

Country:

Europe > Germany (0.67)
North America > United States > New York (0.14)
Europe > United Kingdom > England (0.14)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Energy > Oil & Gas (1.00)
(5 more...)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)

Add feedback

CASTLE: Regularization via Auxiliary Causal Graph Discovery

Kyono, Trent, Zhang, Yao, van der Schaar, Mihaela

arXiv.org Machine LearningSep-28-2020

Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclical graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a plethora of synthetic and real publicly available datasets demonstrating that CASTLE consistently leads to better out-of-sample predictions as compared to other popular benchmark regularizers.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2009.1318

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Learning an arbitrary mixture of two multinomial logits

Tang, Wenpin

arXiv.org Machine LearningSep-28-2020

In this paper, we consider mixtures of multinomial logistic models (MNL), which are known to $\epsilon$-approximate any random utility model. Despite its long history and broad use, rigorous results are only available for learning a uniform mixture of two MNLs. Continuing this line of research, we study the problem of learning an arbitrary mixture of two MNLs. We show that the identifiability of the mixture models may only fail on an algebraic variety of a negligible measure. This is done by reducing the problem of learning a mixture of two MNLs to the problem of solving a system of univariate quartic equations. We also devise an algorithm to learn any mixture of two MNLs using a polynomial number of samples and a linear number of queries, provided that a mixture of two MNLs over some finite universe is identifiable. Several numerical experiments and conjectures are also presented.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2007.00204

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > District of Columbia > Washington (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Improved High Dimensional Discrete Bayesian Network Inference using Triplet Region Construction

Lin, Peng ( Capital University of Economics and Business) | Neil, Martin | Fenton, Norman

Journal of Artificial Intelligence ResearchSep-27-2020

Performing efficient inference on high dimensional discrete Bayesian Networks (BNs) is challenging. When using exact inference methods the space complexity can grow exponentially with the tree-width, thus making computation intractable. This paper presents a general purpose approximate inference algorithm, based on a new region belief approximation method, called Triplet Region Construction (TRC). TRC reduces the cluster space complexity for factorized models from worst-case exponential to polynomial by performing graph factorization and producing clusters of limited size. Unlike previous generations of region-based algorithms, TRC is guaranteed to converge and effectively addresses the region choice problem that bedevils other region-based algorithms used for BN inference. Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms.

artificial intelligence, interaction triplet, machine learning, (17 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12198

AI Access Foundation

12198

Journal of Artificial Intelligence Research

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(15 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Learning Optimal Representations with the Decodable Information Bottleneck

Dubois, Yann, Kiela, Douwe, Schwab, David J., Vedantam, Ramakrishna

arXiv.org Machine LearningSep-27-2020

We address the question of characterizing and finding optimal representations for supervised learning. Traditionally, this question has been tackled using the Information Bottleneck, which compresses the inputs while retaining information about the targets, in a decoder-agnostic fashion. In machine learning, however, our goal is not compression but rather generalization, which is intimately linked to the predictive family or decoder of interest (e.g. linear classifier). We propose the Decodable Information Bottleneck (DIB) that considers information retention and compression from the perspective of the desired predictive family. As a result, DIB gives rise to representations that are optimal in terms of expected test performance and can be estimated with guarantees. Empirically, we show that the framework can be used to enforce a small generalization gap on downstream classifiers and to predict the generalization ability of neural networks.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Machine Learning

2009.12789

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(25 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Add feedback

Bayesian Restoration of Audio Degraded by Low-Frequency Pulses Modeled via Gaussian Process

de Carvalho, Hugo Tremonte, Ávila, Flávio Rainho, Biscainho, Luiz Wagner Pereira

arXiv.org Machine LearningSep-26-2020

A common defect found when reproducing old vinyl and gramophone recordings with mechanical devices are the long pulses with significant low-frequency content caused by the interaction of the arm-needle system with deep scratches or even breakages on the media surface. Previous approaches to their suppression on digital counterparts of the recordings depend on a prior estimation of the pulse location, usually performed via heuristic methods. This paper proposes a novel Bayesian approach capable of jointly estimating the pulse location; interpolating the almost annihilated signal underlying the strong discontinuity that initiates the pulse; and also estimating the long pulse tail by a simple Gaussian Process, allowing its suppression from the corrupted signal. The posterior distribution for the model parameters as well for the pulse is explored via Markov-Chain Monte Carlo (MCMC) algorithms. Controlled experiments indicate that the proposed method, while requiring significantly less user intervention, achieves perceptual results similar to those of previous approaches and performs well when dealing with naturally degraded signals.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2005.14181

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > Canada > Quebec > Montreal (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

An Intuitive Tutorial to Gaussian Processes Regression

Wang, Jie

arXiv.org Machine LearningSep-25-2020

This introduction aims to provide readers an intuitive understanding of Gaussian processes regression. Gaussian processes regression (GPR) models have been widely used in machine learning applications because their representation flexibility and inherently uncertainty measures over predictions. The paper starts with explaining mathematical basics that Gaussian processes built on including multivariate normal distribution, kernels, non-parametric models, joint and conditional probability. The Gaussian processes regression is then described in an accessible way by balancing showing unnecessary mathematical derivation steps and missing key conclusive results. An illustrative implementation of a standard Gaussian processes regression algorithm is provided. Beyond the standard Gaussian processes regression, existing software packages to implement state-of-the-art Gaussian processes algorithms are reviewed. Lastly, more advanced Gaussian processes regression models are specified. The paper is written in an accessible way, thus undergraduate science and engineering background will find no difficulties in following the content.

intuitive tutorial, prediction, regression, (12 more...)

arXiv.org Machine Learning

2009.10862

Country:

North America > Canada > Ontario > Kingston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Education > Curriculum > Subject-Specific Education (0.88)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

Why have a Unified Predictive Uncertainty? Disentangling it using Deep Split Ensembles

Sarawgi, Utkarsh, Zulfikar, Wazeer, Khincha, Rishab, Maes, Pattie

arXiv.org Machine LearningSep-25-2020

Understanding and quantifying uncertainty in black box Neural Networks (NNs) is critical when deployed in real-world settings such as healthcare. Recent works using Bayesian and non-Bayesian methods have shown how a unified predictive uncertainty can be modelled for NNs. Decomposing this uncertainty to disentangle the granular sources of heteroscedasticity in data provides rich information about its underlying causes. We propose a conceptually simple non-Bayesian approach, deep split ensemble, to disentangle the predictive uncertainties using a multivariate Gaussian mixture model. The NNs are trained with clusters of input features, for uncertainty estimates per cluster. We evaluate our approach on a series of benchmark regression datasets, while also comparing with unified uncertainty methods. Extensive analyses using dataset shits and empirical rule highlight our inherently well-calibrated models. Our work further demonstrates its applicability in a multi-modal setting using a benchmark Alzheimer's dataset and also shows how deep split ensembles can highlight hidden modality-specific biases. The minimal changes required to NNs and the training procedure, and the high flexibility to group features into clusters makes it readily deployable and useful. The source code is available at https://github.com/wazeerzulfikar/deep-split-ensembles

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

2009.12406

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry:

Materials (0.94)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback