AITopics | Bayesian Inference

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from observed effects when what it is given are the probabilities of the effects given the causes, together with the prior probabilities of the causes.
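
As a minimal sketch of this inversion, the Python snippet below computes P(cause | effect) from assumed priors P(cause) and likelihoods P(effect | cause). The function name `posterior` and the diagnostic-test numbers are illustrative assumptions, not taken from this page.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for every cause via Bayes' theorem.

    prior[c]      = P(c)            (assumed known)
    likelihood[c] = P(effect | c)   (assumed known)
    """
    # Joint probability P(effect, c) = P(effect | c) * P(c)
    joint = {c: prior[c] * likelihood[c] for c in prior}
    # Evidence P(effect) is the sum of the joint over all causes
    evidence = sum(joint.values())
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical numbers: a rare condition and an imperfect test.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.95, "healthy": 0.05}  # P(positive test | cause)

post = posterior(prior, likelihood)
print(post)
```

With these assumed numbers the posterior probability of the rare cause stays well below one half even after a positive test, because the prior is so small.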

A Primer on Domain Adaptation

Lemberger, Pirmin, Panico, Ivan

arXiv.org Machine Learning, Jan-27-2020

Standard supervised machine learning assumes that the distribution of the source samples used to train an algorithm is the same as that of the target samples on which it is supposed to make predictions. However, as any data scientist will confirm, this is hardly ever the case in practice. The set of statistical and numerical methods that deal with such situations is known as domain adaptation, a field with a long and rich history. The myriad of methods available and the unfortunate lack of a clear and universally accepted terminology can, however, make the topic rather daunting for the newcomer. Therefore, rather than aiming at completeness, which leads to exhibiting a tedious catalog of methods, this pedagogical review aims at a coherent presentation of four important special cases: (1) prior shift, a situation in which training samples were selected according to their labels without any knowledge of their actual distribution in the target, (2) covariate shift, which deals with a situation where training examples were picked according to their features but with some selection bias, (3) concept shift, where the dependence of the labels on the features differs between the source and the target, and last but not least (4) subspace mapping, which deals with a situation where features in the target have been subjected to an unknown distortion with respect to the source features. In each case we first build an intuition, next we provide the appropriate mathematical framework, and finally we describe a practical application.

algorithm, application, target domain, (14 more...)

arXiv:2001.09994

Country:

North America > United States > Wisconsin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
(3 more...)

Bayesian nonparametric shared multi-sequence time series segmentation

Mikheeva, Olga, Kazlauskaite, Ieva, Kjellström, Hedvig, Ek, Carl Henrik

arXiv.org Machine Learning, Jan-27-2020

In this paper, we introduce a method for segmenting time series data using tools from Bayesian nonparametrics. We consider the task of temporal segmentation of a set of time series data into representative stationary segments. We use Gaussian process (GP) priors to impose our knowledge about the characteristics of the underlying stationary segments, and use a nonparametric distribution to partition the sequences into such segments, formulated in terms of a prior distribution on segment length. Given the segmentation, the model can be viewed as a variant of a Gaussian mixture model where the mixture components are described using the covariance function of a GP. We demonstrate the effectiveness of our model on synthetic data as well as on real time-series data of heartbeats where the task is to segment the indicative types of beats and to classify the heartbeat recordings into classes that correspond to healthy and abnormal heart sounds.

probability, segmentation, sequence, (16 more...)

arXiv:2001.09886

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > United Kingdom > England > Somerset > Bath (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Feature selection in machine learning: Rényi min-entropy vs Shannon entropy

Palamidessi, Catuscia, Romanelli, Marco

arXiv.org Machine Learning, Jan-27-2020

Feature selection, in the context of machine learning, is the process of separating the highly predictive features from those that might be irrelevant or redundant. Information theory has been recognized as a useful concept for this task, as the prediction power stems from the correlation, i.e., the mutual information, between features and labels. Many algorithms for feature selection in the literature have adopted the Shannon-entropy-based mutual information. In this paper, we explore the possibility of using Rényi min-entropy instead. In particular, we propose an algorithm based on a notion of conditional Rényi min-entropy that has been recently adopted in the field of security and privacy, and which is strictly related to the Bayes error. We prove that in general the two approaches are incomparable, in the sense that we can construct datasets on which the Rényi-based algorithm performs better than the corresponding Shannon-based one, and datasets on which the situation is reversed. In practice, however, when considering datasets of real data, it seems that the Rényi-based algorithm tends to outperform the other one. We have carried out several experiments on the BASEHOCK, SEMEION, and GISETTE datasets, and in all of them we have indeed observed that the Rényi-based algorithm gives better results.
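
To make the two entropy notions concrete, here is a small Python sketch. It assumes the standard definitions of Shannon entropy and Rényi min-entropy, and for the conditional case it assumes the form commonly used in quantitative information flow, H_inf(X | Y) = -log2 sum_y max_x P(x, y); the abstract does not spell out which definition the paper uses. The toy joint table and all function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """H(X) = -sum_x p(x) log2 p(x), in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def min_entropy(p):
    """Renyi min-entropy H_inf(X) = -log2 max_x p(x), in bits."""
    return -math.log2(max(p))


def bayes_error(joint):
    """Error of the best guess of the label given the feature.

    joint[y][x] = P(feature = y, label = x); each row is one feature value.
    """
    return 1.0 - sum(max(row) for row in joint)


def conditional_min_entropy(joint):
    """Assumed definition: -log2 of the Bayes success probability."""
    return -math.log2(sum(max(row) for row in joint))


# Toy joint distribution: binary feature (rows) and binary label (columns).
joint = [[0.4, 0.1],
         [0.1, 0.4]]
label_marginal = [sum(col) for col in zip(*joint)]

print(shannon_entropy(label_marginal), min_entropy(label_marginal))
print(bayes_error(joint), conditional_min_entropy(joint))
```

Under this assumed definition the conditional min-entropy is a monotone function of the Bayes error, which is one way to read the connection the abstract mentions.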

entropy, feature selection, selection, (15 more...)

arXiv:2001.09654

Country:

Europe > Italy (0.04)
Europe > France (0.04)

Genre:

Workflow (0.46)
Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Heterogeneous Learning from Demonstration

Paleja, Rohan, Gombolay, Matthew

arXiv.org Machine Learning, Jan-26-2020

The development of human-robot systems able to leverage the strengths of both humans and their robotic counterparts has been greatly sought after because of the foreseen, broad-ranging impact across industry and research. We believe the true potential of these systems cannot be reached unless the robot is able to act with a high level of autonomy, reducing the burden of manual tasking or teleoperation. To achieve this level of autonomy, robots must be able to work fluidly with their human partners, inferring their needs without explicit commands. This inference requires the robot to be able to detect and classify the heterogeneity of its partners. We propose a framework for learning from heterogeneous demonstration based upon Bayesian inference and evaluate a suite of approaches on a real-world dataset of gameplay from StarCraft II. This evaluation provides evidence that our Bayesian approach can outperform conventional methods by up to 12.8%. Index Terms: Learning from Demonstration; Human-Robot Interaction; Human-Robot Teaming; Deep Learning.

learning, robot, surgeon, (16 more...)

doi: 10.1109/HRI.2019.8673267

arXiv:2001.09569

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.36)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)

Estimating Aggregate Properties In Relational Networks With Unobserved Data

Embar, Varun, Srinivasan, Sriram, Getoor, Lise

arXiv.org Machine Learning, Jan-26-2020

Aggregate network properties such as cluster cohesion and the number of bridge nodes can be used to glean insights about a network's community structure, the spread of influence, and the resilience of the network to faults. Efficiently computing network properties when the network is fully observed has received significant attention (Wasserman and Faust 1994; Cook and Holder 2006); however, the problem of computing aggregate network properties when attributes are missing has received little attention. Computing these properties for networks with missing attributes involves performing inference over the network. Statistical relational learning (SRL) and graph neural networks (GNNs) are two classes of machine learning approaches well suited for inferring missing attributes in a graph. In this paper, we study the effectiveness of these approaches in estimating aggregate properties on networks with missing attributes. We compare two SRL approaches and three GNNs. For these approaches we estimate these properties using point estimates such as MAP and mean. For SRL-based approaches that can infer a joint distribution over the missing attributes, we also estimate these properties as an expectation over the distribution. To compute the expectation tractably for probabilistic soft logic, one of the SRL approaches that we study, we introduce a novel sampling framework. In the experimental evaluation, using three benchmark datasets, we show that SRL-based approaches tend to outperform GNN-based approaches both in computing aggregate properties and in predictive accuracy. Specifically, we show that estimating the aggregate properties as an expectation over the joint distribution outperforms point estimates.

neural network, node, representation, (17 more...)

arXiv:2001.05617

Country:

North America > United States (0.14)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Particle-Gibbs Sampling For Bayesian Feature Allocation Models

Bouchard-Côté, Alexandre, Roth, Andrew

arXiv.org Machine Learning, Jan-25-2020

Bayesian feature allocation models are a popular tool for modelling data with a combinatorial latent structure. Exact inference in these models is generally intractable, and so practitioners typically apply Markov Chain Monte Carlo (MCMC) methods for posterior inference. The most widely used MCMC strategies rely on an element-wise Gibbs update of the feature allocation matrix. These element-wise updates can be inefficient, as features are typically strongly correlated. To overcome this problem we have developed a Gibbs sampler that can update an entire row of the feature allocation matrix in a single move. However, this sampler is impractical for models with a large number of features, as the computational complexity scales exponentially in the number of features. We develop a Particle Gibbs (PG) sampler that targets the same distribution as the row-wise Gibbs updates, but has computational complexity that only grows linearly in the number of features. We compare the performance of our proposed methods to the standard Gibbs sampler using synthetic data from a range of feature allocation models. Our results suggest that row-wise updates using the PG methodology can significantly improve the performance of samplers for feature allocation models.

algorithm, particle, sampler, (16 more...)

arXiv:2001.09367

Country:

Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Sparse Semi-supervised Heterogeneous Interbattery Bayesian Analysis

Sevilla-Salcedo, Carlos, Gómez-Verdejo, Vanessa, Olmos, Pablo M.

arXiv.org Machine Learning, Jan-24-2020

The Bayesian approach to feature extraction, known as factor analysis (FA), has been widely studied in machine learning to obtain a latent representation of the data. An adequate selection of the probabilities and priors of these Bayesian models allows the model to better adapt to the nature of the data (e.g., heterogeneity, sparsity), obtaining a more representative latent space. The objective of this article is to propose a general FA framework capable of modelling any problem. To do so, we start from the Bayesian Inter-Battery Factor Analysis (BIBFA) model, enhancing it with new functionalities to be able to work with heterogeneous data, include feature selection, and handle missing values as well as semi-supervised problems. The performance of the proposed model, Sparse Semi-supervised Heterogeneous Interbattery Bayesian Analysis (SSHIBA), has been tested on four different scenarios to evaluate each one of its novelties, showing not only great versatility and an interpretability gain, but also outperforming most of the state-of-the-art algorithms.

algorithm, database, sshiba, (16 more...)

arXiv:2001.08975

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Europe > United Kingdom > Scotland (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (0.67)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Deep Bayesian Network for Visual Question Generation

Patro, Badri N., Kurmi, Vinod K., Kumar, Sandeep, Namboodiri, Vinay P.

arXiv.org Artificial Intelligence, Jan-23-2020

Generating natural questions from an image is a semantic task that requires using vision and language modalities to learn multimodal representations. Images can have multiple visual and language cues such as places, captions, and tags. In this paper, we propose a principled deep Bayesian learning framework that combines these cues to produce natural questions. We observe that with the addition of more cues and by minimizing uncertainty among the cues, the Bayesian network becomes more confident. We propose Minimizing Uncertainty of Mixture of Cues (MUMC), a method that minimizes the uncertainty present in a mixture of cue experts for generating probabilistic questions. This is a Bayesian framework, and the results show a remarkable similarity to natural questions as validated by a human study. Ablation studies of our model indicate that a subset of cues is inferior at this task and hence the principled fusion of cues is preferred. Further, we observe that the proposed approach substantially improves over state-of-the-art benchmarks on the quantitative metrics (BLEU-n, METEOR, ROUGE, and CIDEr). The project page for Deep Bayesian VQG is available at https://delta-lab-iitk.github.io/BVQG/

computer vision, natural question, proceedings, (12 more...)

arXiv:2001.08779

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning Distributional Programs for Relational Autocompletion

Kumar, Nitesh, Kuzelka, Ondrej, De Raedt, Luc

arXiv.org Artificial Intelligence, Jan-23-2020

Relational autocompletion is the problem of automatically filling out some missing fields in a relational database. We tackle this problem within the probabilistic logic programming framework of Distributional Clauses (DC), which supports both discrete and continuous probability distributions. Within this framework, we introduce Dreaml, an approach for learning both the structure and the parameters of DC programs from databases that may contain missing information. To realize this, Dreaml integrates statistical modeling and distributional clauses with rule learning. The distinguishing features of Dreaml are that it 1) tackles relational autocompletion, 2) learns distributional clauses extended with statistical models, 3) deals with both discrete and continuous distributions, 4) can exploit background knowledge, and 5) uses an expectation-maximization based algorithm to cope with missing data.

background knowledge, database, distributional clause, (15 more...)

arXiv:2001.08603

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > California > Alameda County > Livermore (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Community Detection in Bipartite Networks with Stochastic Blockmodels

Yen, Tzu-Chi, Larremore, Daniel B.

arXiv.org Machine Learning, Jan-22-2020

In bipartite networks, community structures are restricted to being disassortative, in that nodes of one type are grouped according to common patterns of connection with nodes of the other type. This makes the stochastic block model (SBM), a highly flexible generative model for networks with block structure, an intuitive choice for bipartite community detection. However, typical formulations of the SBM do not make use of the special structure of bipartite networks. In this work, we introduce a Bayesian nonparametric formulation of the SBM and a corresponding algorithm to efficiently find communities in bipartite networks without overfitting. The biSBM improves community detection results over general SBMs when data are noisy, improves the model resolution limit by a factor of √2, and expands our understanding of the complicated optimization landscape associated with community detection tasks. A direct comparison of certain terms of the prior distributions in the biSBM and a related high-resolution hierarchical SBM also reveals a counterintuitive regime of community detection problems, populated by smaller and sparser networks, where non-hierarchical models outperform their more flexible counterpart.

algorithm, node, partition, (15 more...)

arXiv:2001.11818

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
(2 more...)
