 Bayesian Learning


Bayesian AutoEncoder: Generation of Bayesian Networks with Hidden Nodes for Features

AAAI Conferences

We propose Bayesian AutoEncoder (BAE) in order to construct a recognition system which uses feedback information. BAE constructs a generative model of input data as a Bayes Net. The network trained by BAE obtains its hidden variables as the features of given data. It can execute inference for each variable through belief propagation, using both feedforward and feedback information. We confirmed that BAE can construct small networks with one hidden layer and extract features as hidden variables from 3x3 and 5x5 pixel input data.
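The combination of feedforward and feedback inference described above can be illustrated on the smallest possible case: one hidden node with binary pixels as its children. On such a star-shaped network, sum-product message passing reduces to Bayes' rule. The sketch below is only an illustration under that assumption; it is not the BAE training procedure, and the names `posterior_hidden`, `predict_pixel`, `PRIOR`, and `LIKELIHOOD` as well as all probabilities are hypothetical.

```python
from math import prod

def posterior_hidden(prior, likelihood, pixels):
    """Feedforward step: P(h | observed pixels) for one hidden node.

    prior[h]         : P(h)
    likelihood[h][i] : P(pixel_i = 1 | h)
    pixels           : dict {pixel index: 0 or 1}; unobserved pixels are omitted
    """
    scores = []
    for h, p_h in enumerate(prior):
        # Each observed pixel sends one message to the hidden node.
        terms = [likelihood[h][i] if v == 1 else 1.0 - likelihood[h][i]
                 for i, v in pixels.items()]
        scores.append(p_h * prod(terms))
    z = sum(scores)
    return [s / z for s in scores]

def predict_pixel(prior, likelihood, pixels, target):
    """Feedback step: P(pixel_target = 1 | the other observed pixels)."""
    post = posterior_hidden(prior, likelihood, pixels)
    return sum(post[h] * likelihood[h][target] for h in range(len(prior)))

# Hypothetical 3-pixel model: hidden state 0 favours "on" pixels, state 1 "off".
PRIOR = [0.5, 0.5]
LIKELIHOOD = [[0.9, 0.9, 0.9], [0.1, 0.1, 0.1]]
```

Calling `predict_pixel(PRIOR, LIKELIHOOD, {0: 1, 1: 1}, 2)` uses the two observed pixels to infer the hidden state and then feeds that belief back to the unobserved third pixel.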


Multivariate Conditional Outlier Detection and Its Clinical Application

AAAI Conferences

Over the past decades, the quality of healthcare and its improvement have been the centerpieces of many public programs and initiatives. Recent studies on patient safety, however, revealed that preventable medical errors are more widespread than initially thought, which are now estimated to be one of the leading causes of death (James 2013).

In the first fold, our key objective is to accurately and efficiently learn a compact representation of complex clinical records. For clinical data, this is particularly challenging because each record may contain hundreds to thousands ...


Shortest Path Based Decision Making Using Probabilistic Inference

AAAI Conferences

We present a new perspective on the classical shortest path routing (SPR) problem in graphs. We show that the SPR problem can be recast to that of probabilistic inference in a mixture of simple Bayesian networks. Maximizing the likelihood in this mixture becomes equivalent to solving the SPR problem. We develop the well-known Expectation-Maximization (EM) algorithm for the SPR problem that maximizes the likelihood, and show that it does not get stuck in a locally optimal solution. Using the same probabilistic framework, we then address an NP-Hard network design problem where the goal is to repair a network of roads after a disaster within a fixed budget such that the connectivity between a set of nodes is optimized. We show that our likelihood maximization approach using the EM algorithm scales well for this problem, taking the form of message-passing among nodes of the graph, and provides significantly better quality solutions than a standard mixed-integer programming solver.
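One way to see why a shortest path can be read as a maximum-likelihood solution: if each edge is given the weight exp(-cost), then minimizing the summed cost of a path is the same as maximizing the product of its edge weights. The sketch below assumes that weighting (it is not taken from the abstract) and uses plain Bellman-Ford relaxation, which passes min-sum messages along edges. It is not the paper's mixture model or its EM algorithm, and the graph `EDGES` is hypothetical.

```python
import math

def min_sum_shortest_path(edges, source, num_nodes):
    """Bellman-Ford relaxation, viewed as min-sum message passing.

    edges: list of directed (u, v, cost) triples with non-negative cost.
    Returns the list of shortest-path costs from `source` to every node.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, cost in edges:
            # Node u "sends" its current best cost plus the edge cost to v.
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                updated = True
        if not updated:
            break
    return dist

def path_likelihood(total_cost):
    """Assumed path weight: product of exp(-edge cost) = exp(-total cost)."""
    return math.exp(-total_cost)

# Hypothetical 4-node graph.
EDGES = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0), (2, 3, 1.0)]
```

Because `path_likelihood` is strictly decreasing in the total cost, the path with the smallest cost is also the path with the largest weight.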


Modeling Human Understanding of Complex Intentional Action with a Bayesian Nonparametric Subgoal Model

AAAI Conferences

Most human behaviors consist of multiple parts, steps, or subtasks. These structures guide our action planning and execution, but when we observe others, the latent structure of their actions is typically unobservable, and must be inferred in order to learn new skills by demonstration, or to assist others in completing their tasks. For example, an assistant who has learned the subgoal structure of a colleague’s task can more rapidly recognize and support their actions as they unfold. Here we model how humans infer subgoals from observations of complex action sequences using a nonparametric Bayesian model, which assumes that observed actions are generated by approximately rational planning over unknown subgoal sequences. We test this model with a behavioral experiment in which humans observed different series of goal-directed actions, and inferred both the number and composition of the subgoal sequences associated with each goal. The Bayesian model predicts human subgoal inferences with high accuracy, and significantly better than several alternative models and straightforward heuristics. Motivated by this result, we simulate how learning and inference of subgoals can improve performance in an artificial user assistance task. The Bayesian model learns the correct subgoals from fewer observations, and better assists users by more rapidly and accurately inferring the goal of their actions than alternative approaches.


Large Scale Similarity Learning Using Similar Pairs for Person Verification

AAAI Conferences

In this paper, we propose a novel similarity measure and then introduce an efficient strategy to learn it by using only similar pairs for person verification. Unlike existing metric learning methods, we consider both the difference and commonness of an image pair to increase its discriminativeness. Under a pair-constrained Gaussian assumption, we show how to obtain the Gaussian priors (i.e., corresponding covariance matrices) of dissimilar pairs from those of similar pairs. The application of a log likelihood ratio makes the learning process simple and fast and thus scalable to large datasets. Additionally, our method is able to handle heterogeneous data well. Results on the challenging datasets of face verification (LFW and PubFig) and person re-identification (VIPeR) show that our algorithm outperforms the state-of-the-art methods.


Separators and Adjustment Sets in Markov Equivalent DAGs

AAAI Conferences

In practice the vast majority of causal effect estimations from observational data are computed using adjustment sets which avoid confounding by adjusting for appropriate covariates. Recently several graphical criteria for selecting adjustment sets have been proposed. They handle causal directed acyclic graphs (DAGs) as well as more general types of graphs that represent Markov equivalence classes of DAGs, including completed partially directed acyclic graphs (CPDAGs). Though expressed in graphical language, it is not obvious how the criteria can be used to obtain effective algorithms for finding adjustment sets. In this paper we provide a new criterion which leads to an efficient algorithmic framework to find, test and enumerate covariate adjustments for chain graphs - mixed graphs representing in a compact way a broad range of Markov equivalence classes of DAGs.
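For orientation, the quantity that a valid adjustment set makes computable is the standard covariate adjustment formula. The symbols below (exposure X, outcome Y, adjustment set Z) are notation assumed here and do not appear in the abstract:

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

The graphical criteria mentioned above decide which sets Z make this equality hold; the formula itself says nothing about how to find such a set.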


Learning Ensembles of Cutset Networks

AAAI Conferences

Cutset networks — OR (decision) trees that have Bayesian networks whose treewidth is bounded by one at each leaf — are a new class of tractable probabilistic models that admit fast, polynomial-time inference and learning algorithms. This is unlike other state-of-the-art tractable models such as thin junction trees, arithmetic circuits and sum-product networks in which inference is fast and efficient but learning can be notoriously slow. In this paper, we take advantage of this unique property to develop fast algorithms for learning ensembles of cutset networks. Specifically, we consider generalized additive mixtures of cutset networks and develop sequential boosting-based and parallel bagging-based approaches for learning them from data. We demonstrate, via a thorough experimental evaluation, that our new algorithms are superior to competing approaches in terms of test-set log-likelihood score and learning time.
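The structure described above (an OR tree whose leaves are tree-structured Bayesian networks) can be made concrete with a tiny evaluator. This is a minimal sketch of how a full assignment might be scored, not the learning algorithms of the paper; the dictionary layout, the function names, and the example `NETWORK` with all its numbers are hypothetical.

```python
def tree_bn_prob(leaf, assignment):
    """Probability of the leaf variables under a tree Bayesian network.

    leaf: dict var -> (parent or None, cpt)
      root cpt  : [P(var=0), P(var=1)]
      child cpt : cpt[parent_value][var_value]
    """
    p = 1.0
    for var, (parent, cpt) in leaf.items():
        if parent is None:
            p *= cpt[assignment[var]]
        else:
            p *= cpt[assignment[parent]][assignment[var]]
    return p

def cutset_prob(node, assignment):
    """Score a full assignment: follow one OR branch, then evaluate the leaf."""
    if node["type"] == "leaf":
        return tree_bn_prob(node["bn"], assignment)
    value = assignment[node["var"]]
    return node["weights"][value] * cutset_prob(node["children"][value], assignment)

# Hypothetical network: condition on A, then a chain B -> C at each leaf.
NETWORK = {
    "type": "or", "var": "A", "weights": [0.3, 0.7],
    "children": [
        {"type": "leaf", "bn": {"B": (None, [0.6, 0.4]),
                                "C": ("B", [[0.9, 0.1], [0.2, 0.8]])}},
        {"type": "leaf", "bn": {"B": (None, [0.5, 0.5]),
                                "C": ("B", [[0.5, 0.5], [0.3, 0.7]])}},
    ],
}
```

Scoring one assignment touches a single root-to-leaf path and one tree network, which is why evaluation stays cheap in this kind of model.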


Learning Bayesian Networks with Bounded Tree-width via Guided Search

AAAI Conferences

Bounding the tree-width of a Bayesian network can reduce the chance of overfitting, and allows exact inference to be performed efficiently. Several existing algorithms tackle the problem of learning bounded tree-width Bayesian networks by learning from k-trees as super-structures, but they do not scale to large domains and/or large tree-width. We propose a guided search algorithm to find k-trees with the maximum Informative score, which is a measure of quality for the k-tree in yielding good Bayesian networks. The algorithm achieves close to optimal performance compared to exact solutions in small domains, and can discover better networks than existing approximate methods can in large domains. It also provides an optimal elimination order of variables that guarantees small complexity for later runs of exact inference. Comparisons with well-known approaches in terms of learning and inference accuracy illustrate its capabilities.
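The link between an elimination order and inference cost can be checked directly: eliminating a variable connects its remaining neighbours, and the largest neighbourhood met along the way (the induced width) governs the size of the largest intermediate factor. The sketch below computes that width for a given order on an undirected graph, for example the moral graph of a Bayesian network. It is a generic illustration, not the guided search of the paper; the function name and example graphs are hypothetical.

```python
def induced_width(adjacency, order):
    """Width induced by eliminating nodes in the given order.

    adjacency: dict node -> set of neighbours (undirected graph)
    order    : list containing every node exactly once
    """
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}  # work on a copy
    width = 0
    for v in order:
        nbrs = graph.pop(v)
        width = max(width, len(nbrs))
        for a in nbrs:
            graph[a].discard(v)
            graph[a] |= nbrs - {a}  # fill-in edges among remaining neighbours
    return width

# Hypothetical graphs: a chain a-b-c-d and a 4-cycle a-b-c-d-a.
CHAIN = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
CYCLE = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
```

On `CHAIN`, eliminating from one end gives width 1, while starting in the middle gives width 2; the tree-width is the minimum of this quantity over all orders.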


Closed-Form Gibbs Sampling for Graphical Models with Algebraic Constraints

AAAI Conferences

Probabilistic inference in many real-world problems requires graphical models with deterministic algebraic constraints between random variables (e.g., Newtonian mechanics, Pascal’s law, Ohm’s law) that are known to be problematic for many inference methods such as Monte Carlo sampling. Fortunately, when such constraints are invertible, the model can be collapsed and the constraints eliminated through the well-known Jacobian-based change of variables. As our first contribution in this work, we show that a much broader class of algebraic constraints can be collapsed by leveraging the properties of a Dirac delta model of deterministic constraints. Unfortunately, the collapsing process can lead to highly piecewise densities that pose challenges for existing probabilistic inference tools. Thus, our second contribution to address these challenges is to present a variation of Gibbs sampling that efficiently samples from these piecewise densities. The key insight to achieve this is to introduce a class of functions that (1) is sufficiently rich to approximate arbitrary models up to arbitrary precision, (2) is closed under dimension reduction (collapsing) for models with (non)linear algebraic constraints and (3) always permits one analytical integral sufficient to automatically derive closed-form conditionals for Gibbs sampling. Experiments demonstrate the proposed sampler converges at least an order of magnitude faster than existing Monte Carlo samplers.
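The two ways of handling a deterministic constraint mentioned above can be written side by side. The notation (a constraint z = f(x) with densities p_X and p_Z) is assumed here for illustration and is not taken from the abstract. For an invertible, differentiable f, the Jacobian-based change of variables gives the first line; the Dirac delta form in the second line does not require f to be invertible:

```latex
p_Z(z) \;=\; p_X\bigl(f^{-1}(z)\bigr)\,
        \left|\det \frac{\partial f^{-1}(z)}{\partial z}\right|

p_Z(z) \;=\; \int p_X(x)\,\delta\bigl(z - f(x)\bigr)\,dx
```

When f is invertible the two expressions agree, which is the sense in which the delta formulation extends the Jacobian rule to a broader class of constraints.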


On Learning Causal Models from Relational Data

AAAI Conferences

Many applications call for learning causal models from relational data. We investigate Relational Causal Models (RCM) under relational counterparts of adjacency-faithfulness and orientation-faithfulness, yielding a simple approach to identifying a subset of relational d-separation queries needed for determining the structure of an RCM using d-separation against an unrolled DAG representation of the RCM. We provide original theoretical analysis that offers the basis of a sound and efficient algorithm for learning the structure of an RCM from relational data. We describe RCD-Light, a sound and efficient constraint-based algorithm that is guaranteed to yield a correct partially-directed RCM structure with at least as many edges oriented as in that produced by RCD, the only other existing algorithm for learning RCM. We show that unlike RCD, which requires exponential time and space, RCD-Light requires only polynomial time and space to orient the dependencies of a sparse RCM.
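Since the approach above reduces relational queries to d-separation on an unrolled DAG, it may help to see what an ordinary d-separation test looks like. The sketch below uses the textbook procedure (restrict to ancestors, moralize, delete the conditioning set, test reachability) on a plain DAG. It does not implement relational d-separation, RCD, or RCD-Light; the function name and the example graphs are hypothetical.

```python
def d_separated(parents, xs, ys, zs):
    """True if node sets xs and ys are d-separated given zs.

    parents: dict node -> set of parents (every node appears as a key)
    xs, ys, zs: pairwise disjoint sets of nodes
    """
    # Step 1: keep only the queried nodes and their ancestors.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents[node])
    # Step 2: moralize (link each child to its parents, and co-parents together).
    neighbours = {node: set() for node in relevant}
    for child in relevant:
        group = list(parents[child])
        for i, p in enumerate(group):
            neighbours[p].add(child)
            neighbours[child].add(p)
            for q in group[i + 1:]:
                neighbours[p].add(q)
                neighbours[q].add(p)
    # Step 3: search from xs without passing through the conditioning set.
    frontier, seen = list(xs), set(xs)
    while frontier:
        node = frontier.pop()
        if node in ys:
            return False
        for nxt in neighbours[node]:
            if nxt not in seen and nxt not in zs:
                seen.add(nxt)
                frontier.append(nxt)
    return True

# Hypothetical DAGs: a chain A -> B -> C, and a collider A -> C <- B with C -> D.
CHAIN = {"A": set(), "B": {"A"}, "C": {"B"}}
COLLIDER = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
```

In `COLLIDER`, A and B are separated with an empty conditioning set but become dependent once C or its descendant D is conditioned on.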