Bayesian Learning
Learning to Reject Sequential Importance Steps for Continuous-Time Bayesian Networks
Weiss, Jeremy C. (University of Wisconsin-Madison) | Natarajan, Sriraam (Indiana University) | Page, C. David (University of Wisconsin-Madison)
Applications of graphical models often require the use of approximate inference, such as sequential importance sampling (SIS), for estimation of the model distribution given partial evidence, i.e., the target distribution. However, when SIS proposal and target distributions are dissimilar, such procedures lead to biased estimates or require a prohibitive number of samples. We introduce ReBaSIS, a method that better approximates the target distribution by sampling variable by variable from existing importance samplers and accepting or rejecting each proposed assignment in the sequence: a choice made based on anticipating upcoming evidence. We relate the per-variable proposal and model distributions by expected weight ratios of sequence completions and show that we can learn accurate models of optimal acceptance probabilities from local samples. In a continuous-time domain, our method improves upon previous importance samplers by transforming an SIS problem into a machine learning one.
Loss-Calibrated Monte Carlo Action Selection
Abbasnejad, Ehsan (Australian National University and NICTA) | Domke, Justin (Australian National University and NICTA) | Sanner, Scott (Australian National University and NICTA)
Bayesian decision-theory underpins robust decision-making in applications ranging from plant control to robotics where hedging action selection against state uncertainty is critical for minimizing low probability but potentially catastrophic outcomes (e.g, uncontrollable plant conditions or robots falling into stairwells). Unfortunately, belief state distributions in such settings are often complex and/or high dimensional, thus prohibiting the efficient application of analytical techniques for expected utility computation when real-time control is required. This leaves Monte Carlo evaluation as one of the few viable (and hence frequently used) techniques for online action selection. However, loss-insensitive Monte Carlo methods may require large numbers of samples to identify optimal actions with high certainty since they may sample from highprobability regions that do not disambiguate action utilities. In this paper we remedy this problem by deriving an optimal proposal distribution for a loss-calibrated Monte Carlo importance sampler that bounds the regret of using an estimated optimal action. Empirically, we show that using our loss-calibrated Monte Carlo method yields high-accuracy optimal action selections in a fraction of the number of samples required by conventional loss-insensitive samplers.
Bayesian Networks Specified Using Propositional and Relational Constructs: Combined, Data, and Domain Complexity
Cozman, Fabio Gagliardi (Universidade de Sao Paulo) | Maua, Denis Deratani (Universidade de Sao Paulo)
We examine the inferential complexity of Bayesian networks specified through logical constructs. We first consider simple propositional languages, and then move to relational languages. We examine both the combined complexity of inference (as network size and evidence size are not bounded) and the data complexity of inference (where network size is bounded); we also examine the connection to liftability through domain complexity. Combined and data complexity of several inference problems are presented, ranging from polynomial to exponential classes.
Bayesian Model Averaging Naive Bayes (BMA-NB): Averaging over an Exponential Number of Feature Models in Linear Time
Wu, Ga (Australian National University) | Sanner, Scott (NICTA and Australian National University) | Oliveira, Rodrigo F.S.C. (University of Pernambuco)
Naive Bayes (NB) is well-known to be a simple but effective classifier, especially when combined with feature selection. Unfortunately, feature selection methods are often greedy and thus cannot guarantee an optimal feature set is selected. An alternative to feature selection is to use Bayesian model averaging (BMA), which computes a weighted average over multiple predictors; when the different predictor models correspond to different feature sets, BMA has the advantage over feature selection that its predictions tend to have lower variance on average in comparison to any single model. In this paper, we show for the first time that it is possible to exactly evaluate BMA over the exponentially-sized powerset of NB feature models in linear-time in the number of features; this yields an algorithm about as expensive to train as a single NB model with all features, but yet provably converges to the globally optimal feature subset in the asymptotic limit of data. We evaluate this novel BMA-NB classifier on a range of datasets showing that it never underperforms NB (as expected) and sometimes offers performance competitive (or superior) to classifiers such as SVMs and logistic regression while taking a fraction of the time to train.
Obtaining Well Calibrated Probabilities Using Bayesian Binning
Naeini, Mahdi Pakdaman (University of Pittsburgh) | Cooper, Gregory (University of Pittsburgh) | Hauskrecht, Milos (University of Pittsburgh)
However, model calibration and the learning is critical for many prediction and decision-making of well-calibrated probabilistic models have not been tasks in artificial intelligence. In this paper we present a new studied in the machine learning literature as extensively as nonparametric calibration method called Bayesian Binning for example discriminative machine learning models that into Quantiles (BBQ) which addresses key limitations of existing are built to achieve the best possible discrimination among calibration methods. The method post processes the classes of objects. One way to achieve a high level of model output of a binary classification algorithm; thus, it can be calibration is to develop methods for learning probabilistic readily combined with many existing classification algorithms.
A Bayesian Approach to Perceptual 3D Object-Part Decomposition Using Skeleton-Based Representations
El-Gaaly, Tarek (Rutgers University) | Froyen, Vicky (Rutgers University) | Elgammal, Ahmed (Rutgers University) | Feldman, Jacob (Rutgers University) | Singh, Manish (Rutgers University)
We present a probabilistic approach to shape decomposition that creates a skeleton-based shape representation of a 3D object while simultaneously decomposing it into constituent parts. Our approach probabilistically combines two prominent threads from the shape literature: skeleton-based (medial axis) representations of shape, and part-based representations of shape, in which shapes are combinations of primitive parts. Our approach recasts skeleton-based shape representation as a mixture estimation problem, allowing us to apply probabilistic estimation techniques to the problem of 3D shape decomposition, extending earlier work on the 2D case. The estimated 3D shape decompositions approximate human shape decomposition judgments. We present a tractable implementation of the framework, which begins by over-segmenting objects at concavities, and then probabilistically merges them to create a distribution over possible decompositions. This results in a hierarchy of decompositions at different structural scales, again closely matching known properties of human shape representation. The probabilistic estimation procedures that arise naturally in the model allow effective prediction of missing parts. We present results on shapes from a standard database illustrating the effectiveness of the approach.
An Exact Algorithm for Solving Most Relevant Explanation in Bayesian Networks
Zhu, Xiaoyuan (Queens College, City University of New York) | Yuan, Changhe (Queens College, City University of New York)
Most Relevant Explanation (MRE) is a new inference task in Bayesian networks that finds the most relevant partial instantiation of target variables as an explanation for given evidence by maximizing the Generalized Bayes Factor (GBF). No exact algorithm has been developed for solving MRE previously. This paper fills the void and introduces a breadth-first branch-and-bound MRE algorithm based on a novel upper bound on GBF. The bound is calculated by decomposing the computation of the score to a set of Markov blankets of subsets of evidence variables. Our empirical evaluations show that the proposed algorithm scales up exact MRE inference significantly.
Hierarchical Monte-Carlo Planning
Vien, Ngo Anh (University of Stuttgart) | Toussaint, Marc (University of Stuttgart)
Monte-Carlo Tree Search, especially UCT and its POMDP version POMCP, have demonstrated excellent performanceon many problems. However, to efficiently scale to large domains one should also exploit hierarchical structure if present. In such hierarchical domains, finding rewarded states typically requires to search deeply; covering enough such informative states very far from the root becomes computationally expensive in flat non-hierarchical search approaches. We propose novel, scalable MCTS methods which integrate atask hierarchy into the MCTS framework, specifically lead-ing to hierarchical versions of both, UCT and POMCP. The new method does not need to estimate probabilistic models of each subtask, it instead computes subtask policies purely sample-based. We evaluate the hierarchical MCTS methods on various settings such as a hierarchical MDP, a Bayesian model-based hierarchical RL problem, and a large hierarchical POMDP.
Just Count the Satisfied Groundings: Scalable Local-Search and Sampling Based Inference in MLNs
Venugopal, Deepak (The University of Texas at Dallas) | Sarkhel, Somdeb (The University of Texas at Dallas) | Gogate, Vibhav (The University of Texas at Dallas)
The main computational bottleneck in various sampling based and local-search based inference algorithms for Markov logic networks (e.g., Gibbs sampling, MC-SAT, MaxWalksat, etc.) is computing the number of groundings of a first-order formula that are true given a truth assignment to all of its ground atoms. We reduce this problem to the problem of counting the number of solutions of a constraint satisfaction problem (CSP) and show that during their execution, both sampling based and local-search based algorithms repeatedly solve dynamic versions of this counting problem. Deriving from the vast amount of literature on CSPs and graphical models, we propose an exact junction-tree based algorithm for computing the number of solutions of the dynamic CSP, analyze its properties, and show how it can be used to improve the computational complexity of Gibbs sampling and MaxWalksat. Empirical tests on a variety of benchmarks clearly show that our new approach is several orders of magnitude more scalable than existing approaches.
Tighter Value Function Bounds for Bayesian Reinforcement Learning
Lee, Kanghoon (KAIST) | Kim, Kee-Eung (KAIST)
Bayesian reinforcement learning (BRL) provides a principled framework for optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model based BRL, which involves a compact formulation of the optimal tradeoff from the Bayesian perspective. However, it still remains a computational challenge to compute the Bayes-optimal policy. In this paper, we propose a novel approach to compute tighter value function bounds of the Bayes-optimal value function, which is crucial for improving the performance of many model-based BRL algorithms. We then present how our bounds can be integrated into real-time AO* heuristic search, and provide a theoretical analysis on the impact of improved bounds on the search efficiency. We also provide empirical results on standard BRL domains that demonstrate the effectiveness of our approach.