
Annealed Importance Sampling for Structure Learning in Bayesian Networks

AAAI Conferences

We present a new sampling approach to Bayesian learning of the Bayesian network structure. Like some earlier sampling methods, we sample linear orders on nodes rather than directed acyclic graphs (DAGs). The key difference is that we replace the usual Markov chain Monte Carlo (MCMC) method with annealed importance sampling (AIS). We show that AIS is not only competitive with MCMC in exploring the posterior, but also superior to MCMC in two ways: it enables easy and efficient parallelization, because the samples are independent, and it yields lower bounds on the marginal likelihood of the model with good probabilistic guarantees. We also provide a principled way to correct the bias due to order-based sampling, by implementing a fast algorithm for counting the linear extensions of a given partial order.
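
To make "counting the linear extensions of a given partial order" concrete, here is a minimal Python sketch. It is a plain exponential-time dynamic program over subsets of nodes, not the fast algorithm the abstract refers to; the function name and the encoding of the partial order as `(a, b)` pairs are assumptions made for illustration.

```python
from functools import lru_cache


def count_linear_extensions(n, relations):
    """Count total orders on nodes 0..n-1 that respect a partial order.

    `relations` holds (a, b) pairs meaning "a must come before b".
    The pairs are assumed to be acyclic (illustrative encoding only).
    """
    # preds[b] is a bitmask of the nodes that must precede b.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a

    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def extensions(placed):
        # `placed` is a bitmask of the nodes already put into the order.
        if placed == full:
            return 1
        total = 0
        for v in range(n):
            already_placed = (placed >> v) & 1
            # v may come next only if all its predecessors are placed.
            if not already_placed and (preds[v] & placed) == preds[v]:
                total += extensions(placed | (1 << v))
        return total

    return extensions(0)


# Three unrelated nodes admit all 3! orders; one constraint halves that.
print(count_linear_extensions(3, []))
print(count_linear_extensions(3, [(0, 1)]))
```

A DAG that is compatible with many linear orders is reached more often when sampling orders than one compatible with few, so a count of this kind is the sort of quantity a bias correction would need. The sketch visits up to 2^n subsets, so it is only practical for small n.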


A Bayesian Factorised Covariance Model for Image Analysis

AAAI Conferences

This paper presents a specialised Bayesian model for analysing the covariance of data that are observed in the form of matrices, which is particularly suitable for images. Unlike existing general-purpose covariance learning techniques, we exploit the fact that the variables are organised as an array with two sets of ordered indexes, which induces an innate relationship between the variables. Specifically, we adopt a factorised structure for the covariance matrix: the covariance of two variables is represented by the product of the covariance of the two corresponding rows and that of the two corresponding columns. The factors, i.e. the row-wise and column-wise covariance matrices, are estimated by Bayesian inference with sparse priors. An empirical study has been conducted on image analysis. The model first learns correlations between the rows and between the columns in an image plane. The correlations between individual pixels can then be inferred from their locations. This scheme utilises the structural information of an image and benefits the analysis when the data are damaged or insufficient.
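
The factorised structure described above can be illustrated with a small Python sketch. It only shows the product rule for a pixel pair, given row and column covariance matrices that are made up for the example; it does not attempt the Bayesian estimation with sparse priors, and all names are hypothetical.

```python
def pixel_covariance(row_cov, col_cov, pixel_a, pixel_b):
    """Covariance of two pixels under the factorised (product) structure.

    Each pixel is a (row, column) index pair. The result is the covariance
    of the two rows times the covariance of the two columns.
    """
    (i, j), (k, l) = pixel_a, pixel_b
    return row_cov[i][k] * col_cov[j][l]


def full_covariance(row_cov, col_cov):
    """Expand the two factors into the full pixel-by-pixel covariance.

    Pixels are flattened in row-major order, so this equals the
    Kronecker product of row_cov and col_cov.
    """
    n_rows, n_cols = len(row_cov), len(col_cov)
    pixels = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    return [[pixel_covariance(row_cov, col_cov, a, b) for b in pixels]
            for a in pixels]


# Made-up factors for a 2 x 3 image (illustrative values only).
row_cov = [[1.0, 0.5],
           [0.5, 2.0]]
col_cov = [[1.0, 0.2, 0.0],
           [0.2, 1.0, 0.3],
           [0.0, 0.3, 1.5]]

sigma = full_covariance(row_cov, col_cov)
print(len(sigma), len(sigma[0]))  # 6 x 6 matrix built from a 2 x 2 and a 3 x 3 factor
```

For an image with r rows and c columns, the two factors have r² + c² entries in total, whereas an unstructured covariance has (rc)² entries, which is one reason the factorised form can help when data are scarce.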


Adaptive Thresholding in Structure Learning of a Bayesian Network

AAAI Conferences

Thresholding a measure in conditional independence (CI) tests using a fixed value enables learning and removing edges as part of learning a Bayesian network structure. However, the learned structure is sensitive to the threshold, which is commonly selected 1) arbitrarily; 2) irrespective of the characteristics of the domain; and 3) as a fixed value for all CI tests. We analyze the impact on mutual information, a CI measure, of factors such as sample size, degree of variable dependence, and the variables' cardinalities. Following this analysis, we propose adaptively thresholding individual tests based on these factors. We show that adaptive thresholds better distinguish between pairs of dependent variables and pairs of independent variables, and enable learning structures more accurately and quickly than fixed thresholds.
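
As a reference point for the measure being thresholded, the following Python sketch computes the empirical mutual information (in nats) of two discrete variables from a table of joint counts. It is a generic plug-in estimator, not the adaptive thresholding rule of the paper, which the abstract does not specify; the function name and input layout are assumptions.

```python
import math


def mutual_information(counts):
    """Plug-in mutual information (nats) from a 2-D table of joint counts.

    counts[i][j] is the number of samples with X = i and Y = j.
    """
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:  # empty cells contribute nothing
                mi += (c / n) * math.log(c * n / (row_totals[i] * col_totals[j]))
    return mi


# Perfectly dependent binary variables versus exactly independent ones.
print(mutual_information([[50, 0], [0, 50]]))
print(mutual_information([[25, 25], [25, 25]]))
```

One standard way to see why sample size and cardinality matter: under independence, 2N times this estimate (the G statistic) is asymptotically chi-square distributed with (|X| - 1)(|Y| - 1) degrees of freedom, so the estimate's typical value for independent variables grows with the cardinalities and shrinks with N. A single fixed cut-off ignores both effects.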


Generalized Relational Topic Models with Data Augmentation

AAAI Conferences

Relational topic models have shown promise in analyzing document network structures and discovering latent topic representations. This paper presents three extensions: 1) unlike the common link likelihood with a diagonal weight matrix, which allows same-topic interactions only, we generalize it to use a full weight matrix that captures all pairwise topic interactions and is applicable to asymmetric networks; 2) instead of doing standard Bayesian inference, we perform regularized Bayesian inference with a regularization parameter to deal with the imbalanced link structure common in real networks; and 3) instead of doing variational approximation with strict mean-field assumptions, we present a collapsed Gibbs sampling algorithm for the generalized relational topic models that makes no such restrictive assumptions. Experimental results demonstrate the significance of these extensions in improving prediction performance, and show that time efficiency can be dramatically improved with a simple fast approximation method.


An Ensemble of Bayesian Networks for Multilabel Classification

AAAI Conferences

We present a novel approach for multilabel classification based on an ensemble of Bayesian networks. The class variables are connected by a tree; each model of the ensemble uses a different class as root of the tree. We assume the features to be conditionally independent given the classes, thus generalizing the naive Bayes assumption to the multiclass case. This assumption allows us to optimally identify the correlations between classes and features; such correlations are moreover shared across all models of the ensemble. Inferences are drawn from the ensemble via logarithmic opinion pooling. To minimize Hamming loss, we compute the marginal probability of the classes by running standard inference on each Bayesian network in the ensemble, and then pooling the inferences. To instead minimize the subset 0/1 loss, we pool the joint distributions of each model and cast the problem as a MAP inference in the corresponding graphical model. Experiments show that the approach is competitive with state-of-the-art methods for multilabel classification.
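
Logarithmic opinion pooling, the combination rule named above, can be sketched in a few lines of Python. The pooled distribution is proportional to the weighted geometric mean of the member distributions. Equal weights are an assumption here (the abstract does not state the weights used), and the function name is hypothetical.

```python
import math


def log_opinion_pool(distributions, weights=None):
    """Combine discrete distributions by logarithmic opinion pooling.

    pooled(y) is proportional to the product over models m of
    p_m(y) ** w_m. All probabilities are assumed strictly positive.
    Equal weights are used when none are given (an assumption).
    """
    m = len(distributions)
    if weights is None:
        weights = [1.0 / m] * m
    n_outcomes = len(distributions[0])
    log_scores = [
        sum(w * math.log(d[y]) for w, d in zip(weights, distributions))
        for y in range(n_outcomes)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalised = [math.exp(s - top) for s in log_scores]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]


# Two models that disagree symmetrically pool to a uniform distribution.
pooled = log_opinion_pool([[0.9, 0.1], [0.1, 0.9]])
print(pooled)
```

Unlike an arithmetic average, this pool gives an outcome low probability whenever any single member gives it low probability, which is why zero probabilities must be avoided or smoothed beforehand.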


An Ambiguity Aversion Framework of Security Games under Ambiguities

AAAI Conferences

Security is a critical concern around the world. Since resources for security are always limited, much interest has arisen in using game theory to handle security resource allocation problems. However, most existing work does not adequately address how a defender chooses an optimal strategy in a game where the payoffs of strategy profiles are absent, inaccurate, uncertain, or even ambiguous. To address this issue, we propose a general framework of security games under ambiguities based on Dempster-Shafer theory and the ambiguity aversion principle of minimax regret. We then reveal some properties of this framework and present two methods to reduce the influence of complete ignorance. Our investigation shows that this new framework is better at handling security resource allocation problems under ambiguities.
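
The minimax regret principle mentioned above can be shown on a toy decision table. This Python sketch uses plain point-valued payoffs and omits the Dempster-Shafer machinery entirely, so it illustrates only the regret criterion, not the proposed framework; the payoff numbers and names are made up.

```python
def minimax_regret(payoffs):
    """Pick the strategy whose worst-case regret is smallest.

    payoffs[s][w] is the defender's payoff for strategy s in scenario w.
    Returns (best_strategy_index, its_maximum_regret).
    """
    n_scenarios = len(payoffs[0])
    # Best achievable payoff in each scenario, over all strategies.
    best_per_scenario = [max(row[w] for row in payoffs)
                         for w in range(n_scenarios)]
    # Regret of a strategy in a scenario is the shortfall from that best.
    max_regrets = [
        max(best_per_scenario[w] - row[w] for w in range(n_scenarios))
        for row in payoffs
    ]
    best = min(range(len(payoffs)), key=lambda s: max_regrets[s])
    return best, max_regrets[best]


# Hypothetical payoffs: three defender strategies, two attack scenarios.
payoffs = [
    [5, 1],  # guard target A heavily
    [3, 3],  # split resources
    [0, 6],  # guard target B heavily
]
choice, regret = minimax_regret(payoffs)
print(choice, regret)  # prints "1 3": splitting resources bounds the regret at 3
```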


An efficient model-free estimation of multiclass conditional probability

arXiv.org Machine Learning

Conventional multiclass conditional probability estimation methods, such as Fisher's discriminant analysis and logistic regression, often require restrictive distributional model assumptions. In this paper, a model-free method is proposed to estimate multiclass conditional probability through a series of conditional quantile regression functions. Specifically, the conditional class probability is formulated as the difference of the corresponding cumulative distribution functions, which can be obtained from the estimated conditional quantile regression functions. The proposed method is also efficient, as its computational cost does not increase exponentially with the number of classes. Theoretical and numerical studies demonstrate that the proposed method is highly competitive with existing methods, especially when the number of classes is relatively large.
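
The "difference of cumulative distribution functions" step can be sketched as follows, assuming classes are coded 1..K and that estimated conditional quantiles are available on a grid of levels for one fixed input x. The grid inversion is a crude stand-in chosen for illustration and is not taken from the paper; all names and numbers are hypothetical.

```python
def cdf_from_quantiles(taus, quantiles, y):
    """Approximate F(y | x) as the largest level tau whose quantile is <= y.

    `taus` are increasing quantile levels and `quantiles` the matching
    estimated conditional quantiles at a fixed x (assumed non-decreasing).
    Returns 0.0 if every estimated quantile exceeds y.
    """
    below = [t for t, q in zip(taus, quantiles) if q <= y]
    return max(below) if below else 0.0


def class_probabilities(taus, quantiles, n_classes):
    """P(Y = k | x) = F(k | x) - F(k - 1 | x) for classes coded 1..K."""
    cdf = [cdf_from_quantiles(taus, quantiles, k) for k in range(1, n_classes)]
    cdf.append(1.0)  # the CDF reaches 1 at the largest class label
    previous = [0.0] + cdf[:-1]
    return [hi - lo for hi, lo in zip(cdf, previous)]


# Made-up quantile estimates at one x for a three-class problem.
taus = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
quantiles = [1.0, 1.0, 1.2, 1.8, 2.0, 2.1, 2.6, 3.0, 3.0]
probs = class_probabilities(taus, quantiles, n_classes=3)
print(probs)
```

Only K - 1 CDF values are needed per input, which is consistent with the abstract's remark that the cost does not grow exponentially with the number of classes.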


Scoring and Searching over Bayesian Networks with Causal and Associative Priors

arXiv.org Artificial Intelligence

A significant theoretical advantage of search-and-score methods for learning Bayesian networks is that they can accept informative prior beliefs for each possible network, thus complementing the data. In this paper, a method is presented for assigning priors based on beliefs about the presence or absence of certain paths in the true network. Such beliefs correspond to knowledge about the possible causal and associative relations between pairs of variables. This type of knowledge naturally arises from prior experimental and observational data, among other sources. In addition, a novel search operator is proposed to take advantage of such prior knowledge. Experiments show that using path beliefs improves the learning of the skeleton, as well as of the edge directions in the network.


Likelihood-ratio calibration using prior-weighted proper scoring rules

arXiv.org Machine Learning

Prior-weighted logistic regression has become a standard tool for calibration in speaker recognition. Logistic regression is the optimization of the expected value of the logarithmic scoring rule. We generalize this via a parametric family of proper scoring rules. Our theoretical analysis shows how different members of this family induce different relative weightings over a spectrum of applications whose decision thresholds range from low to high. Special attention is given to the interaction between prior weighting and proper scoring rule parameters. Experiments on NIST SRE'12 suggest that for applications with low false-alarm rate requirements, scoring rules tailored to emphasize higher score thresholds may give better accuracy than logistic regression.


Counterfactual Reasoning and Learning Systems

arXiv.org Artificial Intelligence

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select changes that improve both the short-term and long-term performance of such systems. This work is illustrated by experiments carried out on the ad placement system associated with the Bing search engine.