Goto

Collaborating Authors

 Uncertainty


A Probabilistic Algorithm for Calculating Structure: Borrowing from Simulated Annealing

arXiv.org Artificial Intelligence

We have developed a general Bayesian algorithm for determining the coordinates of points in a three-dimensional space. The algorithm takes as input a set of probabilistic constraints on the coordinates of the points, and an a priori distribution for each point location. The output is a maximum-likelihood estimate of the location of each point. We use the extended, iterated Kalman filter, and add a search heuristic for optimizing its solution under nonlinear conditions. This heuristic is based on the same principle as the simulated annealing heuristic for other optimization problems. Our method uses any probabilistic constraints that can be expressed as a function of the point coordinates (for example, distance, angles, dihedral angles, and planarity). It assumes that all constraints have Gaussian noise. In this paper, we describe the algorithm and show its performance on a set of synthetic data to illustrate its convergence properties, and its applicability to domains such ng molecular structure determination.


From Conditional Oughts to Qualitative Decision Theory

arXiv.org Artificial Intelligence

The primary theme of this investigation is a decision theoretic account of conditional ought statements (e.g., "You ought to do A, if C") that rectifies glaring deficiencies in classical deontic logic. The resulting account forms a sound basis for qualitative decision theory, thus providing a framework for qualitative planning under uncertainty. In particular, we show that adding causal relationships (in the form of a single graph) as part of an epistemic state is sufficient to facilitate the analysis of action sequences, their consequences, their interaction with observations, their expected utilities and, hence, the synthesis of plans and strategies under uncertainty.


Causality in Bayesian Belief Networks

arXiv.org Artificial Intelligence

We address the problem of causal interpretation of the graphical structure of Bayesian belief networks (BBNs). We review the concept of causality explicated in the domain of structural equations models and show that it is applicable to BBNs. In this view, which we call mechanism-based, causality is defined within models and causal asymmetries arise when mechanisms are placed in the context of a system. We lay the link between structural equations models and BBNs models and formulate the conditions under which the latter can be given causal interpretation.


Impulsive Noise Mitigation in Powerline Communications Using Sparse Bayesian Learning

arXiv.org Machine Learning

Additive asynchronous and cyclostationary impulsive noise limits communication performance in OFDM powerline communication (PLC) systems. Conventional OFDM receivers assume additive white Gaussian noise and hence experience degradation in communication performance in impulsive noise. Alternate designs assume a parametric statistical model of impulsive noise and use the model parameters in mitigating impulsive noise. These receivers require overhead in training and parameter estimation, and degrade due to model and parameter mismatch, especially in highly dynamic environments. In this paper, we model impulsive noise as a sparse vector in the time domain without any other assumptions, and apply sparse Bayesian learning methods for estimation and mitigation without training. We propose three iterative algorithms with different complexity vs. performance trade-offs: (1) we utilize the noise projection onto null and pilot tones to estimate and subtract the noise impulses; (2) we add the information in the data tones to perform joint noise estimation and OFDM detection; (3) we embed our algorithm into a decision feedback structure to further enhance the performance of coded systems. When compared to conventional OFDM PLC receivers, the proposed receivers achieve SNR gains of up to 9 dB in coded and 10 dB in uncoded systems in the presence of impulsive noise.


Continuous-time Infinite Dynamic Topic Models

arXiv.org Machine Learning

Topic models are probabilistic models for discovering topical themes in collections of documents. In real world applications, these models provide us with the means of organizing what would otherwise be unstructured collections. They can help us cluster a huge collection into different topics or find a subset of the collection that resembles the topical theme found in an article at hand. The first wave of topic models developed were able to discover the prevailing topics in a big collection of documents spanning a period of time. It was later realized that these time-invariant models were not capable of modeling 1) the time varying number of topics they discover and 2) the time changing structure of these topics. Few models were developed to address this two deficiencies. The online-hierarchical Dirichlet process models the documents with a time varying number of topics. It varies the structure of the topics over time as well. However, it relies on document order, not timestamps to evolve the model over time. The continuous-time dynamic topic model evolves topic structure in continuous-time. However, it uses a fixed number of topics over time. In this dissertation, I present a model, the continuous-time infinite dynamic topic model, that combines the advantages of these two models 1) the online-hierarchical Dirichlet process, and 2) the continuous-time dynamic topic model. More specifically, the model I present is a probabilistic topic model that does the following: 1) it changes the number of topics over continuous time, and 2) it changes the topic structure over continuous-time. I compared the model I developed with the two other models with different setting values. The results obtained were favorable to my model and showed the need for having a model that has a continuous-time varying number of topics and topic structure.


Bayesian Consensus Clustering

arXiv.org Machine Learning

The task of clustering a set of objects based on multiple sources of data arises in several modern applications. We propose an integrative statistical model that permits a separate clustering of the objects for each data source. These separate clusterings adhere loosely to an overall consensus clustering, and hence they are not independent. We describe a computationally scalable Bayesian framework for simultaneous estimation of both the consensus clustering and the source-specific clusterings. We demonstrate that this flexible approach is more robust than joint clustering of all data sources, and is more powerful than clustering each data source separately. This work is motivated by the integrated analysis of heterogeneous biomedical data, and we present an application to subtype identification of breast cancer tumor samples using publicly available data from The Cancer Genome Atlas. Several fields of research now analyze multi-source data (also called multimodal data), in which multiple heterogeneous datasets describe a common set of objects.


A probabilistic methodology for multilabel classification

arXiv.org Artificial Intelligence

Multilabel classification is a relatively recent subfield of machine learning. Unlike to the classical approach, where instances are labeled with only one category, in multilabel classification, an arbitrary number of categories is chosen to label an instance. Due to the problem complexity (the solution is one among an exponential number of alternatives), a very common solution (the binary method) is frequently used, learning a binary classifier for every category, and combining them all afterwards. The assumption taken in this solution is not realistic, and in this work we give examples where the decisions for all the labels are not taken independently, and thus, a supervised approach should learn those existing relationships among categories to make a better classification. Therefore, we show here a generic methodology that can improve the results obtained by a set of independent probabilistic binary classifiers, by using a combination procedure with a classifier trained on the co-occurrences of the labels. We show an exhaustive experimentation in three different standard corpora of labeled documents (Reuters-21578, Ohsumed-23 and RCV1), which present noticeable improvements in all of them, when using our methodology, in three probabilistic base classifiers.


Learning Gaussian Networks

arXiv.org Machine Learning

We describe algorithms for learning Bayesian networks from a combination of user knowledge and statistical data. The algorithms have two components: a scoring metric and a search procedure. The scoring metric takes a network structure, statistical data, and a user's prior knowledge, and returns a score proportional to the posterior probability of the network structure given the data. The search procedure generates networks for evaluation by the scoring metric. Previous work has concentrated on metrics for domains containing only discrete variables, under the assumption that data represents a multinomial sample. In this paper, we extend this work, developing scoring metrics for domains containing all continuous variables or a mixture of discrete and continuous variables, under the assumption that continuous data is sampled from a multivariate normal distribution. Our work extends traditional statistical approaches for identifying vanishing regression coefficients in that we identify two important assumptions, called event equivalence and parameter modularity, that when combined allow the construction of prior distributions for multivariate normal parameters from a single prior Bayesian network specified by a user.


Taming the Curse of Dimensionality: Discrete Integration by Hashing and Optimization

arXiv.org Machine Learning

Integration is affected by the curse of dimensionality and quickly becomes intractable as the dimensionality of the problem grows. We propose a randomized algorithm that, with high probability, gives a constant-factor approximation of a general discrete integral defined over an exponentially large set. This algorithm relies on solving only a small number of instances of a discrete combinatorial optimization problem subject to randomly generated parity constraints used as a hash function. As an application, we demonstrate that with a small number of MAP queries we can efficiently approximate the partition function of discrete graphical models, which can in turn be used, for instance, for marginal computation or model selection.


Three Approaches to Probability Model Selection

arXiv.org Artificial Intelligence

This paper compares three approaches to the problem of selecting among probability models to fit data (1) use of statistical criteria such as Akaike's information criterion and Schwarz's "Bayesian information criterion," (2) maximization of the posterior probability of the model, and (3) maximization of an effectiveness ratio? trading off accuracy and computational cost. The unifying characteristic of the approaches is that all can be viewed as maximizing a penalized likelihood function. The second approach with suitable prior distributions has been shown to reduce to the first. This paper shows that the third approach reduces to the second for a particular form of the effectiveness ratio, and illustrates all three approaches with the problem of selecting the number of components in a mixture of Gaussian distributions. Unlike the first two approaches, the third can be used even when the candidate models are chosen for computational efficiency, without regard to physical interpretation, so that the likelihood and the prior distribution over models cannot be interpreted literally. As the most general and computationally oriented of the approaches, it is especially useful for artificial intelligence applications.