Goto

Collaborating Authors

 Directed Networks


Context models on sequences of covers

arXiv.org Machine Learning

Conditional measure estimation is a fundamental problem in statistics. Specific instances of this problem include classification, regression and conditional density estimation. This paper formulates a general approach for nonparametric, incremental, closed-form Bayesian estimation of conditional measures that relies on a model structure defined on a sequence of covers. This is an important development, particularly for the problem of conditional density estimation, where although non-parameteric kernel-based approaches that currently dominate generally perform well, a fast, tractable, incremental, Bayesian approach has been lacking. This construction used in this paper employs a random walk in a set of contexts.


Value of Information Lattice: Exploiting Probabilistic Independence for Effective Feature Subset Acquisition

Journal of Artificial Intelligence Research

We address the cost-sensitive feature acquisition problem, where misclassifying an instance is costly but the expected misclassification cost can be reduced by acquiring the values of the missing features. Because acquiring the features is costly as well, the objective is to acquire the right set of features so that the sum of the feature acquisition cost and misclassification cost is minimized. We describe the Value of Information Lattice (VOILA), an optimal and efficient feature subset acquisition framework. Unlike the common practice, which is to acquire features greedily, VOILA can reason with subsets of features. VOILA efficiently searches the space of possible feature subsets by discovering and exploiting conditional independence properties between the features and it reuses probabilistic inference computations to further speed up the process. Through empirical evaluation on five medical datasets, we show that the greedy strategy is often reluctant to acquire features, as it cannot forecast the benefit of acquiring multiple features in combination.


Issues in Stacked Generalization

arXiv.org Artificial Intelligence

Stacked generalization is a general method of using a high-level model to combine lower-level models to achieve greater predictive accuracy. In this paper we address two crucial issues which have been considered to be a `black art' in classification tasks ever since the introduction of stacked generalization in 1992 by Wolpert: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input. We find that best results are obtained when the higher-level model combines the confidence (and not just the predictions) of the lower-level ones. We demonstrate the effectiveness of stacked generalization for combining three different types of learning algorithms for classification tasks. We also compare the performance of stacked generalization with majority vote and published results of arcing and bagging.


Variational Probabilistic Inference and the QMR-DT Network

arXiv.org Artificial Intelligence

We describe a variational approximation method for efficient inference in large-scale probabilistic models. Variational methods are deterministic procedures that provide approximations to marginal and conditional probabilities of interest. They provide alternatives to approximate inference methods based on stochastic sampling or search. We describe a variational approach to the problem of diagnostic inference in the `Quick Medical Reference' (QMR) network. The QMR network is a large-scale probabilistic graphical model built on statistical and expert knowledge. Exact probabilistic inference is infeasible in this model for all but a small set of cases. We evaluate our variational inference algorithm on a large set of diagnostic test cases, comparing the algorithm to a state-of-the-art stochastic sampling method.


Aggregating Forecasts Using a Learned Bayesian Network

AAAI Conferences

Under the Defense Advanced Research Project Agency's (DARPA) Integrated Crisis Early Warning System (ICEWS), Innovative Decisions, Inc. (IDI) constructed a Bayesian network to combine forecasts produced by a set of social science models. We used Bayesian network structure learning with political science variables to produce meaningful priors. We employed a naive Bayes structure to aggregate the forecasts. In both cases, IDI improved classification by intelligently discretizing continuous variables. The resulting network not only met performance criteria set by DARPA, but also out-performed each of the social science models across all types of forecasted events. We describe the construction of the aggregator as well as a set of experiments performed to explore the nature of the Bayesian EOI Aggregator's performance.


Efficient Policy Construction for MDPs Represented in Probabilistic PDDL

AAAI Conferences

We present a novel dynamic programming approach to computing optimal policies for Markov Decision Processes compactly represented in grounded Probabilistic PDDL. Unlike other approaches, which use an intermediate representation as Dynamic Bayesian Networks, we directly exploit the PPDDL description by introducing dedicated backup rules. This provides an alternative approach to DBNs, especially when actions have highly correlated effects on variables. Indeed, we show interesting improvements on several planning domains from the International Planning Competition. Finally, we exploit the incremental flavor of our backup rules for designing promising approaches to policy revision.


A Two-Step Method to Learn Multidimensional Bayesian Network Classifiers Based on Mutual Information Measures

AAAI Conferences

Bayesian Network Classifiers are popular approaches for classification problems where instances have to be assigned to one of several classes. However, in many domains, it is necessary to assign instances to multiple classes at the same time. This task has been normally addressed either by (i) transforming the problem into a single-class scenario by defining a new class variable with all of the possible combinations of classes or, (ii) by building an independent classifier for each class variable. Either way, the resulting models do not capture all the relations and dependencies between classes and features resulting into unprecise multidimensional classifiers. In this paper, we introduce a two-step method for learning Multidimensional Bayesian Network Classifiers (MBC) from data based on mutual information measures. The first step of the method learns an initial MBC structure which then, in the second step, is refined. Our approach is simple and keeps all the interactions and dependencies among classes and features. The method was tested on three benchmark multidimensional data-sets. Preliminary experimental results show how our method outperforms state-of-the-art methods used in multidimensional classification.


Tuning a Bayesian Knowledge Base

AAAI Conferences

For a knowledge-based system that fails to provide the correct answer, it is important to be able to tune the system while minimizing overall change in the knowledge-base. There are a variety of reasons why the answer is incorrect ranging from incorrect knowledge to information vagueness to incompleteness. Still, in all these situations, it is typically the case that most of the knowledge in the system is likely to be correct as specified by the expert(s) and/or knowledge engineer(s). In this paper, we propose a method to identify the possible changes by understanding the contribution of parameters on the outputs of concern. Our approach is based on Bayesian Knowledge Bases for modeling uncertainties. We start with single parameter changes and then extend to multiple parameters. In order to identify the optimal solution that can minimize the change to the model as specified by the domain experts, we define and evaluate the sensitivity values of the results with respect to the parameters. We discuss the computational complexities of determining the solution and show that the problem of multiple parameters changes can be transformed into Linear Programming problems, and thus, efficiently solvable. Our work can also be applied towards validating the knowledge base such that the updated model can satisfy all test-cases collected from the domain experts.


Learning Temporal Nodes Bayesian Networks

AAAI Conferences

Temporal Nodes Bayesian Networks (TNBNs) are an alternative to Dynamic Bayesian Networks for temporal reasoning, that result in much simpler and efficient models in some domains. However, methods for learning this type of models from data have not been developed. In this paper we propose a learning algorithm to obtain the structure and temporal intervals for TNBNs from data. The method has three phases: (i) obtain an initial approximation of the intervals, (ii) obtain a structure using a standard algorithm and (iii) refine the intervals for each temporal node based on a clustering algorithm. We evaluated the method with synthetic data. Our method obtains the best score in terms of the structure and a competitive predictive accuracy.


Modeling Interventions Using Belief Causal Networks

AAAI Conferences

Causality plays an important role in our comprehension of the world. It amounts to determine what truly causes what and what it matters. Interventions allow the identification of elements in a sequence of events that are related in a causal way. In this paper, we introduce belief causation and we proposea method for handling interventions in graphical model under an uncertain environment where the uncertainty is represented by belief masses, so-called belief causal networks. More specifically, we propose a generalization of the “DO” operator and explain the needed changes on the structure of the graph to model a belief causal network on which interventions are proceeded.