AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Randomized Algorithms for the Loop Cutset Problem

Bar-Yehuda, R., Becker, A., Geiger, D.

arXiv.org Artificial IntelligenceJun-1-2011

We show how to find a minimum weight loop cutset in a Bayesian network with high probability. Finding such a loop cutset is the first step in the method of conditioning for inference. Our randomized algorithm for finding a loop cutset outputs a minimum loop cutset after O(c 6^k kn) steps with probability at least 1 - (1 - 1/(6^k))^c6^k, where c > 1 is a constant specified by the user, k is the minimal size of a minimum weight loop cutset, and n is the number of vertices. We also show empirically that a variant of this algorithm often finds a loop cutset that is closer to the minimum weight loop cutset than the ones found by the best deterministic algorithms known.

artificial intelligence, machine learning, vertex, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.638

1106.0225

Country: North America > United States > California (0.46)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Add feedback

Committee-Based Sample Selection for Probabilistic Classifiers

Argamon-Engelson, S., Dagan, I.

arXiv.org Artificial IntelligenceJun-1-2011

In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by `sample selection'. In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, extending the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to the real-world natural language processing task of stochastic part-of-speech tagging. We find that all variants of the method achieve a significant reduction in annotation cost, although their computational efficiency differs. In particular, the simplest variant, a two member committee with no parameters to tune, gives excellent results. We also show that sample selection yields a significant reduction in the size of the model used by the tagger.

machine learning, natural language, selection, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.612

1106.022

Country:

Asia > Middle East (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

The Good Old Davis-Putnam Procedure Helps Counting Models

Birnbaum, E., Lozinskii, E. L.

arXiv.org Artificial IntelligenceJun-1-2011

As was shown recently, many important AI problems require counting the number of models of propositional formulas. The problem of counting models of such formulas is, according to present knowledge, computationally intractable in a worst case. Based on the Davis-Putnam procedure, we present an algorithm, CDP, that computes the exact number of models of a propositional CNF or DNF formula F. Let m and n be the number of clauses and variables of F, respectively, and let p denote the probability that a literal l of F occurs in a clause C of F, then the average running time of CDP is shown to be O(nm^d), where d=-1/log(1-p). The practical performance of CDP has been estimated in a series of experiments on a wide variety of CNF formulas.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.601

1106.0218

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Context models on sequences of covers

Dimitrakakis, Christos

arXiv.org Machine LearningMay-30-2011

Conditional measure estimation is a fundamental problem in statistics. Specific instances of this problem include classification, regression and conditional density estimation. This paper formulates a general approach for nonparametric, incremental, closed-form Bayesian estimation of conditional measures that relies on a model structure defined on a sequence of covers. This is an important development, particularly for the problem of conditional density estimation, where although non-parameteric kernel-based approaches that currently dominate generally perform well, a fast, tractable, incremental, Bayesian approach has been lacking. This construction used in this paper employs a random walk in a set of contexts.

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Machine Learning

1005.2263

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Value of Information Lattice: Exploiting Probabilistic Independence for Effective Feature Subset Acquisition

Bilgic, M., Getoor, L.

Journal of Artificial Intelligence ResearchMay-27-2011

We address the cost-sensitive feature acquisition problem, where misclassifying an instance is costly but the expected misclassification cost can be reduced by acquiring the values of the missing features. Because acquiring the features is costly as well, the objective is to acquire the right set of features so that the sum of the feature acquisition cost and misclassification cost is minimized. We describe the Value of Information Lattice (VOILA), an optimal and efficient feature subset acquisition framework. Unlike the common practice, which is to acquire features greedily, VOILA can reason with subsets of features. VOILA efficiently searches the space of possible feature subsets by discovering and exploiting conditional independence properties between the features and it reuses probabilistic inference computations to further speed up the process. Through empirical evaluation on five medical datasets, we show that the greedy strategy is often reluctant to acquire features, as it cannot forecast the benefit of acquiring multiple features in combination.

constraint, dataset, misclassification cost, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3200

AI Access Foundation

10706

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(2 more...)

Industry: Health & Medicine > Therapeutic Area > Endocrinology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Add feedback

Issues in Stacked Generalization

Ting, K. M., Witten, I. H.

arXiv.org Artificial IntelligenceMay-26-2011

Stacked generalization is a general method of using a high-level model to combine lower-level models to achieve greater predictive accuracy. In this paper we address two crucial issues which have been considered to be a `black art' in classification tasks ever since the introduction of stacked generalization in 1992 by Wolpert: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input. We find that best results are obtained when the higher-level model combines the confidence (and not just the predictions) of the lower-level ones. We demonstrate the effectiveness of stacked generalization for combining three different types of learning algorithms for classification tasks. We also compare the performance of stacked generalization with majority vote and published results of arcing and bagging.

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.594

1105.5466

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Variational Probabilistic Inference and the QMR-DT Network

Jaakkola, T. S., Jordan, M. I.

arXiv.org Artificial IntelligenceMay-26-2011

We describe a variational approximation method for efficient inference in large-scale probabilistic models. Variational methods are deterministic procedures that provide approximations to marginal and conditional probabilities of interest. They provide alternatives to approximate inference methods based on stochastic sampling or search. We describe a variational approach to the problem of diagnostic inference in the `Quick Medical Reference' (QMR) network. The QMR network is a large-scale probabilistic graphical model built on statistical and expert knowledge. Exact probabilistic inference is infeasible in this model for all but a small set of cases. We evaluate our variational inference algorithm on a large set of diagnostic test cases, comparing the algorithm to a state-of-the-art stochastic sampling method.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.583

1105.5462

Country: North America > United States > California (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Aggregating Forecasts Using a Learned Bayesian Network

Mahoney, Suzanne Mitchell (Innovative Decisions, Inc.) | Comstock, Ethan (Innovative Decisions, Inc.) | deBlois, Bradley (Innovative Decisions, Inc.) | Darcy, Steven (Innovative Decisions, Inc.)

AAAI ConferencesMay-18-2011

Under the Defense Advanced Research Project Agency's (DARPA) Integrated Crisis Early Warning System (ICEWS), Innovative Decisions, Inc. (IDI) constructed a Bayesian network to combine forecasts produced by a set of social science models. We used Bayesian network structure learning with political science variables to produce meaningful priors. We employed a naive Bayes structure to aggregate the forecasts. In both cases, IDI improved classification by intelligently discretizing continuous variables. The resulting network not only met performance criteria set by DARPA, but also out-performed each of the social science models across all types of forecasted events. We describe the construction of the aggregator as well as a set of experiments performed to explore the nature of the Bayesian EOI Aggregator's performance.

artificial intelligence, forecaster, machine learning, (15 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Instructional Material (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.91)
Government > Military (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Efficient Policy Construction for MDPs Represented in Probabilistic PDDL

Lesner, Boris (University of Caen Basse-Normandie) | Zanuttini, Bruno (University of Caen Basse-Normandie)

AAAI ConferencesMay-18-2011

We present a novel dynamic programming approach to computing optimal policies for Markov Decision Processes compactly represented in grounded Probabilistic PDDL. Unlike other approaches, which use an intermediate representation as Dynamic Bayesian Networks, we directly exploit the PPDDL description by introducing dedicated backup rules. This provides an alternative approach to DBNs, especially when actions have highly correlated effects on variables. Indeed, we show interesting improvements on several planning domains from the International Planning Competition. Finally, we exploit the incremental flavor of our backup rules for designing promising approaches to policy revision.

algorithm, representation, value function, (15 more...)

AAAI Conferences

Twenty-First International Conference on Automated Planning and Scheduling

Country: Europe > France (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Two-Step Method to Learn Multidimensional Bayesian Network Classifiers Based on Mutual Information Measures

Zaragoza, Julio Cesar (National Institute of Astrophysics, Optics and Electronics) | Sucar, Enrique (National Institute of Astrophysics, Optics and Electronics) | Morales, Eduardo (National Institute of Astrophysics, Optics and Electronics)

AAAI ConferencesMay-18-2011

Bayesian Network Classifiers are popular approaches for classification problems where instances have to be assigned to one of several classes. However, in many domains, it is necessary to assign instances to multiple classes at the same time. This task has been normally addressed either by (i) transforming the problem into a single-class scenario by defining a new class variable with all of the possible combinations of classes or, (ii) by building an independent classifier for each class variable. Either way, the resulting models do not capture all the relations and dependencies between classes and features resulting into unprecise multidimensional classifiers. In this paper, we introduce a two-step method for learning Multidimensional Bayesian Network Classifiers (MBC) from data based on mutual information measures. The first step of the method learns an initial MBC structure which then, in the second step, is refined. Our approach is simple and keeps all the interactions and dependencies among classes and features. The method was tested on three benchmark multidimensional data-sets. Preliminary experimental results show how our method outperforms state-of-the-art methods used in multidimensional classification.

algorithm, class variable, mbc structure, (10 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Country:

North America > Mexico > Puebla > Puebla (0.04)
Europe > Spain > Galicia > Madrid (0.04)

Genre: Research Report (0.54)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback