AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

Committee-Based Sample Selection for Probabilistic Classifiers

Argamon-Engelson, S., Dagan, I.

arXiv.org Artificial Intelligence, Jun-1-2011

In many real-world learning tasks, it is expensive to acquire a sufficient number of labeled examples for training. This paper investigates methods for reducing annotation cost by "sample selection". In this approach, during training the learning program examines many unlabeled examples and selects for labeling only those that are most informative at each stage. This avoids redundantly labeling examples that contribute little new information. Our work follows on previous research on Query By Committee, extending the committee-based paradigm to the context of probabilistic classification. We describe a family of empirical methods for committee-based sample selection in probabilistic classification models, which evaluate the informativeness of an example by measuring the degree of disagreement between several model variants. These variants (the committee) are drawn randomly from a probability distribution conditioned by the training set labeled so far. The method was applied to the real-world natural language processing task of stochastic part-of-speech tagging. We find that all variants of the method achieve a significant reduction in annotation cost, although their computational efficiency differs. In particular, the simplest variant, a two-member committee with no parameters to tune, gives excellent results. We also show that sample selection yields a significant reduction in the size of the model used by the tagger.
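
The two-member committee idea in the abstract above can be sketched roughly as follows. This is a toy illustration, not the authors' tagger: it assumes a per-word tag model with a Dirichlet posterior and a uniform prior, and the names `draw_member`, `predict`, and `select_for_labeling` are hypothetical.

```python
import random

def draw_member(counts, tags, words, rng):
    """Sample one model variant: per-word tag probabilities from a Dirichlet posterior."""
    member = {}
    for word in words:
        tag_counts = counts.get(word, {})
        # Dirichlet sample via normalized Gamma draws; the +1 is an assumed uniform prior.
        draws = [rng.gammavariate(tag_counts.get(t, 0) + 1, 1.0) for t in tags]
        total = sum(draws)
        member[word] = {t: d / total for t, d in zip(tags, draws)}
    return member

def predict(member, word, tags):
    """Return the most probable tag for a word under one committee member."""
    return max(tags, key=lambda t: member[word][t])

def select_for_labeling(counts, tags, unlabeled, rng):
    """Return the unlabeled words on which a two-member committee disagrees."""
    a = draw_member(counts, tags, unlabeled, rng)
    b = draw_member(counts, tags, unlabeled, rng)
    return [w for w in unlabeled if predict(a, w, tags) != predict(b, w, tags)]
```

With counts such as `{"the": {"DET": 1000}, "run": {"NOUN": 1, "VERB": 1}}`, the two sampled members almost always agree on "the" and often disagree on "run", so only the ambiguous word tends to be sent for labeling.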

machine learning, natural language, selection, (20 more...)

doi: 10.1613/jair.612

1106.022

Country:

Asia > Middle East (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

The Good Old Davis-Putnam Procedure Helps Counting Models

Birnbaum, E., Lozinskii, E. L.

arXiv.org Artificial Intelligence, Jun-1-2011

As was shown recently, many important AI problems require counting the number of models of propositional formulas. The problem of counting models of such formulas is, according to present knowledge, computationally intractable in the worst case. Based on the Davis-Putnam procedure, we present an algorithm, CDP, that computes the exact number of models of a propositional CNF or DNF formula F. Let m and n be the number of clauses and variables of F, respectively, and let p denote the probability that a literal l of F occurs in a clause C of F. Then the average running time of CDP is shown to be O(nm^d), where d=-1/log(1-p). The practical performance of CDP has been estimated in a series of experiments on a wide variety of CNF formulas.
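
A minimal sketch of model counting by Davis-Putnam-style splitting, for CNF input only. It is a simplified illustration in the spirit of the abstract above, not the CDP algorithm as published: it has no unit propagation or branching heuristics, and the function name `count_models` and the DIMACS-style integer literals are assumptions.

```python
def count_models(clauses, n_vars):
    """Count satisfying assignments of a CNF formula over variables 1..n_vars.

    clauses: iterable of iterables of nonzero ints (k means x_k, -k means not x_k).
    """
    def assign(cls, lit):
        """Simplify the clause list under the assumption that `lit` is true."""
        out = []
        for c in cls:
            if lit in c:
                continue              # clause is satisfied, drop it
            reduced = c - {-lit}      # the opposite literal is now false
            if not reduced:
                return None           # empty clause: this branch has no models
            out.append(reduced)
        return out

    def count(cls, free):
        if cls is None:
            return 0
        if not cls:
            return 2 ** free          # every remaining variable is unconstrained
        lit = next(iter(cls[0]))      # split on a variable of the first clause
        return count(assign(cls, lit), free - 1) + count(assign(cls, -lit), free - 1)

    cls = [frozenset(c) for c in clauses]
    if any(not c for c in cls):
        return 0                      # an empty input clause is unsatisfiable
    return count(cls, n_vars)
```

For example, `count_models([[1, 2]], 2)` counts the three assignments that satisfy x1 or x2. The `2 ** free` step is what lets the search stop as soon as all clauses are satisfied, instead of enumerating every full assignment.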

algorithm, artificial intelligence, machine learning, (15 more...)

doi: 10.1613/jair.601

1106.0218

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Context models on sequences of covers

Dimitrakakis, Christos

arXiv.org Machine Learning, May-30-2011

Conditional measure estimation is a fundamental problem in statistics. Specific instances of this problem include classification, regression and conditional density estimation. This paper formulates a general approach for nonparametric, incremental, closed-form Bayesian estimation of conditional measures that relies on a model structure defined on a sequence of covers. This is an important development, particularly for the problem of conditional density estimation, where, although the nonparametric kernel-based approaches that currently dominate generally perform well, a fast, tractable, incremental, Bayesian approach has been lacking. The construction used in this paper employs a random walk in a set of contexts.

artificial intelligence, machine learning, sequence, (16 more...)

1005.2263

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Value of Information Lattice: Exploiting Probabilistic Independence for Effective Feature Subset Acquisition

Bilgic, M., Getoor, L.

Journal of Artificial Intelligence Research, May-27-2011

We address the cost-sensitive feature acquisition problem, where misclassifying an instance is costly but the expected misclassification cost can be reduced by acquiring the values of the missing features. Because acquiring the features is costly as well, the objective is to acquire the right set of features so that the sum of the feature acquisition cost and misclassification cost is minimized. We describe the Value of Information Lattice (VOILA), an optimal and efficient feature subset acquisition framework. Unlike the common practice, which is to acquire features greedily, VOILA can reason with subsets of features. VOILA efficiently searches the space of possible feature subsets by discovering and exploiting conditional independence properties between the features and it reuses probabilistic inference computations to further speed up the process. Through empirical evaluation on five medical datasets, we show that the greedy strategy is often reluctant to acquire features, as it cannot forecast the benefit of acquiring multiple features in combination.
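
The objective described above (acquisition cost plus expected misclassification cost) and the weakness of greedy acquisition can be illustrated with a toy example. This sketch enumerates subsets exhaustively and does not implement VOILA's lattice or its independence-based pruning. The XOR distribution, the costs, and all names are hypothetical.

```python
from itertools import combinations

# Hypothetical joint distribution P(f0, f1, y): y is the XOR of two fair binary features.
JOINT = {(f0, f1, f0 ^ f1): 0.25 for f0 in (0, 1) for f1 in (0, 1)}
FEATURE_COST = {0: 1.0, 1: 1.0}   # assumed cost of acquiring each feature
ERROR_COST = 10.0                 # assumed cost of one misclassification

def expected_total_cost(subset):
    """Acquisition cost of `subset` plus expected misclassification cost after observing it."""
    groups = {}
    for (f0, f1, y), p in JOINT.items():
        obs = tuple((f0, f1)[i] for i in subset)
        groups.setdefault(obs, {0: 0.0, 1: 0.0})[y] += p
    # For each observation we predict the likelier label, so the error mass is the smaller one.
    misclass = ERROR_COST * sum(min(mass.values()) for mass in groups.values())
    return sum(FEATURE_COST[i] for i in subset) + misclass

def greedy_subset():
    """Add one feature at a time, stopping when no single feature lowers the total cost."""
    chosen, best = (), expected_total_cost(())
    while True:
        candidates = [tuple(sorted(chosen + (i,))) for i in FEATURE_COST if i not in chosen]
        if not candidates:
            return chosen
        nxt = min(candidates, key=expected_total_cost)
        if expected_total_cost(nxt) >= best:
            return chosen
        chosen, best = nxt, expected_total_cost(nxt)

def best_subset():
    """Exhaustively search all feature subsets for the lowest expected total cost."""
    features = sorted(FEATURE_COST)
    subsets = [s for r in range(len(features) + 1) for s in combinations(features, r)]
    return min(subsets, key=expected_total_cost)
```

In this setup neither feature is informative alone, so the greedy strategy acquires nothing (total cost 5.0), while acquiring both features together brings the total cost down to 2.0.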

constraint, dataset, misclassification cost, (15 more...)

doi: 10.1613/jair.3200

AI Access Foundation

10706

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Maryland > Prince George's County > College Park (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(2 more...)

Industry: Health & Medicine > Therapeutic Area > Endocrinology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Variational Probabilistic Inference and the QMR-DT Network

Jaakkola, T. S., Jordan, M. I.

arXiv.org Artificial Intelligence, May-26-2011

We describe a variational approximation method for efficient inference in large-scale probabilistic models. Variational methods are deterministic procedures that provide approximations to marginal and conditional probabilities of interest. They provide alternatives to approximate inference methods based on stochastic sampling or search. We describe a variational approach to the problem of diagnostic inference in the "Quick Medical Reference" (QMR) network. The QMR network is a large-scale probabilistic graphical model built on statistical and expert knowledge. Exact probabilistic inference is infeasible in this model for all but a small set of cases. We evaluate our variational inference algorithm on a large set of diagnostic test cases, comparing the algorithm to a state-of-the-art stochastic sampling method.

artificial intelligence, bayesian inference, machine learning, (18 more...)

doi: 10.1613/jair.583

1105.5462

Country: North America > United States > California (0.67)

Genre: Research Report (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Probabilistic Deduction with Conditional Constraints over Basic Events

Lukasiewicz, T.

arXiv.org Artificial Intelligence, May-26-2011

We study the problem of probabilistic deduction with conditional constraints over basic events. We show that globally complete probabilistic deduction with conditional constraints over basic events is NP-hard. We then concentrate on the special case of probabilistic deduction in conditional constraint trees. We elaborate very efficient techniques for globally complete probabilistic deduction. In detail, for conditional constraint trees with point probabilities, we present a local approach to globally complete probabilistic deduction, which runs in linear time in the size of the conditional constraint trees. For conditional constraint trees with interval probabilities, we show that globally complete probabilistic deduction can be done in a global approach by solving nonlinear programs. We show how these nonlinear programs can be transformed into equivalent linear programs, which are solvable in polynomial time in the size of the conditional constraint trees.

artificial intelligence, conditional constraint tree, logic & formal reasoning, (13 more...)

doi: 10.1613/jair.577

1105.5461

Country:

North America > United States (0.46)
Europe > Netherlands (0.27)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off

Seldin, Yevgeny, Cesa-Bianchi, Nicolò, Laviolette, François, Auer, Peter, Shawe-Taylor, John, Peters, Jan

arXiv.org Machine Learning, May-23-2011

We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with a Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolving martingales.

big data, pac-bayesian analysis, upstream oil & gas, (22 more...)

1105.4585

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > Canada (0.14)
Europe > Italy (0.14)
Europe > Austria (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.72)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

PAC-Bayesian Analysis of Martingales and Multiarmed Bandits

Seldin, Yevgeny, Laviolette, François, Shawe-Taylor, John, Peters, Jan, Auer, Peter

arXiv.org Machine Learning, May-19-2011

We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent random variables. The first is based on a new lemma that makes it possible to bound expectations of convex functions of certain dependent random variables by expectations of the same functions of independent Bernoulli random variables. This lemma provides an alternative tool to the Hoeffding-Azuma inequality for bounding the concentration of martingale values. Our second approach is based on integration of the Hoeffding-Azuma inequality with PAC-Bayesian analysis. We also introduce a way to apply PAC-Bayesian analysis in situations of limited feedback. We combine the new tools to derive PAC-Bayesian generalization and regret bounds for the multiarmed bandit problem. Although our regret bound is not yet as tight as state-of-the-art regret bounds based on other well-established techniques, our results significantly expand the range of potential applications of PAC-Bayesian analysis and introduce a new analysis tool to reinforcement learning and many other fields where martingales and limited feedback are encountered.
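
For reference, the Hoeffding-Azuma inequality named in the abstract above is usually stated as follows. This is the common textbook form, and the paper may use a different variant or notation. Here $(X_k)_{k \ge 0}$ is a martingale whose increments are bounded by constants $c_k$.

```latex
% Hoeffding-Azuma inequality (standard textbook form; notation is an assumption).
% If |X_k - X_{k-1}| \le c_k almost surely for k = 1, \dots, n, then for every t > 0:
\[
  \Pr\bigl( |X_n - X_0| \ge t \bigr)
  \;\le\;
  2 \exp\!\left( - \frac{t^2}{2 \sum_{k=1}^{n} c_k^2} \right).
\]
```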

artificial intelligence, machine learning, reinforcement learning, (16 more...)

1105.2416

Country: Europe > Germany (0.14)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)

Typical models: minimizing false beliefs

Lozinskii, Eliezer L.

arXiv.org Artificial Intelligence, May-19-2011

A knowledge system S describing a part of the real world does not, in general, contain complete information. Reasoning with incomplete information is prone to errors since any belief derived from S may be false in the present state of the world. A false belief may suggest wrong decisions and lead to harmful actions. So an important goal is to make false beliefs as unlikely as possible. This work introduces the notions of "typical atoms" and "typical models", and shows that reasoning with typical models minimizes the expected number of false beliefs over all ways of using incomplete information. Various properties of typical models are studied, in particular, correctness and stability of beliefs suggested by typical models, and their connection to oblivious reasoning.

logic & formal reasoning, mod, nonmonotonic reasoning, (20 more...)

1105.3833

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Nonmonotonic Logic (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Aggregating Forecasts Using a Learned Bayesian Network

Mahoney, Suzanne Mitchell (Innovative Decisions, Inc.) | Comstock, Ethan (Innovative Decisions, Inc.) | deBlois, Bradley (Innovative Decisions, Inc.) | Darcy, Steven (Innovative Decisions, Inc.)

AAAI Conferences, May-18-2011

Under the Defense Advanced Research Projects Agency's (DARPA) Integrated Crisis Early Warning System (ICEWS), Innovative Decisions, Inc. (IDI) constructed a Bayesian network to combine forecasts produced by a set of social science models. We used Bayesian network structure learning with political science variables to produce meaningful priors. We employed a naive Bayes structure to aggregate the forecasts. In both cases, IDI improved classification by intelligently discretizing continuous variables. The resulting network not only met performance criteria set by DARPA, but also out-performed each of the social science models across all types of forecasted events. We describe the construction of the aggregator as well as a set of experiments performed to explore the nature of the Bayesian EOI Aggregator's performance.
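
A naive Bayes aggregator over discretized forecasts, as mentioned in the abstract above, might look roughly like this. It is a generic sketch and not IDI's network: the bin edges, the Laplace smoothing constant `alpha`, and the function names are all assumptions.

```python
import bisect
import math
from collections import defaultdict

def discretize(p, edges=(0.33, 0.66)):
    """Map a continuous forecast to a bin index using assumed, fixed bin edges."""
    return bisect.bisect_right(edges, p)

def fit_naive_bayes(rows, labels, alpha=1.0):
    """rows: tuples of binned forecasts (one entry per model); labels: observed outcomes."""
    class_counts = defaultdict(float)
    bin_counts = defaultdict(float)          # (model index, bin, label) -> count
    bins = [set() for _ in rows[0]]
    for row, y in zip(rows, labels):
        class_counts[y] += 1
        for j, b in enumerate(row):
            bin_counts[(j, b, y)] += 1
            bins[j].add(b)
    return class_counts, bin_counts, bins, alpha

def predict_proba(model, row):
    """Posterior over outcomes, treating the forecasts as independent given the outcome."""
    class_counts, bin_counts, bins, alpha = model
    total = sum(class_counts.values())
    log_post = {}
    for y, cy in class_counts.items():
        lp = math.log(cy / total)
        for j, b in enumerate(row):
            # Laplace-smoothed likelihood of model j reporting bin b when the outcome is y.
            lp += math.log((bin_counts[(j, b, y)] + alpha) / (cy + alpha * len(bins[j])))
        log_post[y] = lp
    top = max(log_post.values())
    norm = sum(math.exp(v - top) for v in log_post.values())
    return {y: math.exp(v - top) / norm for y, v in log_post.items()}
```

A typical use would bin each model's forecast with `discretize`, fit on historical rows with known outcomes, and then call `predict_proba` on the binned forecasts for a new case.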

artificial intelligence, forecaster, machine learning, (15 more...)

Twenty-Fourth International FLAIRS Conference

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Instructional Material (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.91)
Government > Military (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
