Goto

Collaborating Authors

 Uncertainty


A New Technique for Combining Multiple Classifiers using The Dempster-Shafer Theory of Evidence

Journal of Artificial Intelligence Research

This paper presents a new classifier combination technique based on the Dempster-Shafer theory of evidence. The Dempster-Shafer theory of evidence is a powerful method for combining measures of evidence from different classifiers. However, since each of the available methods that estimates the evidence of classifiers has its own limitations, we propose here a new implementation which adapts to training data so that the overall mean square error is minimized. The proposed technique is shown to outperform most available classifier combination methods when tested on three different classification problems.


An AI-Based Approach to Destination Control in Elevators

AI Magazine

Not widely known by the AI community, elevator control has become a major field of application for AI technologies. Techniques such as neural networks, genetic algorithms, fuzzy rules and, recently, multiagent systems and AI planning have been adopted by leading elevator companies not only to improve the transportation capacity of conventional elevator systems but also to revolutionize the way in which elevators interact with and serve passengers. In this article, we begin with an overview of AI techniques adopted by this industry and explain the motivations behind the continuous interest in AI. We review and summarize publications that are not easily accessible from the common AI sources. In the second part, we present in more detail a recent development project to apply AI planning and multiagent systems to elevator control problems.


When do Numbers Really Matter?

Journal of Artificial Intelligence Research

Common wisdom has it that small distinctions in the probabilities (parameters) quantifying a belief network do not matter much for the results of probabilistic queries. Yet, one can develop realistic scenarios under which small variations in network parameters can lead to significant changes in computed queries. A pending theoretical question is then to analytically characterize parameter changes that do or do not matter. In this paper, we study the sensitivity of probabilistic queries to changes in network parameters and prove some tight bounds on the impact that such parameters can have on queries. Our analytic results pinpoint some interesting situations under which parameter changes do or do not matter. These results are important for knowledge engineers as they help them identify influential network parameters. They also help explain some of the previous experimental results and observations with regards to network robustness against parameter changes.


AAAI 2002 Fall Symposium Series Reports

AI Magazine

The Association for the Advancement of Artificial Intelligence held its 2001 Fall Symposium Series November 2-4, 2001 at the Sea Crest Conference Center in North Falmouth, Massachusetts. The topics of the five symposia in the 2001 Fall Symposia Series were (1) Anchoring Symbols to Sensor Data in Single and Multiple Robot Systems, (2) Emotional and Intelligent II: The Tangled Knot of Social Cognition, (3) Intent Inference for Collaborative Tasks, (4) Negotiation Methods for Autonomous Cooperative Systems, and (5) Using Uncertainty within Computation. This article contains brief reports of those five symposia.


Robust Feature Selection by Mutual Information Distributions

arXiv.org Artificial Intelligence

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.


Entropy estimation of symbol sequences

arXiv.org Machine Learning

We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our interest is in describing their convergence with sequence length, assuming no limits for the space and time complexities of the compression algorithms. A scaling law is proposed for extrapolation from finite sample lengths. This is applied to sequences of dynamical systems in non-trivial chaotic regimes, a 1-D cellular automaton, and to written English texts.


Reasoning with Cause and Effect

AI Magazine

This article is an edited transcript of a lecture given at IJCAI-99, Stockholm, Sweden, on 4 August 1999. The article summarizes concepts, principles, and tools that were found useful in applications involving causal modeling. The principles are based on structural-model semantics in which functional (or counterfactual) relationships representing autonomous physical processes are the fundamental building blocks. The article presents the conceptual basis of this semantics, illustrates its application in simple problems, and discusses its ramifications to computational and cognitive problems concerning causation.


Generalized Belief Propagation

Neural Information Processing Systems

Belief propagation (BP) was only supposed to work for treelike networks but works surprisingly well in many applications involving networks with loops, including turbo codes. However, there has been little understanding of the algorithm or the nature of the solutions it finds for general graphs. We show that BP can only converge to a stationary point of an approximate free energy, known as the Bethe free energy in statistical physics.This result characterizes BP fixed-points and makes connections with variational approaches to approximate inference. More importantly, our analysis lets us build on the progress made in statistical physics since Bethe's approximation was introduced in 1935. Kikuchi and others have shown how to construct more accurate freeenergy approximations, of which Bethe's approximation is the simplest.


Generalized Belief Propagation

Neural Information Processing Systems

For general networks with loops, the situation is much less clear. On the one hand, a number of researchers have empirically demonstrated good performance for BP algorithms applied to networks with loops. One dramatic case is the near Shannon-limit performance of "Turbo codes", whose decoding algorithm is equivalent to BP on a loopy network [2, 6]. For some problems in computer vision involving networks with loops, BP has also shown to be accurate and to converge very quickly [2, 1, 7]. On the other hand, for other networks with loops, BP may give poor results or fail to converge [7]. For a general graph, little has been understood about what approximation BP represents, and how it might be improved. This paper's goal is to provide that understanding and introduce a set of new algorithms resulting from that understanding. We show that BP is the first in a progression of local message-passing algorithms, each giving equivalent results to a corresponding approximation from statistical physics known as the "Kikuchi" approximation to the Gibbs free energy. These algorithms have the attractive property of being user-adjustable: by paying some additional computational cost, one can obtain considerable improvement in the accuracy of one's approximation, and can sometimes obtain a convergent message-passing algorithm when ordinary BP does not converge.


Ensemble Learning and Linear Response Theory for ICA

Neural Information Processing Systems

We propose a general Bayesian framework for performing independent (leA) which relies on ensemble learning and linearcomponent analysis response theory known from statistical physics. We apply it to both discrete and continuous sources. For the continuous source the underdetermined (overcomplete) case is studied. The naive mean-field approach fails in this case whereas linear response theory-which gives an improved estimate of covariances-is very efficient. The examples given are for sources without temporal correlations. However, this derivation can easily to treat temporal correlations. Finally, the frameworkbe extended of generating new leA algorithms without needingoffers a simple way to define the prior distribution of the sources explicitly.