Uncertainty
Bayesian Prediction for Artificial Intelligence
Self, Matthew, Cheeseman, Peter
This paper shows that the common method used for making predictions under uncertainty in A1 and science is in error. This method is to use currently available data to select the best model from a given class of models-this process is called abduction-and then to use this model to make predictions about future data. The correct method requires averaging over all the models to make a prediction-we call this method transduction. Using transduction, an AI system will not give misleading results when basing predictions on small amounts of data, when no model is clearly best. For common classes of models we show that the optimal solution can be given in closed form.
Do We Need Higher-Order Probabilities and, If So, What Do They Mean?
The apparent failure of individual probabilistic expressions to distinguish uncertainty about truths from uncertainty about probabilistic assessments have prompted researchers to seek formalisms where the two types of uncertainties are given notational distinction. This paper demonstrates that the desired distinction is already a built-in feature of classical probabilistic models, thus, specialized notations are unnecessary.
Belief in Belief Functions: An Examination of Shafer's Canonical Examples
EXAMINATION OF SHAFER'S CANONICAL EXAMPLES Kathryn Blackmond Laskey Decision Science Consortium, Inc. 7700 Leesburg Pike, Suite 421 Falls Church, VA 22043 1 Abstract In the canonical examples underlying Shafer-Dempster theory, beliefs over the hypotheses of interest are derived from a probability model for a set of auxiliary hypotheses. Beliefs are derived via a compatibility relation connecting the auxiliary hypotheses to subsets of the primary hypotheses. A belief function differs from a Bayesian probability model in that one does not condition on those parts of the evidence for which no probabilities are specified. The significance of this difference in conditioning assumptions is illustrated with two examples giving rise to identical belief functions but different Bayesian probability distributions. Introduction The artificial intelligence community is in the midst of a lively debate over the representation and manipulation of uncertainty.
Higher Order Probabilities
A number of writers have supposed that for the full specification of belief, higher order probabilities are required. Some have even supposed that there may be an unending sequence of higher order probabilities of probabilities of probabilities.... In the present paper we show that higher order probabilities can always be replaced by the marginal distributions of joint probability distributions. We consider both the case in which higher order probabilities are of the same sort as lower order probabilities and that in which higher order probabilities are distinct in character, as when lower order probabilities are construed as frequencies and higher order probabilities are construed as subjective degrees of belief. In neither case do higher order probabilities appear to offer any advantages, either conceptually or computationally.
Dempster-Shafer vs. Probabilistic Logic
For example, Nilsson in [6] and Grosof in [3] have considered methods for reasoning with sets of probability assignments generated by probabilistic equality and inequality constraints1. Following Nilsson, I use the expression "Probabilistic Logic" to denote the collection of such methods. The aim of these methods is to compute a set of possible probabilities for a given statement from the specified set of probability assignments. If the set of probability assignments is generated by probabilistic equality and inequality constraints, the possible probabilities for a given statement form an interval. Since Dempster-Shafer also associates an interval with each statement A, namely the interval bounded by Bel(A) and Pls(A), the question arises as to the connection between Dempster-Shafer belief functions and sets of probability assignments defined by equality and inequality constraints. Grosof [3] has shown that the latter is a generalization of the former: every Dempster-Shafer belief function is representable by a set of probability assignments arising from equality and inequality constraints, but not vice-versa. A related issue concerns the connection between Dempster's rule of combination and the combination of evidence statements in probabilistic logic. Grosof [2] states some results concerning conditions under which these two methods of combining evidence yield the same result. The aim of this paper is to generalize Grosof's results and to investigate how divergent the two
Is Shafer General Bayes?
This paper examines the relationship between Shafer's belief functions and convex sets of probability distributions. Kyburg's (1986) result showed that belief function models form a subset of the class of closed convex probability distributions. This paper emphasizes the importance of Kyburg's result by looking at simple examples involving Bernoulli trials. Furthermore, it is shown that many convex sets of probability distributions generate the same belief function in the sense that they support the same lower and upper values. This has implications for a decision theoretic extension. Dempster's rule of combination is also compared with Bayes' rule of conditioning.
Budget-Optimal Task Allocation for Reliable Crowdsourcing Systems
Karger, David R., Oh, Sewoong, Shah, Devavrat
Crowdsourcing systems, in which numerous tasks are electronically distributed to numerous "information piece-workers", have emerged as an effective paradigm for human-powered solving of large scale problems in domains such as image classification, data entry, optical character recognition, recommendation, and proofreading. Because these low-paid workers can be unreliable, nearly all such systems must devise schemes to increase confidence in their answers, typically by assigning each task multiple times and combining the answers in an appropriate manner, e.g. majority voting. In this paper, we consider a general model of such crowdsourcing tasks and pose the problem of minimizing the total price (i.e., number of task assignments) that must be paid to achieve a target overall reliability. We give a new algorithm for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers. We show that our algorithm, inspired by belief propagation and low-rank matrix approximation, significantly outperforms majority voting and, in fact, is optimal through comparison to an oracle that knows the reliability of every worker. Further, we compare our approach with a more general class of algorithms which can dynamically assign tasks. By adaptively deciding which questions to ask to the next arriving worker, one might hope to reduce uncertainty more efficiently. We show that, perhaps surprisingly, the minimum price necessary to achieve a target reliability scales in the same manner under both adaptive and non-adaptive scenarios. Hence, our non-adaptive approach is order-optimal under both scenarios. This strongly relies on the fact that workers are fleeting and can not be exploited. Therefore, architecturally, our results suggest that building a reliable worker-reputation system is essential to fully harnessing the potential of adaptive designs.
Dialectics of Knowledge Representation in a Granular Rough Set Theory
The concepts of rough and definite objects are relatively more determinate than those of granules and granulation in general rough set theory (RST) [1]. Representation of rough objects can however depend on the dialectical relation between granulation and definiteness. In this research, we make this exact in the context of RST over proto-transitive approximation spaces. This approach can be directly extended to many other types of RST. These are used for formulating an extended concept of knowledge interpretation (KI)(relative the situation for classical RST) and the problem of knowledge representation (KR) is solved. These will be of direct interest in granular KR in RST as developed by the present author [2] and of rough objects in general. In [3], these have already been used for five different semantics by the present author. This is an extended version of [4] with key examples and more results.
Bipolar Fuzzy Soft sets and its applications in decision making problem
Aslam, Muhammad, Abdullah, Saleem, ullah, Kifayat
In this article, we combine the concept of a bipolar fuzzy set and a soft set. We introduce the notion of bipolar fuzzy soft set and study fundamental properties. We study basic operations on bipolar fuzzy soft set. We define exdended union, intersection of two bipolar fuzzy soft set. We also give an application of bipolar fuzzy soft set into decision making problem. We give a general algorithm to solve decision making problems by using bipolar fuzzy soft set.
Network Detection Theory and Performance
Smith, Steven T., Senne, Kenneth D., Philips, Scott, Kao, Edward K., Bernstein, Garrett
Network detection is an important capability in many areas of applied research in which data can be represented as a graph of entities and relationships. Oftentimes the object of interest is a relatively small subgraph in an enormous, potentially uninteresting background. This aspect characterizes network detection as a "big data" problem. Graph partitioning and network discovery have been major research areas over the last ten years, driven by interest in internet search, cyber security, social networks, and criminal or terrorist activities. The specific problem of network discovery is addressed as a special case of graph partitioning in which membership in a small subgraph of interest must be determined. Algebraic graph theory is used as the basis to analyze and compare different network detection methods. A new Bayesian network detection framework is introduced that partitions the graph based on prior information and direct observations. The new approach, called space-time threat propagation, is proved to maximize the probability of detection and is therefore optimum in the Neyman-Pearson sense. This optimality criterion is compared to spectral community detection approaches which divide the global graph into subsets or communities with optimal connectivity properties. We also explore a new generative stochastic model for covert networks and analyze using receiver operating characteristics the detection performance of both classes of optimal detection techniques.