Uncertainty
I Don't Want to Think About it Now: Decision Theory with Costly Computation
Halpern, Joseph Y. (Cornell University)
Computation plays a major role in decision making.ย Even if an agent is willing to ascribe a probability to all states and a utility to all outcomes, and maximize expected utility, doing so might present serious computational problems.ย Moreover, computing the outcome of a given act might be difficult.ย In a companion paper we develop a framework for game theoryย with costly computation, where the objects of choice are Turing machines.ย Here we apply that framework to decision theory.ย We show how well-known phenomena like first-impression-matters biases (i.e., people tend to put more weight on evidence they hear early on), belief polarization (two people with different prior beliefs, hearing the same evidence, can end up with diametrically opposed conclusions), and the status quo bias (people are much more likely to stick with what they already have) can be easily captured in that framework.ย Finally, we use the frameworkย to define so me new notions: value of computational information (a computational variant of value of information ) and computational value of conversation .
New Advances in Sequential Diagnosis
Siddiqi, Sajjad Ahmed (National University of Sciences and Technologies) | Huang, Jinbo (NICTA and Australian National University)
Sequential diagnosis takes measurements of an abnormal system to identify faulty components, where the goal is to reduce the diagnostic cost , defined here as the number of measurements. To propose measurement points, previous work employs a heuristic based on reducing the entropy over a set of diagnoses , which can be impractical when the set of diagnoses is too large. Focusing on a smaller set of probable diagnoses scales the approach but generally leads to increased diagnostic cost. We propose a new diagnostic framework employing three new techniques โ a more efficient heuristic for measurement point selection, abstraction-based sequential diagnosis, and component cloning โ which scales to large systems with good performance in terms of diagnostic cost.
A two-step fusion process for multi-criteria decision applied to natural hazards in mountains
Tacnet, Jean-Marc, Batton-Hubert, Mireille, Dezert, Jean
Mountain river torrents and snow avalanches generate human and material damages with dramatic consequences. Knowledge about natural phenomenona is often lacking and expertise is required for decision and risk management purposes using multi-disciplinary quantitative or qualitative approaches. Expertise is considered as a decision process based on imperfect information coming from more or less reliable and conflicting sources. A methodology mixing the Analytic Hierarchy Process (AHP), a multi-criteria aid-decision method, and information fusion using Belief Function Theory is described. Fuzzy Sets and Possibilities theories allow to transform quantitative and qualitative criteria into a common frame of discernment for decision in Dempster-Shafer Theory (DST ) and Dezert-Smarandache Theory (DSmT) contexts. Main issues consist in basic belief assignments elicitation, conflict identification and management, fusion rule choices, results validation but also in specific needs to make a difference between importance and reliability and uncertainty in the fusion process.
Active Learning for Hidden Attributes in Networks
Yan, Xiaoran, Zhu, Yaojia, Rouquier, Jean-Baptiste, Moore, Cristopher
In many networks, vertices have hidden attributes, or types, that are correlated with the networks topology. If the topology is known but these attributes are not, and if learning the attributes is costly, we need a method for choosing which vertex to query in order to learn as much as possible about the attributes of the other vertices. We assume the network is generated by a stochastic block model, but we make no assumptions about its assortativity or disassortativity. We choose which vertex to query using two methods: 1) maximizing the mutual information between its attributes and those of the others (a well-known approach in active learning) and 2) maximizing the average agreement between two independent samples of the conditional Gibbs distribution. Experimental results show that both these methods do much better than simple heuristics. They also consistently identify certain vertices as important by querying them early on.
The Production of Probabilistic Entropy in Structure/Action Contingency Relations
Luhmann (1984) defined society as a communication system which is structurally coupled to, but not an aggregate of, human action systems. The communication system is then considered as self-organizing ("autopoietic"), as are human actors. Communication systems can be studied by using Shannon's (1948) mathematical theory of communication. The update of a network by action at one of the local nodes is then a well-known problem in artificial intelligence (Pearl 1988). By combining these various theories, a general algorithm for probabilistic structure/action contingency can be derived. The consequences of this contingency for each system, its consequences for their further histories, and the stabilization on each side by counterbalancing mechanisms are discussed, in both mathematical and theoretical terms. An empirical example is elaborated.
Feature Selection with Conjunctions of Decision Stumps and Learning from Microarray Data
Shah, Mohak, Marchand, Mario, Corbeil, Jacques
One of the objectives of designing feature selection learning algorithms is to obtain classifiers that depend on a small number of attributes and have verifiable future performance guarantees. There are few, if any, approaches that successfully address the two goals simultaneously. Performance guarantees become crucial for tasks such as microarray data analysis due to very small sample sizes resulting in limited empirical evaluation. To the best of our knowledge, such algorithms that give theoretical bounds on the future performance have not been proposed so far in the context of the classification of gene expression data. In this work, we investigate the premise of learning a conjunction (or disjunction) of decision stumps in Occam's Razor, Sample Compression, and PAC-Bayes learning settings for identifying a small subset of attributes that can be used to perform reliable classification tasks. We apply the proposed approaches for gene identification from DNA microarray data and compare our results to those of well known successful approaches proposed for the task. We show that our algorithm not only finds hypotheses with much smaller number of genes while giving competitive classification accuracy but also have tight risk guarantees on future performance unlike other approaches. The proposed approaches are general and extensible in terms of both designing novel algorithms and application to other domains.
Planning for Concurrent Action Executions Under Action Duration Uncertainty Using Dynamically Generated Bayesian Networks
Beaudry, Eric (Universite de Sherbrooke) | Kabanza, Froduald (Universite de Sherbrooke) | Michaud, Francois (Universite de Sherbrooke)
An interesting class of planning domains, including planning for daily activities of Mars rovers, involves achievement of goals with time constraints and concurrent actions with probabilistic durations. Current probabilistic approaches, which rely on a discrete time model, introduce a blow up in the search state-space when the two factors of action concurrency and action duration uncertainty are combined. Simulation-based and sampling probabilistic planning approaches would cope with this state explosion by avoiding storing all the explored states in memory, but they remain approximate solution approaches. In this paper, we present an alternative approach relying on a continuous time model which avoids the state explosion caused by time stamping in the presence of action concurrency and action duration uncertainty. Time is represented as a continuous random variable. The dependency between state time variables is conveyed by a Bayesian network, which is dynamically generated by a state-based forward-chaining search based on the action descriptions. A generated plan is characterized by a probability of satisfying a goal. The evaluation of this probability is done by making a query the Bayesian network.
Quantum learning: optimal classification of qubit states
Guta, Madalin, Kotlowski, Wojciech
Pattern recognition is a central topic in Learning Theory with numerous applications such as voice and text recognition, image analysis, computer diagnosis. The statistical set-up in classification is the following: we are given an i.i.d. training set $(X_{1},Y_{1}),... (X_{n},Y_{n})$ where $X_{i}$ represents a feature and $Y_{i}\in \{0,1\}$ is a label attached to that feature. The underlying joint distribution of $(X,Y)$ is unknown, but we can learn about it from the training set and we aim at devising low error classifiers $f:X\to Y$ used to predict the label of new incoming features. Here we solve a quantum analogue of this problem, namely the classification of two arbitrary unknown qubit states. Given a number of `training' copies from each of the states, we would like to `learn' about them by performing a measurement on the training set. The outcome is then used to design mesurements for the classification of future systems with unknown labels. We find the asymptotically optimal classification strategy and show that typically, it performs strictly better than a plug-in strategy based on state estimation. The figure of merit is the excess risk which is the difference between the probability of error and the probability of error of the optimal measurement when the states are known, that is the Helstrom measurement. We show that the excess risk has rate $n^{-1}$ and compute the exact constant of the rate.
Spatio-Temporal Graphical Model Selection
Harrington, Patrick L. Jr., Hero, Alfred O. III
This paper treats the problem of learning the interaction structure of a spatiotemporal graphical model for a discrete state and discrete time stochastic process known as the susceptible, infected, recovered (SIR) model. The presence of spatial interactions cause adjacent nodes in the graph to affect each others states over time. Learning the topology of this graph is known as model selection. We cast this graphical model selection problem as a penalized likelihood problem, resulting in a convex program for which convex optimization solvers can be applied. SIR spatiotemporal graphical models are commonly used in modeling the random propagation of information between nodes in large networks in bioinformatics, signal processing, public health, and national security (4; 9; 21). Knowing the network link structure allows accurate prediction of individual node states and can aid the development of control and intervention strategies for such networks. This paper develops a tractable method to estimate the topology of the network for the SIR spatiotemporal graphical model from empirical data. Exact solutions of the graphical model selection problem is NP hard due to the combinatorial nature of enumeration through the discrete space of possible graph topologies.
Terrorism Event Classification Using Fuzzy Inference Systems
Inyaem, Uraiwan, Haruechaiyasak, Choochart, Meesad, Phayung, Tran, Dat
Terrorism has led to many problems in Thai societies, not only property damage but also civilian casualties. Predicting terrorism activities in advance can help prepare and manage risk from sabotage by these activities. This paper proposes a framework focusing on event classification in terrorism domain using fuzzy inference systems (FISs). Each FIS is a decision-making model combining fuzzy logic and approximate reasoning. It is generated in five main parts: the input interface, the fuzzification interface, knowledge base unit, decision making unit and output defuzzification interface. Adaptive neuro-fuzzy inference system (ANFIS) is a FIS model adapted by combining the fuzzy logic and neural network. The ANFIS utilizes automatic identification of fuzzy logic rules and adjustment of membership function (MF). Moreover, neural network can directly learn from data set to construct fuzzy logic rules and MF implemented in various applications. FIS settings are evaluated based on two comparisons. The first evaluation is the comparison between unstructured and structured events using the same FIS setting. The second comparison is the model settings between FIS and ANFIS for classifying structured events. The data set consists of news articles related to terrorism events in three southern provinces of Thailand. The experimental results show that the classification performance of the FIS resulting from structured events achieves satisfactory accuracy and is better than the unstructured events. In addition, the classification of structured events using ANFIS gives higher performance than the events using only FIS in the prediction of terrorism events.