AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty
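
The lab-test example in the quotation can be made concrete with Bayes' rule. The following is a minimal sketch, not something taken from Leake's text: the function name and all three numbers (prevalence, sensitivity, false-positive rate) are hypothetical values chosen for illustration.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule.

    prior               -- P(disease) before the test (assumed value)
    sensitivity         -- P(positive | disease) (assumed value)
    false_positive_rate -- P(positive | no disease) (assumed value)
    """
    true_positive_mass = sensitivity * prior
    false_positive_mass = false_positive_rate * (1.0 - prior)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Hypothetical numbers: a rare condition and a fairly accurate test.
p = posterior_probability(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")
```

With these assumed inputs the posterior stays well below one half, which illustrates why a positive result alone does not settle the diagnosis when false positives are possible.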

Fast and optimal nonparametric sequential design for astronomical observations

Yang, Justin J., Wang, Xufei, Protopapas, Pavlos, Bornn, Luke

arXiv.org Machine Learning, Jan-11-2015

The spectral energy distribution (SED) is a relatively easy way for astronomers to distinguish between different astronomical objects such as galaxies, black holes, and stellar objects. By comparing the observations from a source at different frequencies with template models, astronomers are able to infer the type of this observed object. In this paper, we take a Bayesian model averaging perspective to learn astronomical objects, employing a Bayesian nonparametric approach to accommodate the deviation from convex combinations of known log-SEDs. To effectively use telescope time for observations, we then study Bayesian nonparametric sequential experimental design without conjugacy, in which we use sequential Monte Carlo as an efficient tool to maximize the volume of information stored in the posterior distribution of the parameters of interest. A new technique for performing inferences in log-Gaussian Cox processes called the Poisson log-normal approximation is also proposed. Simulations show the speed, accuracy, and usefulness of our method. While the strategy we propose in this paper is brand new in the astronomy literature, the inferential techniques developed apply to more general nonparametric sequential experimental design problems.

artificial intelligence, machine learning, template, (17 more...)

1501.02467

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Entropic one-class classifiers

Livi, Lorenzo, Sadeghian, Alireza, Pedrycz, Witold

arXiv.org Machine Learning, Jan-11-2015

The one-class classification problem is a well-known research endeavor in pattern recognition. The problem is also known under different names, such as outlier and novelty/anomaly detection. The core of the problem consists in modeling and recognizing patterns belonging only to a so-called target class. All other patterns are termed non-target, and therefore they should be recognized as such. In this paper, we propose a novel one-class classification system that is based on an interplay of different techniques. Primarily, we follow a dissimilarity-representation-based approach; we embed the input data into the dissimilarity space by means of an appropriate parametric dissimilarity measure. This step allows us to process virtually any type of data. The dissimilarity vectors are then represented through weighted Euclidean graphs, which we use to (i) determine the entropy of the data distribution in the dissimilarity space, and at the same time (ii) derive effective decision regions that are modeled as clusters of vertices. Since the dissimilarity measure for the input data is parametric, we optimize its parameters by means of a global optimization scheme, which considers both mesoscopic and structural characteristics of the data represented through the graphs. The proposed one-class classifier is designed to provide both hard (Boolean) and soft decisions about the recognition of test patterns, allowing an accurate description of the classification process. We evaluate the performance of the system on different benchmarking datasets, containing either feature-based or structured patterns. Experimental results demonstrate the effectiveness of the proposed technique.

data mining, machine learning, pattern recognition, (21 more...)

doi: 10.1109/TNNLS.2015.2418332

1407.7556

Country: North America > Canada > Alberta (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
(2 more...)

Learning the Conditional Independence Structure of Stationary Time Series: A Multitask Learning Approach

Jung, Alexander

arXiv.org Machine Learning, Jan-11-2015

We consider a stationary discrete-time vector process or time series. Such a process could model, e.g., the time evolution of air pollutant concentrations [1], [2] or medical diagnostic data obtained in electrocorticography (ECoG) [3]. One specific way of representing the dependence structure of a vector process is via a graphical model [4], where the nodes of the graph represent the individual scalar process components, and the edges represent statistical relations between the individual process components. More precisely, the (undirected) edges of a conditional independence graph (CIG) associated with a process represent conditional independence statements about the process components [4], [1]. In particular, two nodes in the CIG are connected by an edge if and only if the two corresponding process components are conditionally dependent, given the remaining process components. Note that the so-defined CIG for time series extends the basic notion of a CIG for random vectors by considering dependencies between entire time series instead of dependencies between scalar random variables [5], [6]. In this work, we investigate the problem of graphical model selection (GMS), i.e., that of inferring the CIG of a time series, given a finite-length observation. A. Jung is with the Institute of Telecommunications, Vienna University of Technology, 1040 Vienna, Austria; email: ajung@nt.tuwien.ac.at.
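
The edge rule for a CIG can be illustrated in the simpler random-vector setting that the abstract mentions as the basic notion. The sketch below assumes a Gaussian random vector with a known precision (inverse covariance) matrix, in which case two components are conditionally independent given the rest exactly when the corresponding off-diagonal precision entry is zero. It is not the time-series method of the paper; the function name, the tolerance, and the example matrix are hypothetical.

```python
def cig_edges(precision, tol=1e-9):
    """List the undirected CIG edges (i, j), i < j, of a Gaussian random vector.

    precision -- square precision matrix as a list of lists (assumed known)
    tol       -- entries with absolute value <= tol are treated as zero
    """
    n = len(precision)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if abs(precision[i][j]) > tol
    ]


# Hypothetical 3-component example: a chain 0 - 1 - 2.
# Components 0 and 2 are conditionally independent given component 1.
K = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
print(cig_edges(K))
```

In practice the precision matrix is unknown and must be estimated from finite data, which is where the model selection problem described in the abstract arises.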

algorithm 1, artificial intelligence, machine learning, (17 more...)

doi: 10.1109/TSP.2015.2460219

1404.1361

Country:

Europe > Austria > Vienna (0.74)
North America > United States (0.67)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Coherent Predictive Inference under Exchangeability with Imprecise Probabilities

De Cooman, Gert, De Bock, Jasper, Diniz, Márcio Alves

Journal of Artificial Intelligence Research, Jan-10-2015

Coherent reasoning under uncertainty can be represented in a very general manner by coherent sets of desirable gambles. In a context that does not allow for indecision, this leads to an approach that is mathematically equivalent to working with coherent conditional probabilities. If we do allow for indecision, this leads to a more general foundation for coherent (imprecise-)probabilistic inference. In this framework, and for a given finite category set, coherent predictive inference under exchangeability can be represented using Bernstein coherent cones of multivariate polynomials on the simplex generated by this category set. This is a powerful generalisation of de Finetti's Representation Theorem allowing for both imprecision and indecision. We define an inference system as a map that associates a Bernstein coherent cone of polynomials with every finite category set. Many inference principles encountered in the literature can then be interpreted, and represented mathematically, as restrictions on such maps. We discuss, as particular examples, two important inference principles: representation insensitivity (a strengthened version of Walley's representation invariance) and specificity. We show that there is an infinity of inference systems that satisfy these two principles, amongst which we discuss in particular the skeptically cautious inference system, the inference systems corresponding to (a modified version of) Walley and Bernard's Imprecise Dirichlet Multinomial Models (IDMM), the skeptical IDMM inference systems, and the Haldane inference system. We also prove that the latter produces the same posterior inferences as would be obtained using Haldane's improper prior, implying that there is an infinity of proper priors that produce the same coherent posterior inferences as Haldane's improper one. Finally, we impose an additional inference principle that allows us to characterise uniquely the immediate predictions for the IDMM inference systems.
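
To give a feel for what an imprecise immediate prediction looks like, the sketch below computes lower and upper predictive probabilities under Walley's standard Imprecise Dirichlet Model, where after observing counts n_j out of N the next observation falls in category j with lower probability n_j / (N + s) and upper probability (n_j + s) / (N + s). This is the textbook form only; the modified IDMM inference systems discussed in the abstract may differ. The function name, the example counts, and the choice s = 2 are assumptions made for illustration.

```python
def idm_predictive_bounds(counts, s=2.0):
    """Lower and upper probabilities that the next observation is each category.

    counts -- dict mapping category name to observed count
    s      -- IDM hyperparameter controlling imprecision (illustrative default)
    """
    total = sum(counts.values())
    return {
        category: (n / (total + s), (n + s) / (total + s))
        for category, n in counts.items()
    }


# Hypothetical observations over two categories.
bounds = idm_predictive_bounds({"a": 3, "b": 1}, s=2.0)
for category, (lower, upper) in bounds.items():
    print(f"{category}: [{lower:.3f}, {upper:.3f}]")
```

The gap between the bounds is s / (N + s) for every category, so the interval narrows as more observations accumulate.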

category, equation, inference system, (12 more...)

doi: 10.1613/jair.4490

AI Access Foundation

10925

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

The SP theory of intelligence: an overview

Wolff, J. Gerard

arXiv.org Artificial Intelligence, Jan-7-2015

This article is an overview of the "SP theory of intelligence". The theory aims to simplify and integrate concepts across artificial intelligence, mainstream computing and human perception and cognition, with information compression as a unifying theme. It is conceived as a brain-like system that receives 'New' information and stores some or all of it in compressed form as 'Old' information. It is realised in the form of a computer model -- a first version of the SP machine. The concept of "multiple alignment" is a powerful central idea. Using heuristic techniques, the system builds multiple alignments that are 'good' in terms of information compression. For each multiple alignment, probabilities may be calculated. These provide the basis for calculating the probabilities of inferences. The system learns new structures from partial matches between patterns. Using heuristic techniques, the system searches for sets of structures that are 'good' in terms of information compression. These are normally ones that people judge to be 'natural', in accordance with the 'DONSVIC' principle -- the discovery of natural structures via information compression. The SP theory may be applied in several areas including 'computing', aspects of mathematics and logic, representation of knowledge, natural language processing, pattern recognition, several kinds of reasoning, information storage and retrieval, planning and problem solving, information compression, neuroscience, and human perception and cognition. Examples include the parsing and production of language including discontinuous dependencies in syntax, pattern recognition at multiple levels of abstraction and its integration with part-whole relations, nonmonotonic reasoning and reasoning with default values, reasoning in Bayesian networks including 'explaining away', causal diagnosis, and the solving of a geometric analogy problem.

machine learning, multiple alignment, pattern recognition, (21 more...)

doi: 10.3390/info4030283

1306.3888

Country:

North America > United States (0.46)
Europe (0.27)

Genre: Research Report (0.81)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(4 more...)

Inverse Renormalization Group Transformation in Bayesian Image Segmentations

Tanaka, Kazuyuki, Kataoka, Shun, Yasuda, Muneki, Ohzeki, Masayuki

arXiv.org Machine Learning, Jan-5-2015

Graduate School of Informatics, Kyoto University, 36-1 Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501 Japan. A new Bayesian image segmentation algorithm is proposed by combining loopy belief propagation with an inverse real-space renormalization group transformation to reduce the computational time. In our experiments, we observe that the proposed method can reduce the computational time to less than one-tenth of that taken by conventional Bayesian approaches. Bayesian segmentation modeling based on Markov random fields (MRFs) is one of the interesting research topics. We consider an image as defined on a set of pixels arranged on a square grid graph $(V, E)$. Here $V = \{ i \mid i = 1, 2, \dots, |V| \}$ denotes the set of all the pixels and $E$ is the set of all the nearest-neighbour pairs of pixels $\{i, j\}$. The total numbers of elements in the sets $V$ and $E$ are denoted by $|V|$ and $|E|$, respectively.

artificial intelligence, bayesian inference, machine learning, (14 more...)

doi: 10.7566/JPSJ.84.045001

1501.00834

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.45)
Asia > Japan > Honshū > Tōhoku (0.15)

Genre: Research Report (0.40)

Industry: Education (0.36)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Declarative Statistical Modeling with Datalog

Barany, Vince, Cate, Balder ten, Kimelfeld, Benny, Olteanu, Dan, Vagena, Zografoula

arXiv.org Artificial Intelligence, Jan-5-2015

Formalisms for specifying statistical models, such as probabilistic-programming languages, typically consist of two components: a specification of a stochastic process (the prior), and a specification of observations that restrict the probability space to a conditional subspace (the posterior). Use cases of such formalisms include the development of algorithms in machine learning and artificial intelligence. We propose and investigate a declarative framework for specifying statistical models on top of a database, through an appropriate extension of Datalog. By virtue of extending Datalog, our framework offers a natural integration with the database, and has a robust declarative semantics. Our Datalog extension provides convenient mechanisms to include numerical probability functions; in particular, conclusions of rules may contain values drawn from such functions. The semantics of a program is a probability distribution over the possible outcomes of the input database with respect to the program; these outcomes are minimal solutions with respect to a related program with existentially quantified variables in conclusions. Observations are naturally incorporated by means of integrity constraints over the extensional and intensional relations. We focus on programs that use discrete numerical distributions, but even then the space of possible outcomes may be uncountable (as a solution can be infinite). We define a probability measure over possible outcomes by applying the known concept of cylinder sets to a probabilistic chase procedure. We show that the resulting semantics is robust under different chases. We also identify conditions guaranteeing that all possible outcomes are finite (and then the probability space is discrete). We argue that the framework we propose retains the purely declarative nature of Datalog, and allows for natural specifications of statistical models.

machine learning, natural language, possible outcome, (19 more...)

1412.2221

Country:

Europe (0.67)
Asia > Middle East (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Concave Penalized Estimation of Sparse Gaussian Bayesian Networks

Aragam, Bryon, Zhou, Qing

arXiv.org Machine Learning, Jan-4-2015

We develop a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data. In contrast to recent methods which accelerate the learning problem by restricting the search space, our main contribution is a fast algorithm for score-based structure learning which does not restrict the search space in any way and works on high-dimensional datasets with thousands of variables. Our use of concave regularization, as opposed to the more popular $\ell_0$ (e.g. BIC) penalty, is new. Moreover, we provide theoretical guarantees which generalize existing asymptotic results when the underlying distribution is Gaussian. Most notably, our framework does not require the existence of a so-called faithful DAG representation, and as a result the theory must handle the inherent nonidentifiability of the estimation problem in a novel way. Finally, as a matter of independent interest, we provide a comprehensive comparison of our approach to several standard structure learning methods using open-source packages developed for the R language. Based on these experiments, we show that our algorithm is significantly faster than other competing methods while obtaining higher sensitivity with comparable false discovery rates for high-dimensional data. In particular, the total runtime for our method to generate a solution path of 20 estimates for DAGs with 8000 nodes is around one hour.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1401.0852

Country:

Europe (0.67)
North America > United States > California > Los Angeles County > Los Angeles (0.27)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Passing Expectation Propagation Messages with Kernel Methods

Jitkrittum, Wittawat, Gretton, Arthur, Heess, Nicolas

arXiv.org Machine Learning, Jan-2-2015

We propose to learn a kernel-based message operator which takes as input all expectation propagation (EP) incoming messages to a factor node and produces an outgoing message. In ordinary EP, computing an outgoing message involves estimating a multivariate integral which may not have an analytic expression. Learning such an operator allows one to bypass the expensive computation of the integral during inference by directly mapping all incoming messages into an outgoing message. The operator can be learned from training data (examples of input and output messages) which allows automated inference to be made on any kind of factor that can be sampled.

artificial intelligence, incoming message, machine learning, (15 more...)

1501.00375

Country: North America > United States > New York (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Shape and Illumination from Shading using the Generic Viewpoint Assumption

Zoran, Daniel, Krishnan, Dilip, Bento, José, Freeman, Bill

Neural Information Processing Systems, Dec-31-2014

The Generic Viewpoint Assumption (GVA) states that the position of the viewer or the light in a scene is not special. Thus, any estimated parameters from an observation should be stable under small perturbations such as object, viewpoint or light positions. The GVA has been analyzed and quantified in previous works, but has not been put to practical use in actual vision tasks. In this paper, we show how to utilize the GVA to estimate shape and illumination from a single shading image, without the use of other priors. We propose a novel linearized Spherical Harmonics (SH) shading model which enables us to obtain a computationally efficient form of the GVA term. Together with a data term, we build a model whose unknowns are shape and SH illumination. The model parameters are estimated using the Alternating Direction Method of Multipliers embedded in a multi-scale estimation framework. In this prior-free framework, we obtain competitive shape and illumination estimation results under a variety of models and lighting conditions, requiring fewer assumptions than competing methods.

bayesian inference, illumination, upstream oil & gas, (17 more...)

Genre: Research Report (0.46)

Industry: Energy > Oil & Gas > Upstream (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
