AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Mathematics as information compression via the matching and unification of patterns

Wolff, J Gerard

arXiv.org Artificial IntelligenceAug-5-2018

This paper describes a novel perspective on the foundations of mathematics: how mathematics may be seen to be largely about 'information compression via the matching and unification of patterns' (ICMUP). ICMUP is itself a novel approach to information compression, couched in terms of non-mathematical primitives, as is necessary in any investigation of the foundations of mathematics. This new perspective on the foundations of mathematics has grown out of an extensive programme of research developing the "SP Theory of Intelligence" and its realisation in the "SP Computer Model", a system in which a generalised version of ICMUP -- the powerful concept of SP-multiple-alignment -- plays a central role. These ideas may be seen to be part of a "Big Picture" comprising six areas of interest, with information compression as a unifying theme. The paper describes the close relation between mathematics and information compression, and describes examples showing how variants of ICMUP may be seen in widely-used structures and operations in mathematics. Examples are also given to show how the mathematics-related disciplines of logic and computing may be understood as ICMUP. There are many potential benefits and applications of these ideas.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1808.07004

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Theoretical Aspects of Cyclic Structural Causal Models

Bongers, Stephan, Peters, Jonas, Schölkopf, Bernhard, Mooij, Joris M.

arXiv.org Artificial IntelligenceAug-4-2018

Structural causal models (SCMs), also known as (non-parametric) structural equation models (SEMs), are widely used for causal modeling purposes. A large body of theoretical results is available for the special case in which cycles are absent (i.e., acyclic SCMs, also known as recursive SEMs). However, in many application domains cycles are abundantly present, for example in the form of feedback loops. In this paper, we provide a general and rigorous theory of cyclic SCMs. The paper consists of two parts: the first part gives a rigorous treatment of structural causal models, dealing with measure-theoretic and other complications that arise in the presence of cycles. In contrast with the acyclic case, in cyclic SCMs solutions may no longer exist, or if they exist, they may no longer be unique, or even measurable in general. We give several sufficient and necessary conditions for the existence of (unique) measurable solutions. We show how causal reasoning proceeds in these models and how this differs from the acyclic case. Moreover, we give an overview of the Markov properties that hold for cyclic SCMs. In the second part, we address the question of how one can marginalize an SCM (possibly with cycles) to a subset of the endogenous variables. We show that under a certain condition, one can effectively remove a subset of the endogenous variables from the model, leading to a more parsimonious marginal SCM that preserves the causal and counterfactual semantics of the original SCM on the remaining variables. Moreover, we show how the marginalization relates to the latent projection and to latent confounders, i.e. latent common causes.

artificial intelligence, machine learning, scm, (16 more...)

arXiv.org Artificial Intelligence

1611.06221

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Add feedback

Multi-objective optimization to explicitly account for model complexity when learning Bayesian Networks

Cazzaniga, Paolo, Nobile, Marco S., Ramazzotti, Daniele

arXiv.org Machine LearningAug-3-2018

Bayesian Networks have been widely used in the last decades in many fields, to describe statistical dependencies among random variables. In general, learning the structure of such models is a problem with considerable theoretical interest that still poses many challenges. On the one hand, this is a well-known NP-complete problem, which is practically hardened by the huge search space of possible solutions. On the other hand, the phenomenon of I-equivalence, i.e., different graphical structures underpinning the same set of statistical dependencies, may lead to multimodal fitness landscapes further hindering maximum likelihood approaches to solve the task. Despite all these difficulties, greedy search methods based on a likelihood score coupled with a regularization term to account for model complexity, have been shown to be surprisingly effective in practice. In this paper, we consider the formulation of the task of learning the structure of Bayesian Networks as an optimization problem based on a likelihood score. Nevertheless, our approach do not adjust this score by means of any of the complexity terms proposed in the literature; instead, it accounts directly for the complexity of the discovered solutions by exploiting a multi-objective optimization procedure. To this extent, we adopt NSGA-II and define the first objective function to be the likelihood of a solution and the second to be the number of selected arcs. We thoroughly analyze the behavior of our method on a wide set of simulated data, and we discuss the performance considering the goodness of the inferred solutions both in terms of their objective functions and with respect to the retrieved structure. Our results show that NSGA-II can converge to solutions characterized by better likelihood and less arcs than classic approaches, although paradoxically frequently characterized by a lower similarity to the target network.

artificial intelligence, machine learning, nsga-ii, (17 more...)

arXiv.org Machine Learning

1808.01345

Country:

Europe > Italy > Lombardy > Milan (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Information-Theoretic Scoring Rules to Learn Additive Bayesian Network Applied to Epidemiology

Kratzer, Gilles, Furrer, Reinhard

arXiv.org Machine LearningAug-3-2018

Bayesian network modelling is a well adapted approach to study messy and highly correlated datasets which are very common in, e.g., systems epidemiology. A popular approach to learn a Bayesian network from an observational datasets is to identify the maximum a posteriori network in a search-and-score approach. Many scores have been proposed both Bayesian or frequentist based. In an applied perspective, a suitable approach would allow multiple distributions for the data and is robust enough to run autonomously. A promising framework to compute scores are generalized linear models. Indeed, there exists fast algorithms for estimation and many tailored solutions to common epidemiological issues. The purpose of this paper is to present an R package abn that has an implementation of multiple frequentist scores and some realistic simulations that show its usability and performance. It includes features to deal efficiently with data separation and adjustment which are very common in systems epidemiology.

artificial intelligence, implementation, machine learning, (17 more...)

arXiv.org Machine Learning

1808.01126

Country:

Europe > Switzerland > Zürich > Zürich (0.15)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Epidemiology (1.00)
Health & Medicine > Therapeutic Area (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Histogram Transform-based Speaker Identification

Ma, Zhanyu, Yu, Hong

arXiv.org Machine LearningAug-2-2018

A novel text-independent speaker identification (SI) method is proposed. This method uses the Mel-frequency Cepstral coefficients (MFCCs) and the dynamic information among adjacent frames as feature sets to capture speaker's characteristics. In order to utilize dynamic information, we design super-MFCCs features by cascading three neighboring MFCCs frames together. The probability density function (PDF) of these super-MFCCs features is estimated by the recently proposed histogram transform~(HT) method, which generates more training data by random transforms to realize the histogram PDF estimation and recedes the commonly occurred discontinuity problem in multivariate histograms computing. Compared to the conventional PDF estimation methods, such as Gaussian mixture models, the HT model shows promising improvement in the SI performance.

artificial intelligence, machine learning, mixture model, (18 more...)

arXiv.org Machine Learning

1808.00959

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.88)

Add feedback

Fast yet Simple Natural-Gradient Descent for Variational Inference in Complex Models

Khan, Mohammad Emtiyaz, Nielsen, Didrik

arXiv.org Machine LearningAug-2-2018

Bayesian inference plays an important role in advancing machine learning, but faces computational challenges when applied to complex models such as deep neural networks. Variational inference circumvents these challenges by formulating Bayesian inference as an optimization problem and solving it using gradient-based optimization. In this paper, we argue in favor of natural-gradient approaches which, unlike their gradient-based counterparts, can improve convergence by exploiting the information geometry of the solutions. We show how to derive fast yet simple natural-gradient updates by using a duality associated with exponential-family distributions. An attractive feature of these methods is that, by using natural-gradients, they are able to extract accurate local approximations for individual model components. We summarize recent results for Bayesian deep learning showing the superiority of natural-gradient approaches over their gradient counterparts.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1807.04489

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Mobile big data analysis with machine learning

Xie, Jiyang, Song, Zeyu, Li, Yupeng, Ma, Zhanyu

arXiv.org Machine LearningAug-2-2018

Wi-Fi) and the second/third/fourth generation (2/3/4G) mobile network, the number of mobile phones, which is 7.74 billion, 103.5 per 100 inhabitants all over the world in 2017, is rising dramatically [1]. Nowadays, mobile phone can not only send voice and text messages, but also easily and conveniently access the Internet which has been recognized as the most revolutionary development of Mobile Internet (M-Internet). Meanwhile, worldwide active mobile-broadband subscriptions in 2017 have increased to 4.22 billion, which is 9.21% higher than that in 2016 [1]. Figure 1 shows the numbers of mobile-cellular telephone and active mobile-broadband subscriptions of the world and main districts from 2010 to 2017. The numbers which are up to the bars are the mobile-cellular telephone or active mobile-broadband subscriptions (million) in the world of the year which increase each year. Under the M-Internet, various kinds of content (image, voice, video, etc.) can be sent and received everywhere and the related applications emerge to satisfy people's requirements, including working, study, daily life, entertainment, education, healthcare, etc. In China, mobile applications giants, i.e., Baidu, Alibaba and Tencent, held 78% of M-Internet online time per day in App which was about 2,412 minutes in 2017 [2]. This figure indicates that M-Internet has entered a rapidly growth stage.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

1808.00803

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)
(3 more...)

Genre:

Overview (0.93)
Research Report > New Finding (0.46)
Research Report > Promising Solution (0.46)

Industry:

Telecommunications (1.00)
Health & Medicine (1.00)
Information Technology > Services (0.68)
Energy > Power Industry (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(7 more...)

Add feedback

Efficient Bayesian Inference of Sigmoidal Gaussian Cox Processes

Donner, Christian, Opper, Manfred

arXiv.org Machine LearningAug-2-2018

We present an approximate Bayesian inference approach for estimating the intensity of a inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model using a latent marked Poisson process and P\'olya--Gamma random variables we obtain a representation of the likelihood which is conjugate to the GP prior. We approximate the posterior using a free--form mean field approximation together with the framework of sparse GPs. Furthermore, as alternative approximation we suggest a sparse Laplace approximation of the posterior, for which an efficient expectation--maximisation algorithm is derived to find the posterior's mode. Results of both algorithms compare well with exact inference obtained by a Markov Chain Monte Carlo sampler and standard variational Gauss approach, while being one order of magnitude faster.

artificial intelligence, machine learning, poisson process, (15 more...)

arXiv.org Machine Learning

1808.00831

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Xie, Yuan, Liu, Boyi, Liu, Qiang, Wang, Zhaoran, Zhou, Yuan, Peng, Jian

arXiv.org Machine LearningAug-1-2018

When learning from a batch of logged bandit feedback, the discrepancy between the policy to be learned and the off-policy training data imposes statistical and computational challenges. Unlike classical supervised learning and online learning settings, in batch contextual bandit learning, one only has access to a collection of logged feedback from the actions taken by a historical policy, and expect to learn a policy that takes good actions in possibly unseen contexts. Such a batch learning setting is ubiquitous in online and interactive systems, such as ad platforms and recommendation systems. Existing approaches based on inverse propensity weights, such as Inverse Propensity Scoring (IPS) and Policy Optimizer for Exponential Models (POEM), enjoy unbiasedness but often suffer from large mean squared error. In this work, we introduce a new approach named Maximum Likelihood Inverse Propensity Scoring (MLIPS) for batch learning from logged bandit feedback. Instead of using the given historical policy as the proposal in inverse propensity weights, we estimate a maximum likelihood surrogate policy based on the logged action-context pairs, and then use this surrogate policy as the proposal. We prove that MLIPS is asymptotically unbiased, and moreover, has a smaller nonasymptotic mean squared error than IPS. Such an error reduction phenomenon is somewhat surprising as the estimated surrogate policy is less accurate than the given historical policy. Results on multi-label classification problems and a large- scale ad placement dataset demonstrate the empirical effectiveness of MLIPS. Furthermore, the proposed surrogate policy technique is complementary to existing error reduction techniques, and when combined, is able to consistently boost the performance of several widely used approaches.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

1808.00232

Country:

North America > United States > Indiana (0.04)
North America > United States > Illinois (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Approximate Inference-based Motion Planning by Learning and Exploiting Low-Dimensional Latent Variable Models

Ha, Jung-Su, Chae, Hyeok-Joo, Choi, Han-Lim

arXiv.org Artificial IntelligenceAug-1-2018

Personal use of this material is permitted. Abstract--This work presents an efficient framework to generate a motion plan of a robot with high degrees of freedom (e.g., a humanoid robot). High-dimensionality of the robot configuration space often leads to difficulties in utilizing the widely-used motion planning algorithms, since the volume of the decision space increases exponentially with the number of dimensions. T o handle complications arising from the large decision space, and to solve a corresponding motion planning problem efficiently, two key concepts are adopted in this work: First, the Gaussian process latent variable model (GP-L VM) is utilized for low-dimensional representation of the original configuration space. Second, an approximate inference algorithm is used, exploiting through the duality between control and estimation, to explore the decision space and to compute a high-quality motion trajectory of the robot. Utilizing the GP-L VM and the duality between control and estimation, we construct a fully probabilistic generative model with which a high-dimensional motion planning problem is transformed into a tractable inference problem. Finally, we compute the motion trajectory via an approximate inference algorithm based on a variant of the particle filter . OR robotic motion planning, a trajectory is designed for robot states through a complex configuration space from an initial state to perform a given task. The planning problem is formulated as an optimal control (OC) problem considering the robot dynamics for a feasible motion trajectory and the cost function for the task.

artificial intelligence, particle, trajectory, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2018.2856915

1711.08275

Country:

North America > United States (0.14)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.50)

Industry: Aerospace & Defense (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Add feedback