AITopics | Uncertainty

Collaborating Authors

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty

News Overviews Instructional Materials AI-Alerts Classics

Learning Network of Multivariate Hawkes Processes: A Time Series Approach

Etesami, Jalal, Kiyavash, Negar, Zhang, Kun, Singhal, Kushagra

arXiv.org Machine LearningMar-14-2016

Learning the influence structure of multiple time series data is of great interest to many disciplines. This paper studies the problem of recovering the causal structure in network of multivariate linear Hawkes processes. In such processes, the occurrence of an event in one process affects the probability of occurrence of new events in some other processes. Thus, a natural notion of causality exists between such processes captured by the support of the excitation matrix. We show that the resulting causal influence network is equivalent to the Directed Information graph (DIG) of the processes, which encodes the causal factorization of the joint distribution of the processes. Furthermore, we present an algorithm for learning the support of excitation matrix (or equivalently the DIG). The performance of the algorithm is evaluated on synthesized multivariate Hawkes networks as well as a stock market and MemeTracker real-world dataset.

artificial intelligence, hawke process, machine learning, (18 more...)

arXiv.org Machine Learning

1603.04319

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.70)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (0.48)

Technology:

Information Technology > Communications > Social Media (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Top-$K$ Ranking from Pairwise Comparisons: When Spectral Ranking is Optimal

Jang, Minje, Kim, Sunghyun, Suh, Changho, Oh, Sewoong

arXiv.org Machine LearningMar-14-2016

We explore the top-$K$ rank aggregation problem. Suppose a collection of items is compared in pairs repeatedly, and we aim to recover a consistent ordering that focuses on the top-$K$ ranked items based on partially revealed preference information. We investigate the Bradley-Terry-Luce model in which one ranks items according to their perceived utilities modeled as noisy observations of their underlying true utilities. Our main contributions are two-fold. First, in a general comparison model where item pairs to compare are given a priori, we attain an upper and lower bound on the sample size for reliable recovery of the top-$K$ ranked items. Second, more importantly, extending the result to a random comparison model where item pairs to compare are chosen independently with some probability, we show that in slightly restricted regimes, the gap between the derived bounds reduces to a constant factor, hence reveals that a spectral method can achieve the minimax optimality on the (order-wise) sample size required for top-$K$ ranking. That is to say, we demonstrate a spectral method alone to be sufficient to achieve the optimality and advantageous in terms of computational complexity, as it does not require an additional stage of maximum likelihood estimation that a state-of-the-art scheme employs to achieve the optimality. We corroborate our main results by numerical experiments.

artificial intelligence, machine learning, ranking, (13 more...)

arXiv.org Machine Learning

1603.04153

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

A Statistical Decision-Theoretic Framework for Social Choice

Soufiani, Hossein Azari, Parkes, David C., Xia, Lirong

arXiv.org Artificial IntelligenceMar-12-2016

In this paper, we take a statistical decision-theoretic viewpoint on social choice, putting a focus on the decision to be made on behalf of a system of agents. In our framework, we are given a statistical ranking model, a decision space, and a loss function defined on (parameter, decision) pairs, and formulate social choice mechanisms as decision rules that minimize expected loss. This suggests a general framework for the design and analysis of new social choice mechanisms. We compare Bayesian estimators, which minimize Bayesian expected loss, for the Mallows model and the Condorcet model respectively, and the Kemeny rule. We consider various normative properties, in addition to computational complexity and asymptotic behavior. In particular, we show that the Bayesian estimator for the Condorcet model satisfies some desired properties such as anonymity, neutrality, and monotonicity, can be computed in polynomial time, and is asymptotically different from the other two rules when the data are generated from the Condorcet model for some ground truth parameter.

artificial intelligence, kendall, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1410.7856

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

Sequential Monte Carlo Methods for System Identification

Schön, Thomas B., Lindsten, Fredrik, Dahlin, Johan, Wågberg, Johan, Naesseth, Christian A., Svensson, Andreas, Dai, Liang

arXiv.org Machine LearningMar-10-2016

One of the key challenges in identifying nonlinear and possibly non-Gaussian state space models (SSMs) is the intractability of estimating the system state. Sequential Monte Carlo (SMC) methods, such as the particle filter (introduced more than two decades ago), provide numerical solutions to the nonlinear state estimation problems arising in SSMs. When combined with additional identification techniques, these algorithms provide solid solutions to the nonlinear system identification problem. We describe two general strategies for creating such combinations and discuss why SMC is a natural tool for implementing these strategies.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1016/j.ifacol.2015.12.224

1503.06058

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.45)

Genre:

Research Report (0.63)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Note on the equivalence of hierarchical variational models and auxiliary deep generative models

Brümmer, Niko

arXiv.org Machine LearningMar-9-2016

In machine learning, there is an ongoing revival of the use of variational Bayes (VB) to deal with complex probabilistic models with hidden variables. The revival is driven by the use of stochastic methods to approximate the VB lower bound and associated gradients. See for example [1, 2, 5]. The advantages include automated inference [3] and also that they are applicable to a much wider class of probabilistic models. While the basic recipes are limited in the flexibility of the approximate hidden-variable posteriors, there are ongoing efforts to make them more flexible.

machine learning, natural language, posterior, (15 more...)

arXiv.org Machine Learning

1603.02443

Country: Africa (0.15)

Genre: Research Report (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.56)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

Add feedback

Blind Source Separation: Fundamentals and Recent Advances (A Tutorial Overview Presented at SBrT-2001)

Kofidis, Eleftherios

arXiv.org Machine LearningMar-9-2016

A number of people are found in a room and involved in loud conversations in groups, just as it would happen in a cocktail party. There might also be some background noise, which could be music, car noise from outside, etc. Each person in this room is therefore forced to listen to a mixture of speech sounds coming from various directions, along with some noise. These sounds may come directly to one's ear or have first suffered a sequence of reverberations because of their reflections on the room's walls. The problem of focusing one's listening attention on a particular speaker among this cacophony of conversations and noise has been known as the cocktail party problem [6]. It consists of separating a mixture of speech signals of different characteristics with noise added to it. The signals are a-priori unknown (one listens only to a combination of them) as is also the way they have been mixed. The above scenario is a good analog for many other examples of situations that demand for a separation of mixed signals with no presupposed knowledge on the signals and the system mixing them.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1603.03089

Country: North America > United States (0.28)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Health & Medicine > Diagnostic Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

Add feedback

Optimized Kernel Entropy Components

Izquierdo-Verdiguier, Emma, Laparra, Valero, Jenssen, Robert, Gómez-Chova, Luis, Camps-Valls, Gustau

arXiv.org Machine LearningMar-9-2016

KECA roughly reduces to a sorting of the importance of kernel eigenvectors by entropy instead of by variance as in Kernel Principal Components Analysis. In this work, we propose an extension of the KECA method, named Optimized KECA (OKECA), that directly extracts the optimal features retaining most of the data entropy by means of compacting the information in very few features (often in just one or two). The proposed method produces features which have higher expressive power . In particular, it is based on the Independent Component Analysis (ICA) framework, and introduces an extra rotation to the eigen-decomposition, which is optimized via gradient ascent search. This maximum entropy preservation suggests that OKECA features are more efficient than KECA features for density estimation. In addition, a critical issue in both methods is the selection of the kernel parameter since it critically affects the resulting performance. Here we analyze the most common kernel length-scale selection criteria. Results of both methods are illustrated in different synthetic and real problems. Results show that 1) OKECA returns projections with more expressive power than KECA, 2) the most successful rule for estimating the kernel parameter is based on maximum likelihood, and 3) OKECA is more robust to the selection of the length-scale parameter in kernel density estimation.

artificial intelligence, machine learning, okeca, (13 more...)

arXiv.org Machine Learning

doi: 10.1109/TNNLS.2016.2530403.

1603.02806

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (0.47)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.90)

Add feedback

Discriminative models for robust image classification

Srinivas, Umamahesh

arXiv.org Machine LearningMar-8-2016

A variety of real-world tasks involve the classification of images into pre-determined categories. Designing image classification algorithms that exhibit robustness to acquisition noise and image distortions, particularly when the available training data are insufficient to learn accurate models, is a significant challenge. This dissertation explores the development of discriminative models for robust image classification that exploit underlying signal structure, via probabilistic graphical models and sparse signal representations. Probabilistic graphical models are widely used in many applications to approximate high-dimensional data in a reduced complexity set-up. Learning graphical structures to approximate probability distributions is an area of active research. Recent work has focused on learning graphs in a discriminative manner with the goal of minimizing classification error. In the first part of the dissertation, we develop a discriminative learning framework that exploits the complementary yet correlated information offered by multiple representations (or projections) of a given signal/image. Specifically, we propose a discriminative tree-based scheme for feature fusion by explicitly learning the conditional correlations among such multiple projections in an iterative manner. Experiments reveal the robustness of the resulting graphical model classifier to training insufficiency.

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Machine Learning

1603.02736

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Education (1.00)
Energy (0.67)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
(6 more...)

Add feedback

Small ensembles of kriging models for optimization

Mohammadi, Hossein, Riche, Rodolphe Le, Touboul, Eric

arXiv.org Machine LearningMar-8-2016

The Efficient Global Optimization (EGO) algorithm uses a conditional Gaus-sian Process (GP) to approximate an objective function known at a finite number of observation points and sequentially adds new points which maximize the Expected Improvement criterion according to the GP. The important factor that controls the efficiency of EGO is the GP covariance function (or kernel) which should be chosen according to the objective function. Traditionally, a pa-rameterized family of covariance functions is considered whose parameters are learned through statistical procedures such as maximum likelihood or cross-validation. However, it may be questioned whether statistical procedures for learning covariance functions are the most efficient for optimization as they target a global agreement between the GP and the observations which is not the ultimate goal of optimization. Furthermore, statistical learning procedures are computationally expensive. The main alternative to the statistical learning of the GP is self-adaptation, where the algorithm tunes the kernel parameters based on their contribution to objective function improvement. After questioning the possibility of self-adaptation for kriging based optimizers, this paper proposes a novel approach for tuning the length-scale of the GP in EGO: At each iteration, a small ensemble of kriging models structured by their length-scales is created. All of the models contribute to an iterate in an EGO-like fashion. Then, the set of models is densified around the model whose length-scale yielded the best iterate and further points are produced. Numerical experiments are provided which motivate the use of many length-scales. The tested implementation does not perform better than the classical EGO algorithm in a sequential context but show the potential of the approach for parallel implementations.

artificial intelligence, machine learning, optimization, (19 more...)

arXiv.org Machine Learning

1603.02638

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

BIRDNEST: Bayesian Inference for Ratings-Fraud Detection

Hooi, Bryan, Shah, Neil, Beutel, Alex, Gunnemann, Stephan, Akoglu, Leman, Kumar, Mohit, Makhija, Disha, Faloutsos, Christos

arXiv.org Artificial IntelligenceMar-7-2016

Review fraud is a pervasive problem in online commerce, in which fraudulent sellers write or purchase fake reviews to manipulate perception of their products and services. Fake reviews are often detected based on several signs, including 1) they occur in short bursts of time; 2) fraudulent user accounts have skewed rating distributions. However, these may both be true in any given dataset. Hence, in this paper, we propose an approach for detecting fraudulent reviews which combines these 2 approaches in a principled manner, allowing successful detection even when one of these signs is not present. To combine these 2 approaches, we formulate our Bayesian Inference for Rating Data (BIRD) model, a flexible Bayesian model of user rating behavior. Based on our model we formulate a likelihood-based suspiciousness metric, Normalized Expected Surprise Total (NEST). We propose a linear-time algorithm for performing Bayesian inference using our model and computing the metric. Experiments on real data show that BIRDNEST successfully spots review fraud in large, real-world graphs: the 50 most suspicious users of the Flipkart platform flagged by our algorithm were investigated and all identified as fraudulent by domain experts at Flipkart.

artificial intelligence, machine learning, posterior distribution, (20 more...)

arXiv.org Artificial Intelligence

1511.0603

Genre: Research Report (0.64)

Industry: Law Enforcement & Public Safety > Fraud (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback