AITopics

Inferring key unobservable features of individuals is an important task in the applied sciences. In particular, an important source of data in fields such as marketing, social sciences and medicine is questionnaires: answers in such questionnaires are noisy measures of target unobserved features. While comprehensive surveys help to better estimate the latent variables of interest, aiming at a high number of questions comes at a price: refusal to participate in surveys can go up, as well as the rate of missing data; quality of answers can decline; costs associated with applying such questionnaires can also increase. In this paper, we cast the problem of refining existing models for questionnaire data as follows: solve a constrained optimization problem of preserving the maximum amount of information found in a latent variable model using only a subset of existing questions. The goal is to find an optimal subset of a given size. For that, we first define an information theoretical measure for quantifying the quality of a reduced questionnaire. Three different approximate inference methods are introduced to solve this problem. Comparisons against a simple but powerful heuristic are presented.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Country: Europe > United Kingdom (0.68)

Genre: Questionnaire & Opinion Survey (1.00)

Industry: Health & Medicine > Health Care Providers & Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Hernández-lobato, Daniel, Hernández-lobato, Jose M., Dupont, Pierre

Robust Multi-Class Gaussian Process Classification

Multi-class Gaussian Process Classifiers (MGPCs) are often affected by over-fitting problems when labeling errors occur far from the decision boundaries. To prevent this, we investigate a robust MGPC (RMGPC) which considers labeling errors independently of their distance to the decision boundaries. Expectation propagation is used for approximate inference. Experiments with several datasets in which noise is injected in the class labels illustrate the benefits of RMGPC. This method performs better than other Gaussian process alternatives based on considering latent Gaussian noise or heavy-tailed processes. When no noise is injected in the labels, RMGPC still performs equal or better than the other methods. Finally, we show how RMGPC can be used for successfully identifying data instances which are difficult to classify accurately in practice.

artificial intelligence, machine learning, modeling & simulation, (17 more...)

Country:

Europe > United Kingdom (0.28)
North America > United States > California (0.14)

Genre: Research Report (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Le, Hai-son P., Bar-joseph, Ziv

Inferring Interaction Networks using the IBP applied to microRNA Target Prediction

Determining interactions between entities and the overall organization and clustering of nodes in networks is a major challenge when analyzing biological and social network data. Here we extend the Indian Buffet Process (IBP), a nonparametric Bayesian model, to integrate noisy interaction scores with properties of individual entities for inferring interaction networks and clustering nodes within these networks. We present an application of this method to study how microRNAs regulate mRNAs in cells. Analysis of synthetic and real data indicates that the method improves upon prior methods, correctly recovers interactions and clusters, and provides accurate biological predictions.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Kobayashi, Ryota, Tsubo, Yasuhiro, Lansky, Petr, Shinomoto, Shigeru

Estimating time-varying input signals and ion channel states from a single voltage trace of a neuron

State-of-the-art statistical methods in neuroscience have enabled us to fit mathematical models to experimental data and subsequently to infer the dynamics of hidden parameters underlying the observable phenomena. Here, we develop a Bayesian method for inferring the time-varying mean and variance of the synaptic input, along with the dynamics of each ion channel from a single voltage trace of a neuron. An estimation problem may be formulated on the basis of the state-space model with prior distributions that penalize large fluctuations in these parameters. After optimizing the hyperparameters by maximizing the marginal likelihood, the state-space model provides the time-varying parameters of the input signals and the ion channel states. The proposed method is tested not only on the simulated data from the Hodgkin-Huxley type models but also on experimental data obtained from a cortical slice in vitro.

artificial intelligence, input signal, machine learning, (16 more...)

Country: Asia > Japan > Honshū (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Unifying Non-Maximum Likelihood Learning Objectives with Minimum KL Contraction

Lyu, Siwei

When used to learn high dimensional parametric probabilistic models, the clas- sical maximum likelihood (ML) learning often suffers from computational in- tractability, which motivates the active developments of non-ML learning meth- ods. Yet, because of their divergent motivations and forms, the objective func- tions of many non-ML learning methods are seemingly unrelated, and there lacks a unified framework to understand them. In this work, based on an information geometric view of parametric learning, we introduce a general non-ML learning principle termed as minimum KL contraction, where we seek optimal parameters that minimizes the contraction of the KL divergence between the two distributions after they are transformed with a KL contraction operator. We then show that the objective functions of several important or recently developed non-ML learn- ing methods, including contrastive divergence [12], noise-contrastive estimation [11], partial likelihood [7], non-local contrastive objectives [31], score match- ing [14], pseudo-likelihood [3], maximum conditional likelihood [17], maximum mutual information [2], maximum marginal likelihood [9], and conditional and marginal composite likelihood [24], can be unified under the minimum KL con- traction framework with different choices of the KL contraction operators.

Ackerman, Nathanael L., Freer, Cameron E., Roy, Daniel M.

On the computability of conditional probability

arXiv.org Machine LearningDec-16-2011

As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe even more complex probabilistic models and inference algorithms. What are the limits of mechanizing probabilistic inference? We investigate the computability of conditional probability, a fundamental notion in probability theory and a cornerstone of Bayesian statistics, and show that there are computable joint distributions with noncomputable conditional distributions, ruling out the prospect of general inference algorithms, even inefficient ones. Specifically, we construct a pair of computable random variables in the unit interval such that the conditional distribution of the first variable given the second encodes the halting problem. Nevertheless, probabilistic inference is possible in many common modeling settings, and we prove several results giving broadly applicable conditions under which conditional distributions are computable. In particular, conditional distributions become computable when measurements are corrupted by independent computable noise with a sufficiently smooth density.

artificial intelligence, machine learning, random variable, (17 more...)

1005.3014

Country:

Europe (0.93)
North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Journal of Artificial Intelligence ResearchDec-13-2011

Finding Consensus Bayesian Network Structures

Peña, J. M.

Suppose that multiple experts (or learning algorithms) provide us with alternative Bayesian network (BN) structures over a domain, and that we are interested in combining them into a single consensus BN structure. Specifically, we are interested in that the consensus BN structure only represents independences all the given BN structures agree upon and that it has as few parameters associated as possible. In this paper, we prove that there may exist several non-equivalent consensus BN structures and that finding one of them is NP-hard. Thus, we decide to resort to heuristics to find an approximated consensus BN structure. In this paper, we consider the heuristic proposed by Matzkevich and Abramson, which builds upon two algorithms, called Methods A and B, for efficiently deriving the minimal directed independence map of a BN structure relative to a given node ordering. Methods A and B are claimed to be correct although no proof is provided (a proof is just sketched). In this paper, we show that Methods A and B are not correct and propose a correction of them.

dag, denote, node, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3427

AI Access Foundation

10736

Journal of Artificial Intelligence Research

Country: Europe > Sweden > Östergötland County > Linköping (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hernández-Lobato, José Miguel, Hernández-Lobato, Daniel

Convergent Expectation Propagation in Linear Models with Spike-and-slab Priors

arXiv.org Machine LearningDec-10-2011

Exact inference in the linear regression model with spike and slab priors is often intractable. Expectation propagation (EP) can be used for approximate inference. However, the regular sequential form of EP (R-EP) may fail to converge in this model when the size of the training set is very small. As an alternative, we propose a provably convergent EP algorithm (PC-EP). PC-EP is proved to minimize an energy function which, under some constraints, is bounded from below and whose stationary points coincide with the solution of R-EP. Experiments with synthetic data indicate that when R-EP does not converge, the approximation generated by PC-EP is often better. By contrast, when R-EP converges, both methods perform similarly.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1112.2289

Country:

Europe (0.69)
North America > United States > California > San Francisco County > San Francisco (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Simon, Noah, Tibshirani, Rob

Discriminant Analysis with Adaptively Pooled Covariance

arXiv.org Machine LearningDec-6-2011

Linear and Quadratic Discriminant analysis (LDA/QDA) are common tools for classification problems. For these methods we assume observations are normally distributed within group. We estimate a mean and covariance matrix for each group and classify using Bayes theorem. With LDA, we estimate a single, pooled covariance matrix, while for QDA we estimate a separate covariance matrix for each group. Rarely do we believe in a homogeneous covariance structure between groups, but often there is insufficient data to separately estimate covariance matrices. We propose 1-PDA, a regularized model which adaptively pools elements of the precision matrices. Adaptively pooling these matrices decreases the variance of our estimates (as in LDA), without overly biasing them. In this paper, we propose and discuss this method, give an efficient algorithm to fit it for moderate sized problems, and show its efficacy on real and simulated datasets. Keywords: Lasso, Penalized, Discriminant Analysis, Interactions, Classification 1. Introduction Consider the usual two class problem: our data consists ofn observations, each observation with a known class label { 1, 2}, and p covariates measured per observation.

artificial intelligence, bayesian inference, machine learning, (13 more...)

1111.1687

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

arXiv.org Machine LearningDec-3-2011

Dimension adaptability of Gaussian process models with variable selection and projection

Tokdar, Surya T.

It is now known that an extended Gaussian process model equipped with rescaling can adapt to different smoothness levels of a function valued parameter in many nonparametric Bayesian analyses, offering a posterior convergence rate that is optimal (up to logarithmic factors) for the smoothness class the true function belongs to. This optimal rate also depends on the dimension of the function's domain and one could potentially obtain a faster rate of convergence by casting the analysis in a lower dimensional subspace that does not amount to any loss of information about the true function. In general such a subspace is not known a priori but can be explored by equipping the model with variable selection or linear projection. We demonstrate that for nonparametric regression, classification, density estimation and density regression, a rescaled Gaussian process model equipped with variable selection or linear projection offers a posterior convergence rate that is optimal (up to logarithmic factors) for the lowest dimension in which the analysis could be cast without any loss of information about the true function. Theoretical exploration of such dimension reduction features appears novel for Bayesian nonparametric models with or without Gaussian processes.

artificial intelligence, machine learning, regression, (15 more...)

1112.0716

Genre: Research Report (0.82)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)