AITopics | Bayesian Inference

Bayesian Inference

Bayes' theorem lets a program infer the probabilities of likely causes from observed effects when what it is given are the probabilities of those effects conditional on each cause, together with prior probabilities for the causes.
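As a minimal sketch of the statement above, the following Python function applies Bayes' theorem to a discrete set of candidate causes. The function name `posterior`, the cause labels, and all numbers are illustrative assumptions and do not come from the source page.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for every cause via Bayes' theorem.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> P(effect | cause)
    """
    # Joint probability P(effect, cause) = P(effect | cause) * P(cause).
    joint = {cause: likelihood[cause] * prior[cause] for cause in prior}
    # The evidence P(effect) is the sum of the joint over all causes.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observed effect has zero probability under every cause")
    return {cause: p / evidence for cause, p in joint.items()}


# Hypothetical numbers: two candidate causes of an observed fever.
prior = {"flu": 0.1, "cold": 0.9}
likelihood = {"flu": 0.9, "cold": 0.2}  # P(fever | cause)
post = posterior(prior, likelihood)
```

With these made-up inputs the less common cause gains weight after the observation, because the effect is much more probable under it, yet the more common cause can still remain the more likely explanation.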

Simultaneously sparse and low-rank abundance matrix estimation for hyperspectral image unmixing

Giampouras, Paris, Themelis, Konstantinos, Rontogiannis, Athanasios, Koutroumbas, Konstantinos

arXiv.org Machine Learning, Oct-14-2015

In a plethora of applications dealing with inverse problems (e.g., image processing, social networks, compressive sensing, and biological data processing), the signal of interest is known to be structured in several ways at the same time. This premise has recently guided research toward the innovative and meaningful idea of imposing multiple constraints on the parameters involved in the problem under study. For instance, when dealing with problems whose parameters form sparse and low-rank matrices, the adoption of suitably combined constraints imposing sparsity and low-rankness is expected to yield substantially enhanced estimation results. In this paper, we address the spectral unmixing problem in hyperspectral images. Specifically, two novel unmixing algorithms are introduced in an attempt to exploit both spatial correlation and sparse representation of pixels lying in homogeneous regions of hyperspectral images. To this end, a novel convex mixed penalty term is first defined, consisting of the sum of the weighted $\ell_1$ norm and the weighted nuclear norm of the abundance matrix corresponding to a small area of the image determined by a sliding square window. This penalty term is then used to regularize a conventional quadratic cost function and simultaneously impose sparsity and low-rankness on the abundance matrix. The resulting regularized cost function is minimized by a) an incremental proximal sparse and low-rank unmixing algorithm and b) an algorithm based on the alternating direction method of multipliers (ADMM). The effectiveness of the proposed algorithms is illustrated in experiments conducted on both simulated and real data.
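Proximal and ADMM-type solvers for penalties of this kind typically rely on a shrinkage step. The sketch below shows only the standard proximal operator of a weighted $\ell_1$ term (soft-thresholding); it is not the authors' algorithm, and the function name, weights, and step size are assumptions chosen for illustration. The analogous step for the plain (unweighted) nuclear norm applies the same shrinkage to the singular values of the matrix.

```python
def soft_threshold(values, weights, step):
    """Proximal operator of step * sum_i weights[i] * |x_i|, evaluated at `values`.

    Each entry is pulled toward zero by step * weights[i]; entries whose
    magnitude does not exceed that amount become exactly zero.
    """
    result = []
    for v, w in zip(values, weights):
        t = step * w
        if v > t:
            result.append(v - t)
        elif v < -t:
            result.append(v + t)
        else:
            result.append(0.0)
    return result


# Hypothetical abundance entries with per-entry weights.
shrunk = soft_threshold([3.0, -0.5, 1.2], [1.0, 1.0, 2.0], 0.5)
```

Larger weights shrink their entries harder, which is how a weighted penalty can press small coefficients to zero while leaving large ones nearly intact.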

artificial intelligence, bayesian inference, machine learning, (17 more...)

doi: 10.1109/TGRS.2016.2551327

arXiv: 1504.01515

Genre: Research Report (1.00)

Industry: Information Technology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Varying-coefficient models with isotropic Gaussian process priors

Bussas, Matthias, Sawade, Christoph, Scheffer, Tobias, Landwehr, Niels

arXiv.org Machine Learning, Oct-14-2015

We study learning problems in which the conditional distribution of the output given the input varies as a function of additional task variables. In varying-coefficient models with Gaussian process priors, a Gaussian process generates the functional relationship between the task variables and the parameters of this conditional. Varying-coefficient models subsume hierarchical Bayesian multitask models, but also generalizations in which the conditional varies continuously, for instance, in time or space. However, Bayesian inference in varying-coefficient models is generally intractable. We show that inference for varying-coefficient models with isotropic Gaussian process priors resolves to standard inference for a Gaussian process that can be solved efficiently. MAP inference in this model resolves to multitask learning using task and instance kernels, and inference for hierarchical Bayesian multitask models can be carried out efficiently using graph-Laplacian kernels. We report on experiments for geospatial prediction.
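To make the role of task and instance kernels concrete, consider the simplified linear case f(x, t) = w(t)·x, where every coordinate of w is drawn independently from the same Gaussian process with kernel k over task variables. The covariance of f then factorizes as k(t, t') times the inner product of x and x'. The sketch below builds that product kernel; the squared-exponential task kernel, the function names, and the data are assumptions for illustration rather than the paper's exact construction.

```python
import math


def task_kernel(t1, t2, length_scale=1.0):
    """Squared-exponential kernel on task variables (sequences of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(t1, t2))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def instance_kernel(x1, x2):
    """Linear kernel (inner product) on the input features."""
    return sum(a * b for a, b in zip(x1, x2))


def product_kernel(x1, t1, x2, t2, length_scale=1.0):
    """Covariance between f(x1, t1) and f(x2, t2) in the linear sketch above."""
    return task_kernel(t1, t2, length_scale) * instance_kernel(x1, x2)


def gram_matrix(xs, ts, length_scale=1.0):
    """Gram matrix of the product kernel over paired (instance, task) data."""
    n = len(xs)
    return [
        [product_kernel(xs[i], ts[i], xs[j], ts[j], length_scale) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: the first and third rows share features but lie in distant tasks.
xs = [[1.0, 2.0], [0.5, -1.0], [1.0, 2.0]]
ts = [[0.0], [0.0], [3.0]]
K = gram_matrix(xs, ts)
```

Observations from distant tasks receive a small covariance even when their features coincide, so information is shared mainly between nearby tasks.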

artificial intelligence, machine learning, varying-coefficient model, (17 more...)

arXiv: 1508.07192

Country:

North America > United States (1.00)
Europe (0.68)

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Real Estate (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

The intrinsic value of HFO features as a biomarker of epileptic activity

Gliske, Stephen V., Moon, Kevin R., Stacey, William C., Hero, Alfred O. III

arXiv.org Machine Learning, Oct-12-2015

High frequency oscillations (HFOs) are a promising biomarker of epileptic brain tissue and activity. HFOs additionally serve as a prototypical example of challenges in the analysis of discrete events in high-temporal-resolution intracranial EEG data. Two primary challenges are 1) dimensionality reduction, and 2) assessing feasibility of classification. Dimensionality reduction assumes that the data lie on a manifold with dimension less than that of the feature space. However, previous HFO analyses have assumed a linear manifold, global across time, space (i.e., recording electrode/channel), and individual patients. Instead, we assess both a) whether linear methods are appropriate and b) the consistency of the manifold across time, space, and patients. We also estimate bounds on the Bayes classification error to quantify the distinction between two classes of HFOs (those occurring during seizures and those occurring due to other processes). This analysis provides the foundation for future clinical use of HFO features and guides the analysis for other discrete events, such as individual action potentials or multi-unit activity.
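The abstract does not say which bounds on the Bayes classification error are estimated. As a swapped-in illustration only, the sketch below computes the exact Bayes error for two known discrete class-conditional distributions and compares it with the classical Bhattacharyya upper bound. The distributions, function names, and class labels are assumptions, not data or methods from the paper.

```python
import math


def bayes_error(p1, p2, prior1=0.5):
    """Exact Bayes error for two classes with discrete class-conditional pmfs."""
    prior2 = 1.0 - prior1
    # The optimal classifier errs with the smaller joint mass in each bin.
    return sum(min(prior1 * a, prior2 * b) for a, b in zip(p1, p2))


def bhattacharyya_bound(p1, p2, prior1=0.5):
    """Classical upper bound on the Bayes error, using min(a, b) <= sqrt(a * b)."""
    prior2 = 1.0 - prior1
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p1, p2))
    return math.sqrt(prior1 * prior2) * coefficient


# Hypothetical pmfs of one quantized feature for the two event classes.
seizure = [0.7, 0.2, 0.1]
other = [0.1, 0.3, 0.6]
err = bayes_error(seizure, other)
bound = bhattacharyya_bound(seizure, other)
```

A small bound indicates that the two classes are well separated in the chosen feature, which is the kind of feasibility question the abstract raises.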

artificial intelligence, intrinsic dimension, machine learning, (18 more...)

doi: 10.1109/ICASSP.2016.7472887

arXiv: 1510.03507

Country: North America > United States > Michigan (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Remarks on kernel Bayes' rule

Johno, Hisashi, Nakamoto, Kazunori, Saigo, Tatsuhiko

arXiv.org Machine Learning, Oct-10-2015

Kernel Bayes' rule has been proposed as a nonparametric kernel-based method to realize Bayesian inference in reproducing kernel Hilbert spaces. However, we demonstrate both theoretically and experimentally that the prediction result of kernel Bayes' rule is in some cases unnatural. We attribute this phenomenon in part to the fact that the assumptions in kernel Bayes' rule do not hold in general.

artificial intelligence, kernel bayes, machine learning, (16 more...)

arXiv: 1507.01059

Country:

Asia > Japan (0.28)
North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Distilling Model Knowledge

Papamakarios, George

arXiv.org Machine Learning, Oct-8-2015

Top-performing machine learning systems, such as deep neural networks, large ensembles and complex probabilistic graphical models, can be expensive to store, slow to evaluate and hard to integrate into larger systems. Ideally, we would like to replace such cumbersome models with simpler models that perform equally well. In this thesis, we study knowledge distillation, the idea of extracting the knowledge contained in a complex model and injecting it into a more convenient model. We present a general framework for knowledge distillation, whereby a convenient model of our choosing learns how to mimic a complex model, by observing the latter's behaviour and being penalized whenever it fails to reproduce it. We develop our framework within the context of three distinct machine learning applications: (a) model compression, where we compress large discriminative models, such as ensembles of neural networks, into models of much smaller size; (b) compact predictive distributions for Bayesian inference, where we distil large bags of MCMC samples into compact predictive distributions in closed form; (c) intractable generative models, where we distil unnormalizable models such as RBMs into tractable models such as NADEs. We contribute to the state of the art with novel techniques and ideas. In model compression, we describe and implement derivative matching, which allows for better distillation when data is scarce. In compact predictive distributions, we introduce online distillation, which allows for significant savings in memory. Finally, in intractable generative models, we show how to use distilled models to robustly estimate intractable quantities of the original model, such as its intractable partition function.
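The abstract describes the convenient model as being penalized whenever it fails to reproduce the complex model's behaviour, without naming the penalty. One common choice, assumed here purely for illustration, is the Kullback-Leibler divergence between the two models' predictive distributions, optionally softened by a temperature. The function names and logits below are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between the two predictive distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Terms with zero teacher probability contribute nothing to the divergence.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical logits: a student that matches the teacher, and one that does not.
teacher = [2.0, 0.5, -1.0]
loss_same = distillation_loss(teacher, [2.0, 0.5, -1.0])
loss_diff = distillation_loss(teacher, [0.0, 0.0, 0.0])
```

The loss is zero only when the student reproduces the teacher's distribution, so minimizing it over the student's parameters drives the mimicry described above.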

artificial intelligence, distillation, machine learning, (20 more...)

arXiv: 1510.02437

Country: North America (0.27)

Genre:

Research Report > Promising Solution (0.85)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Bayesian Markov Blanket Estimation

Kaufmann, Dinu, Parbhoo, Sonali, Wieczorek, Aleksander, Keller, Sebastian, Adametz, David, Roth, Volker

arXiv.org Machine Learning, Oct-6-2015

This paper considers a Bayesian view for estimating a sub-network in a Markov random field. The sub-network corresponds to the Markov blanket of a set of query variables, where the set of potential neighbours is large. We factorize the posterior such that the Markov blanket is conditionally independent of the network of the potential neighbours. By exploiting this blockwise decoupling, we derive analytic expressions for posterior conditionals. Subsequently, we develop an inference scheme which makes use of the factorization. As a result, estimation of a sub-network is possible without inferring an entire network. Since the resulting Gibbs sampler scales linearly with the number of variables, it can handle relatively large neighbourhoods. The proposed scheme results in faster convergence and superior mixing of the Markov chain than existing Bayesian network estimation techniques.
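To illustrate what a Markov blanket is, the sketch below reads it off a known precision (inverse covariance) matrix, assuming a Gaussian Markov random field, where two variables are adjacent exactly when their precision entry is nonzero. This assumes the matrix is already known; it does not reproduce the paper's Gibbs sampler, which infers the blanket from data. The function name and matrix are hypothetical.

```python
def markov_blanket(precision, query, tol=1e-12):
    """Return sorted indices of non-query variables adjacent to any query variable.

    precision: square matrix (list of lists) of a Gaussian Markov random field
    query:     iterable of indices of the query variables
    """
    query_set = set(query)
    blanket = set()
    for i in query_set:
        for j, value in enumerate(precision[i]):
            # A nonzero off-diagonal precision entry encodes an edge i - j.
            if j not in query_set and abs(value) > tol:
                blanket.add(j)
    return sorted(blanket)


# Hypothetical chain-structured precision matrix: 0 - 1 - 2 - 3.
precision = [
    [2.0, 0.6, 0.0, 0.0],
    [0.6, 2.0, 0.5, 0.0],
    [0.0, 0.5, 2.0, 0.4],
    [0.0, 0.0, 0.4, 2.0],
]
blanket_of_0 = markov_blanket(precision, [0])
blanket_of_01 = markov_blanket(precision, [0, 1])
```

Only the rows belonging to the query variables are inspected, which mirrors the point that a sub-network can be characterized without the full network among the potential neighbours.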

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv: 1510.01485

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias

Kondo, Yohei, Hayashi, Kohei, Maeda, Shin-ichi

arXiv.org Machine Learning, Oct-6-2015

A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by setting the corresponding weights to zero. However, regularization often shrinks the estimator for relevant features, which leads to incorrect feature selection. Motivated by this issue, we propose Bayesian masking (BM), a sparse estimation method which imposes no regularization on the weights. The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes the lower bound of the factorized information criterion (FIC), which is a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv: 1509.01004

Country: Asia > Japan > Honshū (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Inference via Approximation of Log-likelihood for Priors in Exponential Family

Ardeshiri, Tohid, Orguner, Umut, Gustafsson, Fredrik

arXiv.org Machine Learning, Oct-5-2015

In this paper, a Bayesian inference technique based on Taylor series approximation of the logarithm of the likelihood function is presented. The proposed approximation is devised for the case where the prior distribution belongs to the exponential family of distributions. The logarithm of the likelihood function is linearized with respect to the sufficient statistic of the prior distribution in the exponential family such that the posterior obtains the same exponential family form as the prior. Similarities between the proposed method and the extended Kalman filter for nonlinear filtering are illustrated. Furthermore, an extended target measurement update is derived for target models in which the target extent is represented by a random matrix having an inverse Wishart distribution. The approximate update covers the important case where the spread of measurements is due to the target extent as well as the measurement noise in the sensor.
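A scalar Gaussian prior gives a small worked instance of the idea. Its sufficient statistics are x and x², so a second-order Taylor expansion of the log-likelihood in x is linear in those statistics and the approximate posterior stays Gaussian. The sketch below is an illustration under that assumption, not the paper's general derivation; the function name, the choice of expanding around the prior mean, and the numbers are hypothetical.

```python
def taylor_gaussian_update(mean, variance, grad, hess):
    """Approximate posterior (mean, variance) for a prior N(mean, variance).

    grad, hess: first and second derivatives of the log-likelihood with
                respect to x, both evaluated at the prior mean.
    """
    # Adding the quadratic expansion to the Gaussian log-prior gives a
    # quadratic in x whose curvature is the posterior precision.
    precision = 1.0 / variance - hess
    if precision <= 0.0:
        raise ValueError("approximate posterior precision is not positive")
    post_variance = 1.0 / precision
    post_mean = mean + post_variance * grad
    return post_mean, post_variance


# Sanity check on a linear-Gaussian measurement y = x + noise, noise ~ N(0, R),
# where the expansion is exact and the result equals the Kalman update.
m, P = 0.0, 1.0
y, R = 2.0, 1.0
grad = (y - m) / R   # derivative of -(y - x)^2 / (2 R) at x = m
hess = -1.0 / R      # second derivative of the same log-likelihood
post_mean, post_var = taylor_gaussian_update(m, P, grad, hess)
```

For a nonlinear measurement model, the derivatives would come from the chosen likelihood and the update would be approximate, which is where the similarity to the extended Kalman filter noted in the abstract comes from.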

approximation, artificial intelligence, machine learning, (17 more...)

arXiv: 1510.01225

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Improved Estimation of Class Prior Probabilities through Unlabeled Data

Matloff, Norman

arXiv.org Machine Learning, Oct-5-2015

Work in the classification literature has shown that in computing a classification function, one need not know the class membership of all observations in the training set; the unlabeled observations still provide information on the marginal distribution of the feature set, and can thus contribute to increased classification accuracy for future observations. The present paper will show that this scheme can also be used for the estimation of class prior probabilities, which would be very useful in applications in which it is difficult or expensive to determine class membership. Both parametric and nonparametric estimators are developed. Asymptotic distributions of the estimators are derived, and it is proven that the use of the unlabeled observations does reduce asymptotic variance. This methodology is also extended to the estimation of subclass probabilities.

artificial intelligence, machine learning, unlabeled data, (17 more...)

arXiv: 1510.01422

Country: North America > United States > California (0.68)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.48)
(2 more...)

Symbol Emergence in Robotics: A Survey

Taniguchi, Tadahiro, Nagai, Takayuki, Nakamura, Tomoaki, Iwahashi, Naoto, Ogata, Tetsuya, Asoh, Hideki

arXiv.org Artificial Intelligence, Sep-29-2015

Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can communicate smoothly with human users over the long term both require an understanding of the dynamics of symbol systems, which makes this understanding crucially important. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.

artificial intelligence, machine learning, symbol system, (18 more...)

doi: 10.1080/01691864.2016.1164622

arXiv: 1509.08973

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Middle East > Jordan (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)
