AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Bayesian Kernel and Mutual $k$-Nearest Neighbor Regression

Kim, Hyun-Chul

arXiv.org Machine LearningAug-3-2016

We propose Bayesian extensions of two nonparametric regression methods which are kernel and mutual $k$-nearest neighbor regression methods. Derived based on Gaussian process models for regression, the extensions provide distributions for target value estimates and the framework to select the hyperparameters. It is shown that both the proposed methods asymptotically converge to kernel and mutual $k$-nearest neighbor regression methods, respectively. The simulation results show that the proposed methods can select proper hyperparameters and are better than or comparable to the former methods for an artificial data set and a real world data set.

artificial intelligence, machine learning, regression, (16 more...)

arXiv.org Machine Learning

1608.0141

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Robust Non-linear Regression: A Greedy Approach Employing Kernels with Application to Image Denoising

Papageorgiou, George, Bouboulis, Pantelis, Theodoridis, Sergios

arXiv.org Machine LearningAug-3-2016

We consider the task of robust non-linear regression in the presence of both inlier noise and outliers. Assuming that the unknown non-linear function belongs to a Reproducing Kernel Hilbert Space (RKHS), our goal is to estimate the set of the associated unknown parameters. Due to the presence of outliers, common techniques such as the Kernel Ridge Regression (KRR) or the Support Vector Regression (SVR) turn out to be inadequate. Instead, we employ sparse modeling arguments to explicitly model and estimate the outliers, adopting a greedy approach. The proposed robust scheme, i.e., Kernel Greedy Algorithm for Robust Denoising (KGARD), is inspired by the classical Orthogonal Matching Pursuit (OMP) algorithm. Specifically, the proposed method alternates between a KRR task and an OMP-like selection step. Theoretical results concerning the identification of the outliers are provided. Moreover, KGARD is compared against other cutting edge methods, where its performance is evaluated via a set of experiments with various types of noise. Finally, the proposed robust estimation framework is applied to the task of image denoising, and its enhanced performance in the presence of outliers is demonstrated.

artificial intelligence, machine learning, outlier, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2017.2708029

1601.00595

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Blocking Collapsed Gibbs Sampler for Latent Dirichlet Allocation Models

Zhang, Xin, Sisson, Scott A.

arXiv.org Machine LearningAug-2-2016

The latent Dirichlet allocation (LDA) model is a widely-used latent variable model in machine learning for text analysis. Inference for this model typically involves a single-site collapsed Gibbs sampling step for latent variables associated with observations. The efficiency of the sampling is critical to the success of the model in practical large scale applications. In this article, we introduce a blocking scheme to the collapsed Gibbs sampler for the LDA model which can, with a theoretical guarantee, improve chain mixing efficiency. We develop two procedures, an O(K)-step backward simulation and an O(log K)-step nested simulation, to directly sample the latent variables within each block. We demonstrate that the blocking scheme achieves substantial improvements in chain mixing compared to the state of the art single-site collapsed Gibbs sampler. We also show that when the number of topics is over hundreds, the nested-simulation blocking scheme can achieve a significant reduction in computation time compared to the single-site sampler.

machine learning, natural language, sampler, (21 more...)

arXiv.org Machine Learning

1608.00945

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Combining Random Walks and Nonparametric Bayesian Topic Model for Community Detection

Zhu, Ruimin, Jiang, Wenxin

arXiv.org Machine LearningAug-2-2016

Community detection has been an active research area for decades. Among all probabilistic models, Stochastic Block Model has been the most popular one. This paper introduces a novel probabilistic model: RW-HDP, based on random walks and Hierarchical Dirichlet Process, for community extraction. In RW-HDP, random walks conducted in a social network are treated as documents; nodes are treated as words. By using Hierarchical Dirichlet Process, a nonparametric Bayesian model, we are not only able to cluster nodes into different communities, but also determine the number of communities automatically. We use Stochastic Variational Inference for our model inference, which makes our method time efficient and can be easily extended to an online learning algorithm.

data mining, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

1607.05573

Country: North America > United States > New York (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.47)
Education > Educational Setting (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

Venanzi, Matteo, Guiver, John, Kohli, Pushmeet, Jennings, Nicholas R.

Journal of Artificial Intelligence ResearchJul-28-2016

Many aspects of the design of efficient crowdsourcing processes, such as defining workers bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. In this work we introduce a new timesensitive Bayesian aggregation method that simultaneously estimates a tasks duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, uses latent variables to represent the uncertainty about the workers completion time, the tasks duration and the workers accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real- world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a tasks duration compared to stateoftheart methods.

confusion matrix, dataset, judgment, (13 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.5175

AI Access Foundation

11016

Journal of Artificial Intelligence Research

Country:

Europe > Switzerland (0.05)
Asia > India (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Add feedback

Variational Mixture Models with Gamma or inverse-Gamma components

Llera, A., Vidaurre, D., Pruim, R. H. R., Beckmann, C. F.

arXiv.org Machine LearningJul-26-2016

Mixture models with Gamma and or inverse-Gamma distributed mixture components are useful for medical image tissue segmentation or as post-hoc models for regression coefficients obtained from linear regression within a Generalised Linear Modeling framework (GLM), used in this case to separate stochastic (Gaussian) noise from some kind of positive or negative "activation" (modeled as Gamma or inverse-Gamma distributed). To date, the most common choice in this context it is Gaussian/Gamma mixture models learned through a maximum likelihood (ML) approach; we recently extended such algorithm for mixture models with inverse-Gamma components. Here, we introduce a fully analytical Variational Bayes (VB) learning framework for both Gamma and/or inverse-Gamma components. We use synthetic and resting state fMRI data to compare the performance of the ML and VB algorithms in terms of area under the curve and computational cost. We observed that the ML Gaussian/Gamma model is very expensive specially when considering high resolution images; furthermore, these solutions are highly variable and they occasionally can overestimate the activations severely. The Bayesian Gauss-Gamma is in general the fastest algorithm but provides too dense solutions. The maximum likelihood Gaussian/inverse-Gamma is also very fast but provides in general very sparse solutions. The variational Gaussian/inverse-Gamma mixture model is the most robust and its cost is acceptable even for high resolution images. Further, the presented methodology represents an essential building block that can be directly used in more complex inference tasks, specially designed to analyse MRI-fMRI data; such models include for example analytical variational mixture models with adaptive spatial regularization or better source models for new spatial blind source separation approaches.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1607.07573

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Health Care Technology (0.70)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Add feedback

A New PAC-Bayesian Perspective on Domain Adaptation

Germain, Pascal, Habrard, Amaury, Laviolette, François, Morvant, Emilie

arXiv.org Machine LearningJul-26-2016

We study the issue of PAC-Bayesian domain adaptation: We want to learn, from a source domain, a majority vote model dedicated to a target one. Our theoretical contribution brings a new perspective by deriving an upper-bound on the target risk where the distributions' divergence-- expressed as a ratio--controls the tradeoff between a source error measure and the target voters' disagreement. Our bound suggests that one has to focus on regions where the source data is informative. From this result, we derive a PAC-Bayesian generalization bound, and specialize it to linear classifiers. Then, we infer a learning algorithm and perform experiments on real data.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1506.04573

Country:

North America (0.46)
Europe > France (0.14)

Genre:

Overview (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

maximum likelihood estimate and logistic regression simplified

#artificialintelligenceJul-25-2016, 02:03:14 GMT

Least squares regression can cause impossible estimates such as probabilities that are less than zero and greater than 1.So, when the predicted value is measured as a probability, use Logistic Regression We use the log of the odds rather than the odds directly because an odds ratio cannot be a negative number--but its log can be negative. Notice that we have randomly initialized our coefficients for income and other predictors. These will be adjusted by Solver based on a likelihood function.We will cover them later Column H tells us the predicted probability of the borrower's actual behavior, whether that behavior is repayment or default--not simply, as in Column G, the predicted probability of defaulting on the loan. One property of logarithms is that their sum equals the logarithm of the product of the numbers on which they're based The logarithms of probabilities are always negative numbers, but the closer a probability is to 1.0, the closer its logarithm is to 0.0. I haven't covered cross-validation, which is commonly used to validate a logistic regression equation.If you don't always have a large number of cases to work with, a different approach is to use statistical inference.

artificial intelligence, machine learning, probability, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Add feedback

How To Use Classification Machine Learning Algorithms in Weka - Machine Learning Mastery

#artificialintelligenceJul-25-2016, 00:21:01 GMT

Weka makes a large number of classification algorithms available. The large number of machine learning algorithms available is one of the benefits of using the Weka platform to work through your machine learning problems. In this post you will discover how to use 5 top machine learning algorithms in Weka. How To Use Classification Machine Learning Algorithms in Weka Photo by Don Graham, some rights reserved. We are going to take a tour of 5 top classification algorithms in Weka.

algorithm, artificial intelligence, machine learning, (9 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

Add feedback

Identifying Depression on Twitter

Nadeem, Moin

arXiv.org Machine LearningJul-25-2016

Social media has recently emerged as a premier method to disseminate information online. Through these online networks, tens of millions of individuals communicate their thoughts, personal experiences, and social ideals. We therefore explore the potential of social media to predict, even prior to onset, Major Depressive Disorder (MDD) in online personas. We employ a crowdsourced method to compile a list of Twitter users who profess to being diagnosed with depression. Using up to a year of prior social media postings, we utilize a Bag of Words approach to quantify each tweet. Lastly, we leverage several statistical classifiers to provide estimates to the risk of depression. Our work posits a new methodology for constructing our classifier by treating social as a text-classification problem, rather than a behavioral one on social media platforms. By using a corpus of 2.5M tweets, we achieved an 81% accuracy rate in classification, with a precision score of .86. We believe that this method may be helpful in developing tools that estimate the risk of an individual being depressed, can be employed by physicians, concerned individuals, and healthcare agencies to aid in diagnosis, even possibly enabling those suffering from depression to be more proactive about recovering from their mental health.

artificial intelligence, machine learning, social media, (15 more...)

arXiv.org Machine Learning

1607.07384

Country: North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback