AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

Neural Model-Based Reinforcement Learning for Recommendation

Chen, Xinshi, Li, Shuang, Li, Hui, Jiang, Shaohua, Qi, Yuan, Song, Le

arXiv.org Machine Learning, Dec-26-2018

There is great interest in applying reinforcement learning (RL) to recommendation systems, but doing so poses many challenges. In this setting, an online user is the environment; neither the reward function nor the environment dynamics are clearly defined, making the application of RL challenging. In this paper, we propose a novel model-based reinforcement learning framework for recommendation systems, where we develop a generative adversarial network to imitate user behavior dynamics and learn the user's reward function. Using this user model as the simulation environment, we develop a novel DQN algorithm to obtain a combinatorial recommendation policy which can handle a large number of candidate items efficiently. In our experiments with real data, we show that this generative adversarial user model can better explain user behavior than alternatives, and that the RL policy based on this model can lead to a better long-term reward for the user and a higher click rate for the system.

recommendation system, reward function, user model, (14 more...)

1812.10613

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Games (0.48)
Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

BlinkML: Efficient Maximum Likelihood Estimation with Probabilistic Guarantees

Park, Yongjoo, Qing, Jingyi, Shen, Xiaoyang, Mozafari, Barzan

arXiv.org Machine Learning, Dec-26-2018

The rising volume of datasets has made training machine learning (ML) models a major computational cost in the enterprise. Given the iterative nature of model and parameter tuning, many analysts use a small sample of their entire data during their initial stage of analysis to make quick decisions (e.g., what features or hyperparameters to use) and use the entire dataset only in later stages (i.e., when they have converged to a specific model). This sampling, however, is performed in an ad-hoc fashion. Most practitioners cannot precisely capture the effect of sampling on the quality of their model, and eventually on their decision-making process during the tuning phase. Moreover, without systematic support for sampling operators, many optimizations and reuse opportunities are lost. In this paper, we introduce BlinkML, a system for fast, quality-guaranteed ML training. BlinkML allows users to make error-computation tradeoffs: instead of training a model on their full data (i.e., full model), BlinkML can quickly train an approximate model with quality guarantees using a sample. The quality guarantees ensure that, with high probability, the approximate model makes the same predictions as the full model. BlinkML currently supports any ML model that relies on maximum likelihood estimation (MLE), which includes Generalized Linear Models (e.g., linear regression, logistic regression, max entropy classifier, Poisson regression) as well as PPCA (Probabilistic Principal Component Analysis). Our experiments show that BlinkML can speed up the training of large-scale ML tasks by 6.26x-629x while guaranteeing the same predictions, with 95% probability, as the full model.

accuracy, approximate model, blinkml, (16 more...)

doi: 10.1145/3299869.3300077

1812.10564

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.05)
(4 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Generalized Score Matching for Non-Negative Data

Yu, Shiqing, Drton, Mathias, Shojaie, Ali

arXiv.org Machine Learning, Dec-26-2018

A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen [2005] avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over $\mathbb{R}^m$. Hyvärinen [2007] extended the approach to distributions supported on the non-negative orthant, $\mathbb{R}_+^m$. In this paper, we give a generalized form of score matching for non-negative data that improves estimation efficiency. As an example, we consider a general class of pairwise interaction models. Addressing an overlooked inexistence problem, we generalize the regularized score matching method of Lin et al. [2016] and improve its theoretical guarantees for non-negative Gaussian graphical models.
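To make the "closed-form estimates" remark concrete, here is a minimal sketch of the original (unrestricted) score matching idea for a one-dimensional Gaussian, not the generalized non-negative estimator proposed in the paper. It assumes the model score is psi(x) = -k (x - mu), so the empirical objective is the sample mean of 0.5 * psi(x)^2 + psi'(x); the function names `score_matching_gaussian` and `empirical_objective` are hypothetical and chosen for illustration only.

```python
from statistics import fmean


def empirical_objective(xs, mu, k):
    """Empirical score matching objective for a 1-D Gaussian N(mu, 1/k).

    The model score is psi(x) = -k * (x - mu) and its derivative is -k,
    so the objective is mean(0.5 * psi(x)**2 + psi'(x)).
    No normalizing constant appears anywhere.
    """
    return fmean([0.5 * (k * (x - mu)) ** 2 - k for x in xs])


def score_matching_gaussian(xs):
    """Closed-form minimizer of the objective above.

    Setting the derivatives with respect to mu and k to zero gives
    mu = sample mean and k = 1 / (mean squared deviation).
    """
    mu = fmean(xs)
    mean_sq_dev = fmean([(x - mu) ** 2 for x in xs])
    return mu, 1.0 / mean_sq_dev


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    mu_hat, k_hat = score_matching_gaussian(data)
    print(mu_hat, k_hat, empirical_objective(data, mu_hat, k_hat))
```

For this toy model the estimates coincide with the maximum likelihood ones; the point of the sketch is only that the objective can be minimized without ever evaluating the normalizing constant.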

auc 0, estimator, truncation point power 0, (14 more...)

1812.10551

Country:

North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(2 more...)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Solving the Empirical Bayes Normal Means Problem with Correlated Noise

Sun, Lei, Stephens, Matthew

arXiv.org Machine Learning, Dec-24-2018

The Normal Means problem plays a fundamental role in many areas of modern high-dimensional statistics, both in theory and practice. The Empirical Bayes (EB) approach to solving this problem has been shown to be highly effective, again both in theory and practice. However, almost all EB treatments of the Normal Means problem assume that the observations are independent. In real-world applications correlations are ubiquitous, and these correlations can grossly distort EB estimates. Here, exploiting theory from Schwartzman (2010), we develop new EB methods for solving the Normal Means problem that take account of unknown correlations among observations. We provide practical software implementations of these methods, and illustrate them in the context of large-scale multiple testing problems and False Discovery Rate (FDR) control. In realistic numerical experiments our methods compare favorably with other commonly used multiple testing methods.

correlation, empirical distribution, statistics, (14 more...)

1812.07488

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Greenland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)
Health & Medicine > Therapeutic Area > Hematology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

High-Dimensional Poisson DAG Model Learning Using $\ell_1$-Regularized Regression

Park, Gunwoong, Park, Sion

arXiv.org Machine Learning, Dec-24-2018

In this paper, we develop a new approach to learning high-dimensional Poisson directed acyclic graphical (DAG) models from only observational data without strong assumptions such as faithfulness and strong sparsity. A key component of our method is to decouple the ordering estimation and the parent search, so that each problem can be efficiently addressed using $\ell_1$-regularized regression and the mean-variance relationship. We show that sample size $n = \Omega( d^{2} \log^{9} p)$ is sufficient for our polynomial time Mean-variance Ratio Scoring (MRS) algorithm to recover the true directed graph, where $p$ is the number of nodes and $d$ is the maximum indegree. We verify through simulations that our algorithm is statistically consistent in the high-dimensional $p>n$ setting, and performs well compared to state-of-the-art ODS, GES, and MMHC algorithms. We also demonstrate through multivariate real count data that our MRS algorithm is well-suited to estimating DAG models for multivariate count data in comparison to other methods used for discrete data.
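One way to read the "mean-variance relationship" is through the law of total variance. The derivation below is a sketch under an assumption not spelled out in the abstract: each node $X_j$ is Poisson given its parents $\mathrm{Pa}_j$, with a rate written here as $g_j(\mathrm{Pa}_j)$ (the symbol $g_j$ is introduced only for illustration).

```latex
\begin{align*}
\operatorname{Var}(X_j)
  &= \mathbb{E}\bigl[\operatorname{Var}(X_j \mid \mathrm{Pa}_j)\bigr]
   + \operatorname{Var}\bigl(\mathbb{E}[X_j \mid \mathrm{Pa}_j]\bigr) \\
  &= \mathbb{E}\bigl[g_j(\mathrm{Pa}_j)\bigr]
   + \operatorname{Var}\bigl(g_j(\mathrm{Pa}_j)\bigr) \\
  &= \mathbb{E}[X_j] + \operatorname{Var}\bigl(g_j(\mathrm{Pa}_j)\bigr)
  \;\ge\; \mathbb{E}[X_j].
\end{align*}
```

Under this assumption, variance equals mean exactly when the rate does not vary (for example, after conditioning on all parents), and exceeds it otherwise, which suggests why a variance-to-mean ratio can carry information about the ordering of the nodes.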

algorithm, graph, poisson dag model, (14 more...)

1810.02501

Country:

North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > Michigan (0.04)
North America > United States > Florida > Monroe County > Key West (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Sports > Baseball (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Inference in Graded Bayesian Networks

Leppert, Robert, Zimmermann, Karl-Heinz

arXiv.org Machine Learning, Dec-23-2018

Machine learning provides algorithms that can learn from data and make inferences or predictions on data. Bayesian networks are a class of graphical models that allow one to represent a collection of random variables and their conditional dependencies by directed acyclic graphs. In this paper, an inference algorithm for the hidden random variables of a Bayesian network is given by using the tropicalization of the marginal distribution of the observed variables. By restricting the topological structure to graded networks, an inference algorithm for graded Bayesian networks will be established that evaluates the hidden random variables rank by rank and in this way yields the most probable states of the hidden variables. This algorithm can be viewed as a generalized version of the Viterbi algorithm for graded Bayesian networks.
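For readers unfamiliar with the Viterbi algorithm that the paper generalizes, the following is a minimal sketch of the classical version on a chain-structured hidden Markov model, not the graded-network algorithm itself. Working with log-probabilities and replacing sums by maxima is the "tropical" (max-plus) step. The function name `viterbi` and the toy two-state model are assumptions made for illustration.

```python
import math


def viterbi(init, trans, emit, observations):
    """Most probable hidden state path of a chain model (classical Viterbi).

    init[s]      probability that the first hidden state is s
    trans[r][s]  probability of moving from hidden state r to s
    emit[s][o]   probability that state s emits observation o
    Scores are log-probabilities, so products become sums and the
    marginal's sum over states becomes a max (max-plus arithmetic).
    """
    def log(p):
        return math.log(p) if p > 0 else -math.inf

    n_states = len(init)
    score = [log(init[s]) + log(emit[s][observations[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][obs]))
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path, math.exp(max(score))


if __name__ == "__main__":
    # Hypothetical two-state model: states 0 and 1, observations 0, 1, 2.
    init = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
    print(viterbi(init, trans, emit, [0, 1, 2]))
```

Each time step plays the role of one "rank" that is evaluated before moving on to the next, which is the part of the procedure the abstract describes as being carried over to graded networks.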

algorithm, bayesian network, inference algorithm, (13 more...)

1901.01837

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Sweden > Östergötland County > Linköping (0.05)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Computations in Stochastic Acceptors

Zimmermann, Karl-Heinz

arXiv.org Machine Learning, Dec-23-2018

Machine learning provides algorithms that can learn from data and make inferences or predictions on data. Stochastic acceptors or probabilistic automata are stochastic automata without output that can model components in machine learning scenarios. In this paper, we provide dynamic programming algorithms for the computation of input marginals and the acceptance probabilities in stochastic acceptors. Furthermore, we specify an algorithm for the parameter estimation of the conditional probabilities using the expectation-maximization technique and a more efficient implementation related to the Baum-Welch algorithm.
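As an illustration of the kind of dynamic program mentioned here, the sketch below computes the acceptance probability of an input word with a forward pass. It assumes a common formalization that the abstract does not spell out: an initial state distribution, one row-stochastic transition matrix per input symbol, and a set of final states. The function name `acceptance_probability` and the toy automaton are hypothetical.

```python
def acceptance_probability(init, trans, final, word):
    """Probability that the acceptor ends in a final state after reading word.

    init          initial distribution over states (list of floats)
    trans[a]      row-stochastic matrix for input symbol a;
                  trans[a][r][s] is the probability of moving r -> s on a
    final         set of accepting state indices
    The forward vector is updated once per symbol, so the cost is
    O(len(word) * n_states**2) instead of enumerating all state paths.
    """
    n_states = len(init)
    forward = list(init)
    for symbol in word:
        matrix = trans[symbol]
        forward = [
            sum(forward[r] * matrix[r][s] for r in range(n_states))
            for s in range(n_states)
        ]
    return sum(forward[s] for s in final)


if __name__ == "__main__":
    # Hypothetical two-state acceptor over the alphabet {"a", "b"}.
    init = [1.0, 0.0]
    trans = {
        "a": [[0.5, 0.5], [0.0, 1.0]],
        "b": [[1.0, 0.0], [0.3, 0.7]],
    }
    print(acceptance_probability(init, trans, {1}, "ab"))
```

The same forward vector is the building block that expectation-maximization procedures of the Baum-Welch type combine with a backward pass when estimating the transition probabilities.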

algorithm, probability, stochastic acceptor, (14 more...)

1812.09687

Country:

Europe > Germany > Hamburg (0.04)
North America > United States > New York > Nassau County > Mineola (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)

Universal Supervised Learning for Individual Data

Fogel, Yaniv, Feder, Meir

arXiv.org Machine Learning, Dec-22-2018

Universal supervised learning is considered from an information theoretic point of view following the universal prediction approach, see Merhav and Feder (1998). We consider the standard supervised "batch" learning where prediction is done on a test sample once the entire training data is observed, and the individual setting where the features and labels, both in the training and test, are specific individual quantities. The information theoretic approach naturally uses the self-information loss or log-loss. Our results provide universal learning schemes that compete with a "genie" (or reference) that knows the true test label. In particular, it is demonstrated that the main proposed scheme, termed Predictive Normalized Maximum Likelihood (pNML), is a robust learning solution that outperforms the current leading approach based on Empirical Risk Minimization (ERM). Furthermore, the pNML construction provides a pointwise indication for the learnability of the specific test challenge with the given training examples.
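A toy sketch may help make the pNML construction concrete. It assumes the usual normalized-maximum-likelihood recipe (for each candidate test label, refit the maximum likelihood model on the training data plus that label, take the probability it assigns to the label, then normalize over labels) and applies it to the simplest possible hypothesis class, a Bernoulli model with no features. The function name `pnml_bernoulli` is hypothetical, and the log of the normalizer is used here as the learnability indicator, which is an interpretation rather than something the abstract states.

```python
import math


def pnml_bernoulli(train_labels):
    """pNML prediction for a feature-free Bernoulli model (toy example).

    For each candidate test label y, the ML estimate is refit on the
    training labels plus y, and the probability it gives to y is recorded.
    Normalizing these values over y yields the pNML distribution; the log
    of the normalizer is returned as a pointwise regret.
    """
    n = len(train_labels)
    k = sum(train_labels)  # number of ones in the training data
    unnormalized = {}
    for y in (0, 1):
        theta = (k + y) / (n + 1)  # ML estimate with the test label included
        unnormalized[y] = theta if y == 1 else 1.0 - theta
    total = sum(unnormalized.values())
    probs = {y: p / total for y, p in unnormalized.items()}
    return probs, math.log(total)


if __name__ == "__main__":
    print(pnml_bernoulli([1, 1, 0]))
```

In this toy case the prediction reduces to (k + 1) / (n + 2) for label 1, and the regret log((n + 2) / (n + 1)) shrinks as the training set grows, matching the intuition that more data makes the test point easier to learn.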

learner, pnml, prediction, (14 more...)

1812.0952

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre: Research Report (0.84)

Industry: Education (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching

Yeh, Chih-Kuan, Chen, Jianshu, Yu, Chengzhu, Yu, Dong

arXiv.org Machine Learning, Dec-22-2018

We consider the problem of training speech recognition systems without using any labeled data, under the assumption that the learner can only access the input utterances and a phoneme language model estimated from a non-overlapping corpus. We propose a fully unsupervised learning algorithm that alternates between solving two sub-problems: (i) learning a phoneme classifier for a given set of phoneme segmentation boundaries, and (ii) refining the phoneme boundaries based on a given classifier. To solve the first sub-problem, we introduce a novel unsupervised cost function named Segmental Empirical Output Distribution Matching, which generalizes the work in (Liu et al., 2017) to segmental structures. For the second sub-problem, we develop an approximate MAP approach to refining the boundaries obtained from Wang et al. (2017). Experimental results on the TIMIT dataset demonstrate the success of this fully unsupervised phoneme recognition system, which achieves a phone error rate (PER) of 41.6%. Although it is still far away from the state-of-the-art supervised systems, we show that with oracle boundaries and a matching language model, the PER could be improved to 32.5%. This performance approaches the supervised system of the same model architecture, demonstrating the great potential of the proposed method.

boundary, language model, speech recognition, (14 more...)

1812.09323

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Washington > King County > Bellevue (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > India > Telangana > Hyderabad (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Ecological Data Analysis Based on Machine Learning Algorithms

Siraj-Ud-Doula, Md., Alam, Md. Ashad

arXiv.org Machine Learning, Dec-21-2018

Classification is an important supervised machine learning method and a necessary yet challenging task in ecological research. It offers a way to classify a dataset into subsets that share common patterns. Notably, there are many classification algorithms to choose from, each making certain assumptions about the data and about how the classification should be formed. In this paper, we apply eight machine learning classification algorithms (Decision Trees, Random Forest, Artificial Neural Network, Support Vector Machine, Linear Discriminant Analysis, k-nearest neighbors, Logistic Regression, and Naive Bayes) to ecological data. The goal of this study is to compare these algorithms on an ecological dataset by their classification accuracy. We conclude that Linear Discriminant Analysis and k-nearest neighbors perform best among the methods compared.

algorithm, classification, classification algorithm, (13 more...)

1812.09138

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.04)
Europe > United Kingdom > England > West Sussex (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
