AITopics | Bayesian Learning

Collaborating Authors

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Optimized Realization of Bayesian Networks in Reduced Normal Form using Latent Variable Model

Di Gennaro, Giovanni, Buonanno, Amedeo, Palmieri, Francesco A. N.

arXiv.org Machine LearningJan-18-2019

Bayesian networks in their Factor Graph Reduced Normal Form (FGrn) are a powerful paradigm for implementing inference graphs. Unfortunately, the computational and memory costs of these networks may be considerable, even for relatively small networks, and this is one of the main reasons why these structures have often been underused in practice. In this work, through a detailed algorithmic and structural analysis, various solutions for cost reduction are proposed. An online version of the classic batch learning algorithm is also analyzed, showing very similar results (in an unsupervised context); which is essential even if multilevel structures are to be built. The solutions proposed, together with the possible online learning algorithm, are included in a C++ library that is quite efficient, especially if compared to the direct use of the well-known sum-product and Maximum Likelihood (ML) algorithms. The results are discussed with particular reference to a Latent Variable Model (LVM) structure.

algorithm, siso block, vector, (13 more...)

arXiv.org Machine Learning

1901.06201

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Deep Learning Finds Fake News with 97% Accuracy

#artificialintelligenceJan-17-2019, 21:17:07 GMT

That means the pooling layer computes a feature vector of size 128 which is passed into dense layers of the feedforward network as we mentioned above. The overall structure of the DNN can be understood as a preprocessor defined in the first part that is being trained to map text sequences into feature vectors in such a way that the weights of the second part can be trained to obtain optimal classification results from the overall network. More details on the implementation and text preprocessing can be found in my GitHub repository for this project. I trained this network for 10 epochs with a batch size of 128 using an 80-20 training/hold-out set. A couple of notes on additional parameters: The vast majority of documents in this collection is of length 5000 or less. So for the maximum input sequence length for the DNN I chose 5000 words. There are roughly 100,000 unique words in this collection of documents. I arbitrarily limited the dictionary that the DNN can learn to 25% of that: 25,000 words. Finally, for the embedding dimension, I chose 300 simply because that is the default embedding dimension for both word2vec and GloVe.

artificial intelligence, machine learning, vector, (17 more...)

#artificialintelligence

Industry: Media > News (0.44)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

Add feedback

Probabilistic symmetry and invariant neural networks

Bloem-Reddy, Benjamin, Teh, Yee Whye

arXiv.org Machine LearningJan-17-2019

In an effort to improve the performance of deep neural networks in data-scarce, non-i.i.d., or unsupervised settings, much recent research has been devoted to encoding invariance under symmetry transformations into neural network architectures. We treat the neural network input and output as random variables, and consider group invariance from the perspective of probabilistic symmetry. Drawing on tools from probability and statistics, we establish a link between functional and probabilistic symmetry, and obtain generative functional representations of joint and conditional probability distributions that are invariant or equivariant under the action of a compact group. Those representations completely characterize the structure of neural networks that can be used to model such distributions and yield a general program for constructing invariant stochastic or deterministic neural networks. We develop the details of the general program for exchangeable sequences and arrays, recovering a number of recent examples as special cases.

conditional distribution, invariant, sequence, (16 more...)

arXiv.org Machine Learning

1901.06082

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Applying SVGD to Bayesian Neural Networks for Cyclical Time-Series Prediction and Inference

Hu, Xinyu, Szerlip, Paul, Karaletsos, Theofanis, Singh, Rohit

arXiv.org Machine LearningJan-17-2019

A regression-based BNN model is proposed to predict spatiotemporal quantities like hourly rider demand with calibrated uncertainties. The main contributions of this paper are (i) A feed-forward deterministic neural network (DetNN) architecture that predicts cyclical time series data with sensitivity to anomalous forecasting events; (ii) A Bayesian framework applying SVGD to train large neural networks for such tasks, capable of producing time series predictions as well as measures of uncertainty surrounding the predictions. Experiments show that the proposed BNN reduces average estimation error by 10% across 8 U.S. cities compared to a fine-tuned multilayer perceptron (MLP), and 4% better than the same network architecture trained without SVGD.

neural network, prediction variability, svgd, (9 more...)

arXiv.org Machine Learning

1901.05906

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

Theory of Minds: Understanding Behavior in Groups Through Inverse Planning

Shum, Michael, Kleiman-Weiner, Max, Littman, Michael L., Tenenbaum, Joshua B.

arXiv.org Artificial IntelligenceJan-17-2019

Human social behavior is structured by relationships. We form teams, groups, tribes, and alliances at all scales of human life. These structures guide multi-agent cooperation and competition, but when we observe others these underlying relationships are typically unobservable and hence must be inferred. Humans make these inferences intuitively and flexibly, often making rapid generalizations about the latent relationships that underlie behavior from just sparse and noisy observations. Rapid and accurate inferences are important for determining who to cooperate with, who to compete with, and how to cooperate in order to compete. Towards the goal of building machine-learning algorithms with human-like social intelligence, we develop a generative model of multi-agent action understanding based on a novel representation for these latent relationships called Composable Team Hierarchies (CTH). This representation is grounded in the formalism of stochastic games and multi-agent reinforcement learning. We use CTH as a target for Bayesian inference yielding a new algorithm for understanding behavior in groups that can both infer hidden relationships as well as predict future actions for multiple agents interacting together. Our algorithm rapidly recovers an underlying causal model of how agents relate in spatial stochastic games from just a few observations. The patterns of inference made by this algorithm closely correspond with human judgments and the algorithm makes the same rapid generalizations that people do.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1901.06085

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Implementing Naive Bayes for Sentiment Analysis in Python

#artificialintelligenceJan-16-2019, 08:32:38 GMT

The Naive Bayes Classifier is a well known machine learning classifier with applications in Natural Language Processing (NLP) and other areas. Despite its simplicity, it is able to achieve above average performance in different tasks like sentiment analysis. Today we will elaborate on the core principles of this model and then implement it in Python. In the end, we will see how well we do on a dataset of 2000 movie reviews. The math behind this model isn't particularly difficult to understand if you are familiar with some of the math notation.

artificial intelligence, machine learning, natural language, (15 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A Primer on PAC-Bayesian Learning

Guedj, Benjamin

arXiv.org Machine LearningJan-16-2019

Generalized Bayesian learning algorithms are increasingly popular in machine learning, due to their PAC generalization properties and flexibility. The present paper aims at providing a self-contained survey on the resulting PAC-Bayes framework and some of its main theoretical and algorithmic developments.

algorithm, bayesian, pac-bayes, (14 more...)

arXiv.org Machine Learning

1901.05353

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Overview (0.65)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

A review of single-source unsupervised domain adaptation

Kouw, Wouter M., Loog, Marco

arXiv.org Machine LearningJan-16-2019

Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the questions: when and how a classifier can learn from a source domain and generalize to a target domain. As for when, we review conditions that allow for cross-domain generalization error bounds. As for how, we present a categorization of approaches, divided into, what we refer to as, sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods focus on mapping, projecting and representing features such that a source classifier performs well on the target domain and inference-based methods focus on alternative estimators, such as robust, minimax or Bayesian. Our categorization highlights recurring ideas and raises a number of questions important to further research.

adaptation, classifier, domain adaptation, (12 more...)

arXiv.org Machine Learning

1901.05335

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Europe > Hungary > Budapest > Budapest (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
(6 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Biomedical Informatics (1.00)
(7 more...)

Add feedback

Cost Sensitive Learning in the Presence of Symmetric Label Noise

Tripathi, Sandhya, Hemachandra, N.

arXiv.org Machine LearningJan-16-2019

In binary classification framework, we are interested in making cost sensitive label predictions in the presence of uniform/symmetric label noise. We first observe that $0$-$1$ Bayes classifiers are not (uniform) noise robust in cost sensitive setting. To circumvent this impossibility result, we present two schemes; unlike the existing methods, our schemes do not require noise rate. The first one uses $\alpha$-weighted $\gamma$-uneven margin squared loss function, $l_{\alpha, usq}$, which can handle cost sensitivity arising due to domain requirement (using user given $\alpha$) or class imbalance (by tuning $\gamma$) or both. However, we observe that $l_{\alpha, usq}$ Bayes classifiers are also not cost sensitive and noise robust. We show that regularized ERM of this loss function over the class of linear classifiers yields a cost sensitive uniform noise robust classifier as a solution of a system of linear equations. We also provide a performance bound for this classifier. The second scheme that we propose is a re-sampling based scheme that exploits the special structure of the uniform noise models and uses in-class probability estimates. Our computational experiments on some UCI datasets with class imbalance show that classifiers of our two schemes are on par with the existing methods and in fact better in some cases w.r.t. Accuracy and Arithmetic Mean, without using/tuning noise rate. We also consider other cost sensitive performance measures viz., F measure and Weighted Cost for evaluation.

algorithm, classifier, usq, (15 more...)

arXiv.org Machine Learning

1901.02271

Country: Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Add feedback

Exploiting Synchronized Lyrics And Vocal Features For Music Emotion Detection

Parisi, Loreto, Francia, Simone, Olivastri, Silvio, Tavella, Maria Stella

arXiv.org Artificial IntelligenceJan-15-2019

Support Vector Machines are employed engaging playlists according to sentiment and with good results also for multilabel classification [30], emotions. While previous works were mostly based more recently also Convolutional Neural Networks were on audio for music discovery and playlists generation, used in this field [45]. Lyrics-based approaches, on the we take advantage of our synchronized lyrics dataset other hand, make use of Recurrent Neural Networks architectures to combine text representations and music features in (like LSTM [13]) for performing text classification a novel way; we therefore introduce the Synchronized [46, 47]. The idea of using lyrics combined with Lyrics Emotion Dataset. Unlike other approaches that voice only audio signals is done in [29], where emotion randomly exploited the audio samples and the whole recognition is performed by using textual and speech data, text, our data is split according to the temporal information instead of visual ones. Measuring and assigning emotions provided by the synchronization between lyrics to music is not a straightforward task: the sentiment/mood and audio. This work shows a comparison between associated with a song can be derived by a combination of text-based and audio-based deep learning classification many features, moreover, emotions expressed by a musical models using different techniques from Natural Language excerpt and by its corresponding lyrics do not always Processing and Music Information Retrieval domains.

artificial intelligence, classification, machine learning, (19 more...)

arXiv.org Artificial Intelligence

1901.04831

Genre: Research Report (0.64)

Industry:

Media > Music (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback