AITopics | Bayesian Learning

Bayesian Learning

A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). (Wikipedia)

Multi-channel discourse as an indicator for Bitcoin price and volume movements

Kennis, Marvin Aron

arXiv.org Machine Learning, Nov-6-2018

This research aims to identify how Bitcoin-related news publications and online discourse are expressed in Bitcoin exchange movements of price and volume. Being inherently digital, all Bitcoin-related fundamental data (from exchanges, as well as transactional data directly from the blockchain) is available online, something that is not true for traditional businesses or currencies traded on exchanges. This makes Bitcoin an interesting subject for such research, as it enables the mapping of sentiment to fundamental events that might otherwise be inaccessible. Furthermore, Bitcoin discussion largely takes place on online forums and chat channels. In stock trading, the value of sentiment data in trading decisions has been demonstrated numerous times [1] [2] [3], and this research aims to determine whether there is value in such data for Bitcoin trading models. To achieve this, data for the year 2015 were collected from Bitcointalk.org (the biggest Bitcoin forum by post volume), established news sources such as Bloomberg and the Wall Street Journal, the complete /r/btc and /r/Bitcoin subreddits, and the bitcoin-otc and bitcoin-dev IRC channels. By analyzing this data on sentiment and volume, we find weak to moderate correlations between forum, news, and Reddit sentiment and movements in price and volume from 1 to 5 days after the sentiment was expressed. A Granger causality test confirms the predictive causality of the sentiment on the daily percentage price and volume movements, and at the same time underscores the predictive causality of market movements on sentiment expressions in online communities.
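The lagged correlations mentioned in the abstract (sentiment on one day against market movement 1 to 5 days later) can be illustrated with a minimal sketch. This is not the paper's pipeline: the function names `pearson` and `lagged_correlations` and the toy series are assumptions made up for illustration, and the sketch computes plain Pearson correlations rather than a Granger causality test.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def lagged_correlations(sentiment, returns, max_lag=5):
    """Correlate sentiment on day t with returns on day t + lag."""
    out = {}
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` sentiment values and the first `lag` returns
        # so that the two slices line up day t with day t + lag.
        out[lag] = pearson(sentiment[:-lag], returns[lag:])
    return out


# Toy data (made up): returns echo sentiment exactly two days later.
sentiment = [0.1, -0.3, 0.5, 0.2, -0.4, 0.6, -0.1, 0.3, -0.5, 0.4, 0.0, -0.2]
returns = [0.0, 0.0] + [2 * s for s in sentiment[:-2]]

corr = lagged_correlations(sentiment, returns)
best_lag = max(corr, key=lambda k: abs(corr[k]))
print(best_lag, round(corr[best_lag], 3))
```

On real data the correlations would be far weaker than in this constructed example, and a significant lagged correlation alone does not establish predictive causality in the Granger sense.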

machine learning, natural language, sentiment, (21 more...)

arXiv.org Machine Learning

1811.03146

Country:

Europe (0.46)
North America > United States (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Deep Weighted Averaging Classifiers

Card, Dallas, Zhang, Michael, Smith, Noah A.

arXiv.org Machine Learning, Nov-6-2018

Recent advances in deep learning have achieved impressive gains in classification accuracy on a variety of types of data, including images and text. Despite these gains, however, concerns have been raised about the interpretability of these models, as well as issues related to calibration and robustness. In this paper we propose a simple way to modify any conventional deep architecture to automatically provide more transparent explanations for classification decisions, as well as an intuitive notion of the credibility of each prediction. Specifically, we draw on ideas from nonparametric kernel regression, and propose to predict labels based on a weighted sum of training instances, where the weights are determined by distance in a learned instance-embedding space. Working within the framework of conformal methods, we propose a new measure of nonconformity suggested by our model, and experimentally validate the accompanying theoretical expectations, demonstrating improved transparency, controlled error rates, and robustness to out-of-domain data, without compromising on accuracy or calibration.
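The prediction rule described in the abstract, a weighted sum over training instances with weights set by distance in an embedding space, can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: the Gaussian kernel form, the function name `weighted_average_predict`, and the hand-written 2-D "embeddings" are all hypothetical, and in the paper the embeddings would come from a learned deep network.

```python
from math import exp


def weighted_average_predict(query, train_embeddings, train_labels, num_classes):
    """Return per-class probabilities as a normalized sum of kernel weights."""
    scores = [0.0] * num_classes
    for z, y in zip(train_embeddings, train_labels):
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, z))
        # Gaussian kernel weight (assumed form): nearby training points count more.
        scores[y] += exp(-sq_dist)
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical embeddings of four training instances and their labels.
train_embeddings = [(0.0, 0.0), (0.1, 0.2), (2.0, 2.0), (2.1, 1.9)]
train_labels = [0, 0, 1, 1]

probs = weighted_average_predict((0.2, 0.1), train_embeddings, train_labels, 2)
print(probs)
```

Because each prediction is a sum of per-instance weights, the largest weights point directly at the training examples that drove the decision, which is the source of the transparency the abstract refers to.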

artificial intelligence, machine learning, prediction, (21 more...)

arXiv.org Machine Learning

1811.02579

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
(2 more...)

Distributionally Robust Graphical Models

Fathony, Rizal, Rezaei, Ashkan, Bashiri, Mohammad Ali, Zhang, Xinhua, Ziebart, Brian D.

arXiv.org Artificial Intelligence, Nov-6-2018

In many structured prediction problems, complex relationships between variables are compactly defined using graphical structures. The most prevalent graphical prediction methods (probabilistic graphical models and large-margin methods) have their own distinct strengths but also possess significant drawbacks. Conditional random fields (CRFs) are Fisher consistent, but they do not permit integration of customized loss metrics into their learning process. Large-margin models, such as structured support vector machines (SSVMs), have the flexibility to incorporate customized loss metrics, but lack Fisher consistency guarantees. We present adversarial graphical models (AGM), a distributionally robust approach for constructing a predictor that performs robustly for a class of data distributions defined using a graphical structure. Our approach enjoys both the flexibility of incorporating customized loss metrics into its design and the statistical guarantee of Fisher consistency. We present exact learning and prediction algorithms for AGM with time complexity similar to existing graphical models and show the practical benefits of our approach with experiments.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1811.02728

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Concept Learning with Energy-Based Models

Mordatch, Igor

arXiv.org Artificial Intelligence, Nov-6-2018

Many hallmarks of human intelligence, such as generalizing from limited experience, abstract reasoning and planning, analogical reasoning, creative problem solving, and the capacity for language, require the ability to consolidate experience into concepts, which act as basic building blocks of understanding and reasoning. We present a framework that defines a concept by an energy function over events in the environment, as well as an attention mask over entities participating in the event. Given a few demonstration events, our method uses an inference-time optimization procedure to generate events involving similar concepts or to identify entities involved in the concept. We evaluate our framework on learning visual, quantitative, relational, and temporal concepts from demonstration events in an unsupervised manner. Our approach is able to successfully generate and identify concepts in a few-shot setting, and the resulting learned concepts can be reused across environments. Example videos of our results are available at sites.google.com/site/energyconceptmodels

artificial intelligence, machine learning, optimization, (16 more...)

arXiv.org Artificial Intelligence

1811.02486

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

TzK Flow - Conditional Generative Model

Livne, Micha, Fleet, David J.

arXiv.org Machine Learning, Nov-5-2018

We introduce TzK (pronounced "task"), a conditional flow-based encoder/decoder generative model, formulated in terms of maximum likelihood (ML). TzK offers efficient approximation of arbitrary data sample distributions (similar to GAN and flow-based ML), and stable training (similar to VAE and ML), while avoiding variational approximations (similar to ML). TzK exploits meta-data to facilitate a bottleneck, similar to autoencoders, thereby producing a low-dimensional representation. Unlike autoencoders, our bottleneck does not limit model expressiveness, similar to flow-based ML. Supervised, unsupervised, and semi-supervised learning are supported by replacing missing observations with samples from learned priors. We demonstrate TzK by jointly training on MNIST and Omniglot with minimal preprocessing and weak supervision, with results comparable to the state of the art.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1811.01837

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Low-Rank Phase Retrieval via Variational Bayesian Learning

Liu, Kaihui, Wang, Jiayi, Xing, Zhengli, Yang, Linxiao, Fang, Jun

arXiv.org Machine Learning, Nov-5-2018

In this paper, we consider the problem of low-rank phase retrieval whose objective is to estimate a complex low-rank matrix from magnitude-only measurements. We propose a hierarchical prior model for low-rank phase retrieval, in which a Gaussian-Wishart hierarchical prior is placed on the underlying low-rank matrix to promote the low-rankness of the matrix. Based on the proposed hierarchical model, a variational expectation-maximization (EM) algorithm is developed. The proposed method is less sensitive to the choice of the initialization point and works well with random initialization. Simulation results are provided to illustrate the effectiveness of the proposed algorithm.

artificial intelligence, bayesian inference, machine learning, (14 more...)

arXiv.org Machine Learning

1811.01574

Country: Asia > China (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

Foerster, Jakob N., Song, Francis, Hughes, Edward, Burch, Neil, Dunning, Iain, Whiteson, Shimon, Botvinick, Matthew, Bowling, Michael

arXiv.org Artificial Intelligence, Nov-4-2018

When observing the actions of others, humans carry out inferences about why the others acted as they did, and what this implies about their view of the world. Humans also use the fact that their actions will be interpreted in this manner when observed by others, allowing them to act informatively and thereby communicate efficiently with others. Although learning algorithms have recently achieved superhuman performance in a number of two-player, zero-sum games, scalable multi-agent reinforcement learning algorithms that can discover effective strategies and conventions in complex, partially observable settings have proven elusive. We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment. Together with the public belief, this Bayesian update effectively defines a new Markov decision process, the public belief MDP, in which the action space consists of deterministic partial policies, parameterised by deep neural networks, that can be sampled for a given public state. It exploits the fact that an agent acting only on this public belief state can still learn to use its private information if the action space is augmented to be over partial policies mapping private information into environment actions. The Bayesian update is also closely related to the theory of mind reasoning that humans carry out when observing others' actions. We first validate BAD on a proof-of-principle two-step matrix game, where it outperforms traditional policy gradient methods. We then evaluate BAD on the challenging, cooperative partial-information card game Hanabi, where in the two-player setting the method surpasses all previously published learning and hand-coded approaches.
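The core idea of a public belief that conditions on an observed action can be sketched with an exact Bayes update over a tiny set of private states. This is only an illustration under stated assumptions: the abstract describes an approximate update with policies parameterised by deep networks, whereas the function `update_public_belief`, the card colours, and the action names below are hypothetical.

```python
def update_public_belief(prior, policy, observed_action):
    """Bayes rule: P(private | action) is proportional to P(action | private) * P(private).

    `policy` maps each private state to the action taken in that state,
    i.e. a deterministic partial policy, so the likelihood is 0 or 1.
    """
    unnormalized = {
        h: prior[h] * (1.0 if policy[h] == observed_action else 0.0)
        for h in prior
    }
    z = sum(unnormalized.values())
    if z == 0.0:
        raise ValueError("observed action has zero probability under the policy")
    return {h: p / z for h, p in unnormalized.items()}


# Hypothetical public prior over another agent's private card.
prior = {"red": 0.5, "blue": 0.3, "green": 0.2}
# Hypothetical deterministic partial policy: private state -> action.
policy = {"red": "hint", "blue": "play", "green": "hint"}

posterior = update_public_belief(prior, policy, "hint")
print(posterior)
```

Observing "hint" rules out the "blue" card and renormalizes the remaining mass, which is how an action can carry information about private state to every observer.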

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

1811.01458

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Variational Bayes Inference in Digital Receivers

Tran, Viet Hung

arXiv.org Machine Learning, Nov-3-2018

The digital telecommunications receiver is an important context for inference methodology, the key objective being to minimize the expected loss function in recovering the transmitted information. For that criterion, the optimal decision is the Bayesian minimum-risk estimator. However, the computational load of the Bayesian estimator is often prohibitive and, hence, efficient computational schemes are required. The design of novel schemes, striking new balances between accuracy and computational load, is the primary concern of this thesis. Two popular techniques, one exact and one approximate, will be studied. The exact scheme is a recursive one, namely the generalized distributive law (GDL), whose purpose is to distribute all operators across the conditionally independent (CI) factors of the joint model, so as to reduce the total number of operators required. In a novel theorem derived in this thesis, GDL, if applicable, will be shown to guarantee such a reduction in all cases. An associated lemma also quantifies this reduction. For practical use, two novel algorithms, namely the no-longer-needed (NLN) algorithm and the generalized form of the Markovian Forward-Backward (FB) algorithm, recursively factorize and compute the CI factors of an arbitrary model, respectively. The approximate scheme is an iterative one, namely the Variational Bayes (VB) approximation, whose purpose is to find the independent (i.e. zero-order Markov) model closest to the true joint model in the minimum Kullback-Leibler divergence (KLD) sense. Despite being computationally efficient, this naive mean field approximation confers only modest performance for highly correlated models. A novel approximation, namely Transformed Variational Bayes (TVB), will be designed in the thesis in order to relax the zero-order constraint in the VB approximation, further reducing the KLD of the optimal approximation.
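For orientation, the naive mean-field form of the VB approximation mentioned above is usually written as below. The notation is an assumption, not taken from the thesis: \theta = (\theta_1, \dots, \theta_n) denotes the unknowns, x the observations, and the divergence is taken in the direction standard for variational Bayes.

```latex
% Fully factorized (zero-order Markov) approximation of the posterior
\tilde{q}(\theta) = \prod_{i=1}^{n} \tilde{q}_i(\theta_i),
\qquad
\tilde{q} = \arg\min_{q = \prod_i q_i}
  \mathrm{KL}\big( q(\theta) \,\big\|\, p(\theta \mid x) \big)

% Stationarity condition, iterated over i until convergence
\tilde{q}_i(\theta_i) \propto
  \exp\Big( \mathbb{E}_{\tilde{q}_{-i}} \big[ \log p(\theta, x) \big] \Big)
```

Each factor depends on the expectations of the others, which is why the scheme is iterative, and the forced independence between factors is what limits accuracy on highly correlated models.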

deterministic approximation, télécommunications, upstream oil & gas, (25 more...)

arXiv.org Machine Learning

1811.02506

Country:

Asia (0.67)
Europe > Netherlands (0.45)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(3 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Telecommunications (1.00)
Information Technology (1.00)
Leisure & Entertainment (0.92)
(2 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
(8 more...)

Large-scale Heteroscedastic Regression via Gaussian Process

Liu, Haitao, Ong, Yew-Soon, Cai, Jianfei

arXiv.org Machine Learning, Nov-3-2018

Heteroscedastic regression, which considers noise that varies across the input domain, has many applications in fields like machine learning and statistics. Here we focus on heteroscedastic Gaussian process (HGP) regression, which integrates the latent function and the noise together in a unified non-parametric Bayesian framework. Though flexible and powerful, HGP suffers from cubic time complexity, which strictly limits its application to big data. To improve the scalability of HGP, we first develop a variational sparse inference algorithm, named VSHGP, to handle large-scale datasets. Furthermore, to enhance the model's capability of capturing quick-varying features, we follow the Bayesian committee machine (BCM) formalism to distribute the learning over multiple local VSHGP experts with many inducing points, and aggregate their predictive distributions. The proposed distributed VSHGP (DVSHGP) (i) enables large-scale HGP regression via distributed computations, and (ii) achieves high model capability via localized experts and many inducing points. The superiority of the proposed DVSHGP over existing large-scale heteroscedastic/homoscedastic GPs is then verified using a synthetic dataset and three real-world datasets.
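The aggregation step of the Bayesian committee machine can be sketched for a single test point. This shows the standard BCM combination rule for Gaussian predictions under a zero-mean prior, not the DVSHGP method itself; the function name `bcm_aggregate` and the numbers are hypothetical.

```python
def bcm_aggregate(means, variances, prior_variance):
    """Combine M Gaussian expert predictions at one test point (standard BCM).

    The aggregated precision is the sum of expert precisions minus
    (M - 1) times the prior precision, which corrects for the prior
    being counted once per expert. A zero-mean prior is assumed.
    """
    m = len(means)
    precision = sum(1.0 / v for v in variances) - (m - 1) / prior_variance
    if precision <= 0.0:
        raise ValueError("aggregated precision must be positive")
    variance = 1.0 / precision
    mean = variance * sum(mu / v for mu, v in zip(means, variances))
    return mean, variance


# Two hypothetical local experts predicting at the same test input.
mean, variance = bcm_aggregate([1.0, 3.0], [0.5, 0.5], prior_variance=1.0)
print(mean, variance)
```

Each expert only needs its own subset of the data, so the expensive per-expert computations can run in parallel before this cheap combination step.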

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1811.01179

Genre: Research Report (0.50)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Understanding and Comparing Scalable Gaussian Process Regression for Big Data

Liu, Haitao, Cai, Jianfei, Ong, Yew-Soon, Wang, Yi

arXiv.org Machine Learning, Nov-3-2018

As a non-parametric Bayesian model which produces an informative predictive distribution, the Gaussian process (GP) has been widely used in various fields, like regression, classification and optimization. The cubic complexity of the standard GP, however, leads to poor scalability, which poses challenges in the era of big data. Hence, various scalable GPs have been developed in the literature in order to improve the scalability while retaining desirable prediction accuracy. This paper investigates the methodological characteristics and performance of representative global and local scalable GPs, including sparse approximations and local aggregations, from four main perspectives: scalability, capability, controllability and robustness. The numerical experiments on two toy examples and five real-world datasets with up to 250K points offer the following findings. In terms of scalability, most of the scalable GPs have a time complexity that is linear in the training size. In terms of capability, the sparse approximations capture the long-term spatial correlations, while the local aggregations capture the local patterns but suffer from over-fitting in some scenarios. In terms of controllability, we could improve the performance of sparse approximations by simply increasing the inducing size, but this is not the case for local aggregations. In terms of robustness, local aggregations are robust to various initializations of hyperparameters due to the local attention mechanism. Finally, we highlight that the proper hybrid of global and local scalable GPs may be a promising way to improve both the model capability and scalability for big data.
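The cubic cost of the standard GP comes from factorizing the n-by-n kernel matrix. A minimal exact GP regression sketch in plain Python makes that step visible; the RBF kernel, the noise level, the helper names, and the toy data are assumptions chosen for illustration, and none of the scalable methods compared in the paper are implemented here.

```python
from math import exp, sqrt


def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel on scalar inputs."""
    return exp(-0.5 * ((a - b) / lengthscale) ** 2)


def cholesky(a):
    """Lower-triangular L with L L^T = a. The nested loops cost O(n^3)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = sqrt(s) if i == j else s / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict_mean(xs, ys, x_star, noise=1e-4):
    """Posterior mean k_*^T (K + noise * I)^{-1} y of a zero-mean GP."""
    n = len(xs)
    k = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve_cholesky(cholesky(k), ys)
    return sum(rbf(x_star, xs[i]) * alpha[i] for i in range(n))


# Toy training data (made up).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
pred = gp_predict_mean(xs, ys, 1.0)
print(pred)
```

Sparse approximations reduce this cost by replacing the full kernel matrix with one built on a smaller set of inducing points, while local aggregations run the same computation on small subsets of the data and combine the results.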

approximation, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

1811.01159

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
