AITopics

1911.0422

Country:

North America > United States > Illinois (0.04)
North America > Canada (0.04)
Asia > Malaysia (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Zanette, Andrea, Brunskill, Emma

Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs

arXiv.org Artificial IntelligenceNov-3-2019

In order to make good decision under uncertainty an agent must learn from observations. To do so, two of the most common frameworks are Contextual Bandits and Markov Decision Processes (MDPs). In this paper, we study whether there exist algorithms for the more general framework (MDP) which automatically provide the best performance bounds for the specific problem at hand without user intervention and without modifying the algorithm. In particular, it is found that a very minor variant of a recently proposed reinforcement learning algorithm for MDPs already matches the best possible regret bound $\tilde O (\sqrt{SAT})$ in the dominant term if deployed on a tabular Contextual Bandit problem despite the agent being agnostic to such setting.

algorithm, failure event, identifying bandit structure, (14 more...)

1911.00954

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Machine LearningNov-2-2019

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo

Wang, Bao, Zou, Difan, Gu, Quanquan, Osher, Stanley

As an important Markov Chain Monte Carlo (MCMC) method, stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, SGLD typically suffers from slow convergence rate due to its large variance caused by the stochastic gradient. In order to alleviate these drawbacks, we leverage the recently developed Laplacian Smoothing (LS) technique and propose a Laplacian smoothing stochastic gradient Langevin dynamics (LS-SGLD) algorithm. We prove that for sampling from both log-concave and non-log-concave densities, LS-SGLD achieves strictly smaller discretization error in $2$-Wasserstein distance, although its mixing rate can be slightly slower. Experiments on both synthetic and real datasets verify our theoretical results, and demonstrate the superior performance of LS-SGLD on different machine learning tasks including posterior sampling, Bayesian logistic regression and training Bayesian convolutional neural networks. The code is available at \url{https://github.com/BaoWangMath/LS-MCMC}.

algorithm, inequality, laplacian, (17 more...)

1911.00782

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Das, Neha, Karl, Maximilian, Becker-Ehmck, Philip, van der Smagt, Patrick

Beta DVBF: Learning State-Space Models for Control from High Dimensional Observations

arXiv.org Machine LearningNov-2-2019

Philip Becker-Ehmck philip.becker-ehmck@argmax.ai Patrick van der Smagt Disclosure: Parts of this work have been submitted in form of a Master's Thesis towards partial fulfillment of the requirements for a Masters program at the Technical University of Munich [4] Abstract Learning a model of dynamics from high-dimensional images can be a core ingredient for success in many applications across different domains, especially in sequential decision making. However, currently prevailing methods based on latent-variable models are limited to working with low resolution images only. In this work, we show that some of the issues with using high-dimensional observations arise from the discrepancy between the dimensionality of the latent and observable space, and propose solutions to overcome them. 1 Introduction Learning a probabilistic model for sequential data is a key step towards solving a lot of interesting problems, including analysis and deconstruction of auditory sequences [18], predicting the next piece of information given previously recorded data such as video frames [9] and text [20], and controlling an agent to perform specific tasks (model-based reinforcement learning [5]). While in this paper, we consider probabilistic models especially tuned for the needs of the last, i.e with control inputs, the approaches we discuss can be applied to control-less environments as well. A key feature required in control-based models is that they should be able to generate a feasible trajectory distribution given a control policy. To this end, one of the more successful solutions proposed in the past for modeling dynamical systems is Deep V ariational Bayes Filter ( DVBF) [10] - a framework for learning a State-Space Model given sequential observations from the environment in an unsupervised manner.

dense layer, generative model, sequence, (13 more...)

1911.00756

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.24)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

Yurochkin, Mikhail, Agarwal, Mayank, Ghosh, Soumya, Greenewald, Kristjan, Hoang, Trong Nghia

Statistical Model Aggregation via Parameter Matching

arXiv.org Machine LearningNov-1-2019

We consider the problem of aggregating models learned from sequestered, possibly heterogeneous datasets. Exploiting tools from Bayesian nonparametrics, we develop a general meta-modeling framework that learns shared global latent structures by identifying correspondences among local model parameterizations. Our proposed framework is model-independent and is applicable to a wide range of model types. After verifying our approach on simulated data, we demonstrate its utility in aggregating Gaussian topic models, hierarchical Dirichlet process based hidden Markov models, and sparse Gaussian processes with applications spanning text summarization, motion capture analysis, and temperature forecasting.

global parameter, local parameter, sp ahm, (14 more...)

1911.00218

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)

Genre: Research Report (0.82)

Industry:

Government (0.93)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

arXiv.org Artificial IntelligenceNov-1-2019

Generalized Mean Estimation in Monte-Carlo Tree Search

Dam, Tuan, Klink, Pascal, D'Eramo, Carlo, Peters, Jan, Pajarinen, Joni

We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is incrementally built by the expansion of nodes, and the values of nodes are updated through a backup strategy based on the average value of child nodes. However, it has been shown that with enough samples the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further proposing a novel backup strategy which uses the power mean operator, which computes a value between the average and maximum value. We call our new approach Power-UCT and argue how the use of the power mean operator helps to speed up the learning in MCTS. We theoretically analyze our method providing guarantees of convergence to the optimum. Moreover, we discuss a heuristic approach to balance the greediness of backups by tuning the power mean operator according to the number of visits to each node. Finally, we empirically demonstrate the effectiveness of our method in well-known MDP and POMDP benchmarks, showing significant improvement in performance and convergence speed w.r.t. UCT.

algorithm, node, power-uct, (15 more...)

1911.00384

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)

Sahu, Saurabh, Gupta, Rahul, Espy-Wilson, Carol

Modeling Feature Representations for Affective Speech using Generative Adversarial Networks

arXiv.org Machine LearningOct-31-2019

Emotion recognition is a classic field of research with a typical setup extracting features and feeding them through a classifier for prediction. On the other hand, generative models jointly capture the distributional relationship between emotions and the feature profiles. Relatively recently, Generative Adversarial Networks (GANs) have surfaced as a new class of generative models and have shown considerable success in modeling distributions in the fields of computer vision and natural language understanding. In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior. Each mixture component corresponds to an emotional class and can be sampled to generate features from the corresponding emotion. (ii) A one-hot vector corresponding to an emotion can be explicitly used to generate the features. We perform analysis on such models and also propose different metrics used to measure the performance of the GAN models in their ability to generate realistic synthetic samples. Apart from evaluation on a given dataset of interest, we perform a cross-corpus study where we study the utility of the synthetic samples as additional training data in low resource conditions.

feature vector, synthetic data, vector, (16 more...)

1911.0003

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Asia > India > West Bengal > Kharagpur (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre:

Personal (0.93)
Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Fang, Tiantian, Schwing, Alexander G.

Co-Generation with GANs using AIS based HMC

arXiv.org Machine LearningOct-31-2019

Inferring the most likely configuration for a subset of variables of a joint distribution given the remaining ones - which we refer to as co-generation - is an important challenge that is computationally demanding for all but the simplest settings. This task has received a considerable amount of attention, particularly for classical ways of modeling distributions like structured prediction. In contrast, almost nothing is known about this task when considering recently proposed techniques for modeling high-dimensional distributions, particularly generative adversarial nets (GANs). Therefore, in this paper, we study the occurring challenges for co-generation with GANs. To address those challenges we develop an annealed importance sampling based Hamiltonian Monte Carlo co-generation algorithm. The presented approach significantly outperforms classical gradient based methods on a synthetic and on the CelebA and LSUN datasets.

generator, initialization, proc, (15 more...)

1910.14673

Country:

North America > United States > Illinois (0.05)
North America > Canada (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Bojchevski, Aleksandar, Günnemann, Stephan

Certifiable Robustness to Graph Perturbations

arXiv.org Machine LearningOct-31-2019

Despite the exploding interest in graph neural networks there has been little effort to verify and improve their robustness. This is even more alarming given recent findings showing that they are extremely vulnerable to adversarial attacks on both the graph structure and the node attributes. We propose the first method for verifying certifiable (non-)robustness to graph perturbations for a general class of models that includes graph neural networks and label/feature propagation. By exploiting connections to PageRank and Markov decision processes our certificates can be efficiently (and under many threat models exactly) computed. Furthermore, we investigate robust training procedures that increase the number of certifiably robust nodes while maintaining or improving the clean predictive accuracy.

budget, constraint, node, (17 more...)

1910.14356

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Government (0.67)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Decelle, Aurélien, Furtlehner, Cyril

Gaussian-Spherical Restricted Boltzmann Machines

arXiv.org Artificial IntelligenceOct-31-2019

We consider a special type of Restricted Boltzmann machine (RBM), namely a Gaussian-spherical RBM where the visible units have Gaussian priors while the vector of hidden variables is constrained to stay on an ${\mathbbm L}_2$ sphere. The spherical constraint having the advantage to admit exact asymptotic treatments, various scaling regimes are explicitly identified based solely on the spectral properties of the coupling matrix (also called weight matrix of the RBM). Incidentally these happen to be formally related to similar scaling behaviours obtained in a different context dealing with spatial condensation of zero range processes. More specifically, when the spectrum of the coupling matrix is doubly degenerated an exact treatment can be proposed to deal with finite size effects. Interestingly the known parallel between the ferromagnetic transition of the spherical model and the Bose-Einstein condensation can be made explicit in that case. More importantly this gives us the ability to extract all needed response functions with arbitrary precision for the training algorithm of the RBM. This allows us then to numerically integrate the dynamics of the spectrum of the weight matrix during learning in a precise way. This dynamics reveals in particular a sequential emergence of modes from the Marchenko-Pastur bulk of singular vectors of the coupling matrix.

artificial intelligence, boltzmann machine, machine learning, (18 more...)

doi: 10.1088/1751-8121/ab79f3

1910.14544

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.64)