AITopics

1904.03647

Country:

North America > United States (0.67)
Europe (0.46)

Genre: Research Report > New Finding (0.66)

Industry: Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Silver, Tom, Allen, Kelsey R., Lew, Alex K., Kaelbling, Leslie Pack, Tenenbaum, Josh

Few-Shot Bayesian Imitation Learning with Logic over Programs

arXiv.org Artificial IntelligenceApr-12-2019

We describe an expressive class of policies that can be efficiently learned from a few demonstrations. Policies are represented as logical combinations of programs drawn from a small domain-specific language (DSL). We define a prior over policies with a probabilistic grammar and derive an approximate Bayesian inference algorithm to learn policies from demonstrations. In experiments, we study five strategy games played on a 2D grid with one shared DSL. After a few demonstrations of each game, the inferred policies generalize to new game instances that differ substantially from the demonstrations. We argue that the proposed method is an apt choice for policy learning tasks that have scarce training data and feature significant, structured variation between task instances.

demonstration, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1904.06317

Country: North America > United States > Massachusetts (0.46)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
(2 more...)

Wang, Yaqing, Yao, Quanming

Few-shot Learning: A Survey

arXiv.org Artificial IntelligenceApr-10-2019

The quest of `can machines think' and `can machines do what human do' are quests that drive the development of artificial intelligence. Although recent artificial intelligence succeeds in many data intensive applications, it still lacks the ability of learning from limited exemplars and fast generalizing to new tasks. To tackle this problem, one has to turn to machine learning, which supports the scientific study of artificial intelligence. Particularly, a machine learning problem called Few-Shot Learning (FSL) targets at this case. It can rapidly generalize to new tasks of limited supervised experience by turning to prior knowledge, which mimics human's ability to acquire knowledge from few examples through generalization and analogy. It has been seen as a test-bed for real artificial intelligence, a way to reduce laborious data gathering and computationally costly training, and antidote for rare cases learning. With extensive works on FSL emerging, we give a comprehensive survey for it. We first give the formal definition for FSL. Then we point out the core issues of FSL, which turns the problem from "how to solve FSL" to "how to deal with the core issues". Accordingly, existing works from the birth of FSL to the most recent published ones are categorized in a unified taxonomy, with thorough discussion of the pros and cons for different categories. Finally, we envision possible future directions for FSL in terms of problem setup, techniques, applications and theory, hoping to provide insights to both beginners and experienced researchers.

information retrieval, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

1904.05046

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > India (0.04)

Genre: Overview (1.00)

Industry:

Education (1.00)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
(3 more...)

Braunstein, Alfredo, Muntoni, Anna Paola, Pagnani, Andrea, Pieropan, Mirko

Compressed sensing reconstruction using Expectation Propagation

arXiv.org Machine LearningApr-10-2019

Many interesting problems in fields ranging from telecommunications to computational biology can be formalized in terms of large underdetermined systems of linear equations with additional constraints or regularizers. One of the most studied ones, the Compressed Sensing problem (CS), consists in finding the solution with the smallest number of non-zero components of a given system of linear equations $\boldsymbol y = \mathbf{F} \boldsymbol w$ for known measurement vector $\boldsymbol y$ and sensing matrix $\mathbf{F}$. Here, we will address the compressed sensing problem within a Bayesian inference framework where the sparsity constraint is remapped into a singular prior distribution (called Spike-and-Slab or Bernoulli-Gauss). Solution to the problem is attempted through the computation of marginal distributions via Expectation Propagation (EP), an iterative computational scheme originally developed in Statistical Physics. We will show that this strategy is comparatively more accurate than the alternatives in solving instances of CS generated from statistically correlated measurement matrices. For computational strategies based on the Bayesian framework such as variants of Belief Propagation, this is to be expected, as they implicitly rely on the hypothesis of statistical independence among the entries of the sensing matrix. Perhaps surprisingly, the method outperforms uniformly also all the other state-of-the-art methods in our tests.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1904.05777

Country:

Europe > Italy (0.14)
Europe > France (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

arXiv.org Machine LearningApr-10-2019

On the Anatomy of MCMC-based Maximum Likelihood Learning of Energy-Based Models

Nijkamp, Erik, Hill, Mitch, Han, Tian, Zhu, Song-Chun, Wu, Ying Nian

This study investigates the effects of Markov Chain Monte Carlo (MCMC) sampling in unsupervised Maximum Likelihood (ML) learning. Our attention is restricted to the family of unnormalized probability densities for which the negative log density (or energy function) is a ConvNet. In general, we find that many of the techniques used to stabilize training in previous studies can have the opposite effect. Stable ML learning with a ConvNet potential can be achieved with only a few hyper-parameters and no regularization. Using this minimal framework, we identify a variety of ML learning outcomes that depend on the implementation of MCMC sampling. On one hand, we show that it is easy to train an energy-based model which can sample realistic images with short-run Langevin. ML can be effective and stable even when MCMC samples have much higher energy than true steady-state samples throughout training. Based on this insight, we introduce an ML method with purely noise-initialized MCMC, high-quality short-run synthesis, and the same budget as ML with informative MCMC initialization such as CD or PCD. Unlike previous models, our model can obtain realistic high-diversity samples from a noise signal after training with no auxiliary networks. On the other hand, ConvNet potentials learned with highly non-convergent MCMC do not have a valid steady-state and cannot be considered approximate unnormalized densities of the training data because long-run MCMC samples differ greatly from observed images. We show that it is much harder to train a ConvNet potential to learn a steady-state over realistic images. To our knowledge, long-run MCMC samples of all previous models lose the realism of short-run samples. With correct tuning of Langevin noise, we train the first ConvNet potentials for which long-run and steady-state MCMC samples are realistic images.

artificial intelligence, initialization, machine learning, (18 more...)

1903.1237

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.84)

Benhamou, Eric, Saltiel, David, Guez, Beatrice, Paris, Nicolas

BCMA-ES II: revisiting Bayesian CMA-ES

arXiv.org Machine LearningApr-9-2019

This paper revisits the Bayesian CMA-ES and provides updates for normal Wishart. It emphasizes the difference between a normal and normal inverse Wishart prior. After some computation, we prove that the only difference relies surprisingly in the expected covariance. We prove that the expected covariance should be lower in the normal Wishart prior model because of the convexity of the inverse. We present a mixture model that generalizes both normal Wishart and normal inverse Wishart model. We finally present various numerical experiments to compare both methods as well as the generalized method.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

1904.01466

Country:

Europe (1.00)
North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Classification of pulsars with Dirichlet process Gaussian mixture model

Ay, F., İnce, G., Kamaşak, M. E., Ekşi, K. Y.

Young isolated neutron stars (INS) most commonly manifest themselves as rotationally powered pulsars (RPPs) which involve conventional radio pulsars as well as gamma-ray pulsars (GRPs) and rotating radio transients (RRATs). Some other young INS families manifest themselves as anomalous X-ray pulsars (AXPs) and soft gamma-ray repeaters (SGRs) which are commonly accepted as magnetars, i.e.\ magnetically powered neutron stars with decaying super-strong fields. Yet some other young INS are identified as central compact objects (CCOs) and X-ray dim isolated neutron stars (XDINs) which are cooling objects powered by their thermal energy. Older pulsars, as a result of a previous long episode of accretion from a companion, manifest themselves as millisecond pulsars and more commonly appear in binary systems. We use Dirichlet process Gaussian mixture model (DPGMM), an unsupervised machine learning algorithm, for analyzing the distribution of these pulsar families in period $P$ and period derivative $\dot{P}$ parameter space. We compare the average values of the characteristic age, magnetic dipole field strength, surface temperature and proper motion of all discovered components. We verify that DPGMM is robust and provides hints for inferring relations between different classes of pulsars. We discuss the implications of our findings for the magnetothermal spin evolution models and fallback discs.

artificial intelligence, machine learning, pulsar, (20 more...)

1904.04204

Country:

North America > United States (0.86)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(6 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Chérief-Abdellatif, Badr-Eddine, Alquier, Pierre, Khan, Mohammad Emtiyaz

A Generalization Bound for Online Variational Inference

Bayesian inference provides an attractive online-learning framework to analyze sequential data, and offers generalization guarantees which hold even under model mismatch and with adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this paper, we show that this is indeed the case for some variational inference (VI) algorithms. We propose new online, tempered VI algorithms and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that our result should hold more generally and present empirical evidence in support of this. Our work in this paper presents theoretical justifications in favor of online algorithms that rely on approximate Bayesian methods.

algorithm, approximation, inference, (14 more...)

1904.0392

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.48)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wein, Alexander S., Alaoui, Ahmed El, Moore, Cristopher

The Kikuchi Hierarchy and Tensor PCA

For the tensor PCA (principal component analysis) problem, we propose a new hierarchy of algorithms that are increasingly powerful yet require increasing runtime. Our hierarchy is analogous to the sum-of-squares (SOS) hierarchy but is instead inspired by statistical physics and related algorithms such as belief propagation and AMP (approximate message passing). Our level-$\ell$ algorithm can be thought of as a (linearized) message-passing algorithm that keeps track of $\ell$-wise dependencies among the hidden variables. Specifically, our algorithms are spectral methods based on the Kikuchi Hessian matrix, which generalizes the well-studied Bethe Hessian matrix to the higher-order Kikuchi free energies. It is known that AMP, the flagship algorithm of statistical physics, has substantially worse performance than SOS for tensor PCA. In this work we `redeem' the statistical physics approach by showing that our hierarchy gives a polynomial-time algorithm matching the performance of SOS. Our hierarchy also yields a continuum of subexponential-time algorithms, and we prove that these achieve the same (conjecturally optimal) tradeoff between runtime and statistical power as SOS. Our results hold for even-order tensors, and we conjecture that they also hold for odd-order tensors. Our methods suggest a new avenue for systematically obtaining optimal algorithms for Bayesian inference problems, and our results constitute a step toward unifying the statistical physics and sum-of-squares approaches to algorithm design.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1904.03858

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Learning Attribute Patterns in High-Dimensional Structured Latent Attribute Models

Gu, Yuqi, Xu, Gongjun

Structured latent attribute models (SLAMs) are a special family of discrete latent variable models widely used in social and biological sciences. This paper considers the problem of learning significant attribute patterns from a SLAM with potentially high-dimensional configurations of the latent attributes. We address the theoretical identifiability issue, propose a penalized likelihood method for the selection of the attribute patterns, and further establish the selection consistency in such an overfitted SLAM with diverging number of latent patterns. The good performance of the proposed methodology is illustrated by simulation studies and two real datasets in educational assessment.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1904.04378

Country: North America > United States (0.67)

Genre:

Research Report (0.49)
Workflow (0.46)

Industry:

Health & Medicine (0.92)
Education > Educational Setting > K-12 Education > Middle School (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)