AITopics

2006.0246

Country:

Asia > India > Karnataka > Bengaluru (0.04)
Asia > Middle East > Iraq (0.04)
North America > United States > New York (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Wehenkel, Antoine, Louppe, Gilles

You say Normalizing Flows I see Bayesian Networks

arXiv.org Machine Learning, Jun-3-2020

Normalizing flows have emerged as an important family of deep neural networks for modelling complex probability distributions. In this note, we revisit their coupling and autoregressive transformation layers as probabilistic graphical models and show that they reduce to Bayesian networks with a pre-defined topology and a learnable density at each node. From this new perspective, we provide three results. First, we show that stacking multiple transformations in a normalizing flow relaxes independence assumptions and entangles the model distribution. Second, we show that a fundamental leap of capacity emerges when the depth of affine flows exceeds 3 transformation layers. Third, we prove the non-universality of the affine normalizing flow, regardless of its depth.
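As a rough illustration of the first claim, the sketch below checks numerically that a single affine coupling layer on two variables with a standard normal base gives the same density as a two-node Bayesian network p(x1) p(x2 | x1). The `conditioner` function and its tanh/linear form are hypothetical stand-ins for a learned network, not the authors' model.

```python
import math

LOG_2PI = math.log(2.0 * math.pi)


def conditioner(x1):
    """Hypothetical conditioner: returns (log-scale s, shift t) for x2 given x1."""
    return math.tanh(x1), 0.5 * x1


def coupling_forward(x1, x2):
    """Affine coupling: x1 passes through unchanged, x2 is scaled and shifted."""
    s, t = conditioner(x1)
    z1 = x1
    z2 = x2 * math.exp(s) + t
    log_det = s  # log |dz2/dx2|; the dz1/dx1 term is log 1 = 0
    return z1, z2, log_det


def coupling_inverse(z1, z2):
    """Exact inverse of coupling_forward."""
    s, t = conditioner(z1)
    return z1, (z2 - t) * math.exp(-s)


def log_density_flow(x1, x2):
    """Change of variables with a standard normal base distribution."""
    z1, z2, log_det = coupling_forward(x1, x2)
    log_base = -0.5 * (z1 ** 2 + z2 ** 2) - LOG_2PI
    return log_base + log_det


def log_density_bayes_net(x1, x2):
    """Same model written as p(x1) * p(x2 | x1) with Gaussian nodes."""
    s, t = conditioner(x1)
    mu, sigma = -t * math.exp(-s), math.exp(-s)
    log_p_x1 = -0.5 * x1 ** 2 - 0.5 * LOG_2PI
    log_p_x2 = (-0.5 * ((x2 - mu) / sigma) ** 2
                - math.log(sigma) - 0.5 * LOG_2PI)
    return log_p_x1 + log_p_x2
```

Here the node x2 has a single parent x1, which is the pre-defined topology the coupling split imposes; stacking further layers with different splits is what would relax that structure.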

artificial intelligence, conditioner, machine learning, (17 more...)

2006.00866

Country: Europe > Belgium (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)

arXiv.org Machine Learning, Jun-3-2020

Adaptive quadrature schemes for Bayesian inference via active learning

Llorente, F., Martino, L., Elvira, V., Delgado, D., López-Santiago, J.

Numerical integration and emulation are fundamental topics across scientific fields. We propose novel adaptive quadrature schemes based on an active learning procedure. We consider an interpolative approach for building a surrogate posterior density, combining it with Monte Carlo sampling methods and other quadrature rules. The nodes of the quadrature are sequentially chosen by maximizing a suitable acquisition function, which takes into account the current approximation of the posterior and the positions of the nodes. This maximization does not require additional evaluations of the true posterior. We introduce two specific schemes based on Gaussian and Nearest Neighbors (NN) bases. For the Gaussian case, we also provide a novel procedure for fitting the bandwidth parameter, in order to build a suitable emulator of a density function. With both techniques, we always obtain a positive estimation of the marginal likelihood (a.k.a., Bayesian evidence). An equivalent importance sampling interpretation is also described, which allows the design of extended schemes. Several theoretical results are provided and discussed. Numerical results show the advantage of the proposed approach, including a challenging inference problem in an astronomic dynamical model, with the goal of revealing the number of planets orbiting a star.

artificial intelligence, machine learning, node, (19 more...)

2006.00535

Country:

Europe > Spain > Galicia > Madrid (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.64)

Serdega, Andriy, Kim, Dae-Shik

Variational Mutual Information Maximization Framework for VAE Latent Codes with Continuous and Discrete Priors

Learning interpretable and disentangled representations of data is a key topic in machine learning research. The Variational Autoencoder (VAE) is a scalable method for learning directed latent variable models of complex data. It employs a clear and interpretable objective that can be easily optimized. However, this objective does not provide an explicit measure of the quality of the latent variable representations, which may therefore be poor. We propose the Variational Mutual Information Maximization Framework for VAE to address this issue. In comparison to other methods, it provides an explicit objective that maximizes a lower bound on the mutual information between latent codes and observations. The objective acts as a regularizer that forces the VAE not to ignore the latent variable and allows one to select particular components of it to be most informative with respect to the observations. In addition, the proposed framework provides a way to evaluate the mutual information between latent codes and observations for a fixed VAE model. We have conducted our experiments on VAE models with Gaussian latent variables and with joint Gaussian and discrete latent variables. Our results illustrate that the proposed approach strengthens relationships between latent codes and observations and improves learned representations.

artificial intelligence, machine learning, representation, (17 more...)

doi: 10.13140/RG.2.2.30511.15523

2006.02227

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Rixner, Maximilian, Koutsourelakis, Phaedon-Stelios

A probabilistic generative model for semi-supervised training of coarse-grained surrogates and enforcing physical constraints through virtual observables

The data-centric construction of inexpensive surrogates for fine-grained, physical models has been at the forefront of computational physics due to its significant utility in many-query tasks such as uncertainty quantification. Recent efforts have taken advantage of the enabling technologies from the field of machine learning (e.g. deep neural networks) in combination with simulation data. While such strategies have shown promise even in higher-dimensional problems, they generally require large amounts of training data even though the construction of surrogates is by definition a Small Data problem. Rather than employing data-based loss functions, it has been proposed to make use of the governing equations (in the simplest case at collocation points) in order to imbue domain knowledge in the training of the otherwise black-box-like interpolators. The present paper provides a flexible, probabilistic framework that accounts for physical structure and information both in the training objectives as well as in the surrogate model itself. We advocate a probabilistic (Bayesian) model in which equalities that are available from the physics (e.g. residuals, conservation laws) can be introduced as virtual observables and can provide additional information through the likelihood. We further advocate a generative model i.e. one that attempts to learn the joint density of inputs and outputs that is capable of making use of unlabeled data (i.e. only inputs) in a semi-supervised fashion in order to promote the discovery of lower-dimensional embeddings which are nevertheless predictive of the fine-grained model's output.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2006.01789

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Toward Optimal Probabilistic Active Learning Using a Bayesian Approach

Kottke, Daniel, Herde, Marek, Sandrock, Christoph, Huseljic, Denis, Krempl, Georg, Sick, Bernhard

Gathering labeled data to train well-performing machine learning models is one of the critical challenges in many applications. Active learning aims at reducing the labeling costs by an efficient and effective allocation of costly labeling resources. In this article, we propose a decision-theoretic selection strategy that (1) directly optimizes the gain in misclassification error, and (2) uses a Bayesian approach by introducing a conjugate prior distribution to determine the class posterior to deal with uncertainties. By reformulating existing selection strategies within our proposed model, we can explain which aspects are not covered in current state-of-the-art and why this leads to the superior performance of our approach. Extensive experiments on a large variety of datasets and different kernels validate our claims.
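The conjugate-prior idea in (2) can be sketched with a Dirichlet prior over class probabilities, which is conjugate to categorical label counts. This is a generic illustration under that assumption, not the paper's selection strategy; the function names and the `prior_strength` parameter are hypothetical.

```python
def class_posterior_mean(label_counts, prior_strength=1.0):
    """Posterior mean of class probabilities under a symmetric Dirichlet prior.

    label_counts[k] is the number of observed labels of class k near a
    candidate; prior_strength is the (assumed) Dirichlet concentration
    added to every class.
    """
    alphas = [prior_strength + count for count in label_counts]
    total = sum(alphas)
    return [alpha / total for alpha in alphas]


def expected_error(label_counts, prior_strength=1.0):
    """Expected misclassification rate when predicting the most probable class."""
    return 1.0 - max(class_posterior_mean(label_counts, prior_strength))
```

With few observed labels the prior keeps the posterior away from 0 and 1, so the estimated error stays uncertain until more labels arrive.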

artificial intelligence, machine learning, xpal, (12 more...)

2006.01732

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)

Maeda, Shin-ichi, Nakanishi, Toshiki, Koyama, Masanori

Meta Learning as Bayes Risk Minimization

Meta-Learning is a family of methods that use a set of interrelated tasks to learn a model that can quickly learn a new query task from a possibly small contextual dataset. In this study, we use a probabilistic framework to formalize what it means for two tasks to be related and reframe the meta-learning problem into the problem of Bayesian risk minimization (BRM). In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of the task-specific latent variable conditioned on the contextual dataset, and this justifies the philosophy of Neural Process.

We show that, when we cast meta-learning problem as BRM, the optimal solution is given by the predictive distribution computed from the posterior distribution of the latent variable conditioned against the contextual dataset. This result justifies the use of the predictive distribution in many previous studies of meta learning, such as (Edwards & Storkey, 2017; Gordon et al., 2018; Garnelo et al., 2018). However, the optimality of the predictive distribution cannot be guaranteed if one uses an approximation of the posterior distribution that violates the way the posterior distribution changes with the contextual dataset, and this is unfortunately the case for most of the aforementioned works. For example, the variance of the posterior in these works do not converge to 0 as we take ...

artificial intelligence, machine learning, posterior distribution, (15 more...)

2006.01488

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.70)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Reiman, David M., Tamanas, John, Prochaska, J. Xavier, Ďurovčíková, Dominika

Fully probabilistic quasar continua predictions near Lyman-α with conditional neural spline flows

Measurement of the red damping wing of neutral hydrogen in quasar spectra provides a probe of the epoch of reionization in the early Universe. Such quantification requires precise and unbiased estimates of the intrinsic continua near Lyman-$\alpha$ (Ly$\alpha$), a challenging task given the highly variable Ly$\alpha$ emission profiles of quasars. Here, we introduce a fully probabilistic approach to intrinsic continua prediction. We frame the problem as a conditional density estimation task and explicitly model the distribution over plausible blue-side continua ($1190\ \text{Å} \leq \lambda_{\text{rest}} < 1290\ \text{Å}$) conditional on the red-side spectrum ($1290\ \text{Å} \leq \lambda_{\text{rest}} < 2900\ \text{Å}$) using normalizing flows. Our approach achieves state-of-the-art precision and accuracy, allows for sampling one thousand plausible continua in less than a tenth of a second, and can natively provide confidence intervals on the blue-side continua via Monte Carlo sampling. We measure the damping wing effect in two $z>7$ quasars and estimate the volume-averaged neutral fraction of hydrogen from each, finding $\bar{x}_\text{HI}=0.304 \pm 0.042$ for ULAS J1120+0641 ($z=7.09$) and $\bar{x}_\text{HI}=0.384 \pm 0.133$ for ULAS J1342+0928 ($z=7.54$).

artificial intelligence, machine learning, prediction, (20 more...)

2006.00615

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Utah (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Energy (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Cai, Desmond, Nguyen, Duc Thien, Lim, Shiau Hong, Wynter, Laura

Variational Bayesian Inference for Crowdsourcing Predictions

arXiv.org Artificial Intelligence, Jun-1-2020

Crowdsourcing has emerged as an effective means for performing a number of machine learning tasks such as annotation and labelling of images and other data sets. In most early settings of crowdsourcing, the task involved classification, that is, assigning one of a discrete set of labels to each task. Recently, however, more complex tasks have been attempted, including asking crowdsource workers to assign continuous labels, or predictions. In essence, this involves the use of crowdsourcing for function estimation. We are motivated by this problem to drive applications such as collaborative prediction, that is, harnessing the wisdom of the crowd to predict quantities more accurately. To do so, we propose a Bayesian approach aimed specifically at alleviating overfitting, a typical impediment to accurate prediction models in practice. In particular, we develop a variational Bayesian technique for two different worker noise models: one that assumes workers' noises are independent and the other that assumes workers' noises have a latent low-rank structure. Our evaluations on synthetic and real-world datasets demonstrate that these Bayesian approaches perform significantly better than existing non-Bayesian approaches and are thus potentially useful for this class of crowdsourcing problems.

artificial intelligence, machine learning, prediction, (19 more...)

2006.00778

Country:

Europe > Spain > Canary Islands (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Jun-1-2020

Sampling Techniques in Bayesian Target Encoding

Larionov, Michael

Target encoding is an effective technique for encoding categorical variables and is often used in machine learning systems that process tabular data sets with mixed numeric and categorical variables. Recently, an enhanced version of this technique was proposed that uses conjugate Bayesian modeling. This paper presents a further development of the Bayesian encoding method that uses sampling techniques, which help extract information from the intra-category distribution of the target variable, improve generalization, and reduce target leakage.
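A minimal sketch of the general idea, assuming a binary target and a Beta prior (the conjugate prior for a Bernoulli likelihood): each category gets a Beta posterior over its target rate, and the encoding is either the posterior mean or a random draw from the posterior. The function names and default prior values are hypothetical, and the paper's actual procedure may differ.

```python
import random


def beta_posterior(targets, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for one category's binary targets."""
    successes = sum(targets)
    failures = len(targets) - successes
    return prior_a + successes, prior_b + failures


def encode_mean(targets, prior_a=1.0, prior_b=1.0):
    """Deterministic encoding: posterior mean of the category's target rate."""
    a, b = beta_posterior(targets, prior_a, prior_b)
    return a / (a + b)


def encode_sample(targets, rng, prior_a=1.0, prior_b=1.0):
    """Stochastic encoding: one draw from the category's Beta posterior."""
    a, b = beta_posterior(targets, prior_a, prior_b)
    return rng.betavariate(a, b)


# Targets observed for rows of one category (illustrative values only).
rng = random.Random(0)
category_targets = [1, 0, 1, 1]
mean_code = encode_mean(category_targets)
sampled_codes = [encode_sample(category_targets, rng) for _ in range(5)]
```

Drawing a fresh sample per training row exposes the model to the spread of the posterior, not just its mean, so rare categories with wide posteriors yield noisier codes than frequent ones.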

categorical variable, category, posterior distribution, (14 more...)

2006.01317

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)