AITopics | discrete latent variable model

discrete latent variable model

Reviews: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

Neural Information Processing Systems, Oct-8-2024, 12:29:59 GMT

Summary: This paper proposes a control variate (CV) for the REINFORCE gradient estimator (RGE) of a discrete distribution. The CV is based on the Concrete distribution (CD), a continuous relaxation of the discrete distribution that, on its own, admits only biased Monte Carlo (MC) estimates of the discrete distribution's gradient. Yet using the CD as a CV yields an *unbiased* estimator of the path gradient for a discrete random variable (rv), with lower variance than the RGE (as expected). REBAR is derived by applying the REINFORCE estimator to the CD and observing that, given a discrete draw, the CD's continuous parameter (z, here) can be marginalized out. REBAR also has connections to other gradient estimators for discrete rvs, including MuProp.
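The role a control variate plays in a REINFORCE estimator can be illustrated with a minimal sketch. It is not REBAR itself: a constant baseline stands in for the Concrete-relaxation control variate, and the objective `f`, the target `t`, and the value of `theta` are assumptions chosen only for illustration. For a single Bernoulli variable the exact gradient is known in closed form, so both estimators can be checked against it.

```python
import random
import statistics

def f(b, t=0.45):
    # Toy objective evaluated at a discrete sample b in {0, 1} (hypothetical).
    return (b - t) ** 2

def score(b, theta):
    # d/dtheta log Bernoulli(b; theta)
    return 1.0 / theta if b == 1 else -1.0 / (1.0 - theta)

def reinforce_samples(theta, baseline, n, rng):
    # Single-sample REINFORCE estimates of d/dtheta E[f(b)].
    # Subtracting a constant baseline keeps the estimator unbiased,
    # because the score function has zero mean.
    out = []
    for _ in range(n):
        b = 1 if rng.random() < theta else 0
        out.append((f(b) - baseline) * score(b, theta))
    return out

rng = random.Random(0)
theta = 0.3
# E[f(b)] = theta * f(1) + (1 - theta) * f(0), so the exact gradient is:
true_grad = f(1) - f(0)

plain = reinforce_samples(theta, 0.0, 20000, rng)
baseline = theta * f(1) + (1 - theta) * f(0)  # the mean of f as a simple baseline
with_cv = reinforce_samples(theta, baseline, 20000, rng)

print("exact gradient:", true_grad)
print("plain REINFORCE mean/var:", statistics.mean(plain), statistics.variance(plain))
print("with baseline   mean/var:", statistics.mean(with_cv), statistics.variance(with_cv))
```

Both estimators average to the same gradient, but the baselined one has a much smaller variance. REBAR replaces the constant baseline with a sample-dependent term built from the relaxation.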

discrete latent variable model, estimator, rebar, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.40)

Fair Inference for Discrete Latent Variable Models

Islam, Rashidul, Pan, Shimei, Foulds, James R.

arXiv.org Artificial Intelligence, Sep-15-2022

It is now well understood that machine learning models, trained on data without due care, often exhibit unfair and discriminatory behavior against certain populations. Traditional algorithmic fairness research has mainly focused on supervised learning tasks, particularly classification. While fairness in unsupervised learning has received some attention, the literature has primarily addressed fair representation learning of continuous embeddings. In this paper, we conversely focus on unsupervised learning using probabilistic graphical models with discrete latent variables. We develop a fair stochastic variational inference technique for the discrete latent variables, which is accomplished by including a fairness penalty on the variational distribution that aims to respect the principles of intersectionality, a critical lens on fairness from the legal, social science, and humanities literature, and then optimizing the variational parameters under this penalty. We first show the utility of our method in improving equity and fairness for clustering using naïve Bayes and Gaussian mixture models on benchmark datasets. To demonstrate the generality of our approach and its potential for real-world impact, we then develop a special-purpose graphical model for criminal justice risk assessments, and use our fairness approach to prevent the inferences from encoding unfair societal biases.

artificial intelligence, fairness, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2209.07044

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Government (1.00)
Law > Criminal Law (0.67)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Comparison of Discrete Latent Variable Models for Speech Representation Learning

Zhou, Henry, Baevski, Alexei, Auli, Michael

arXiv.org Artificial Intelligence, Oct-23-2020

Neural latent variable models enable the discovery of interesting structure in speech audio data. This paper presents a comparison of two different approaches which are broadly based on predicting future time-steps or auto-encoding the input signal. Our study compares the representations learned by vq-vae and vq-wav2vec in terms of sub-word unit discovery and phoneme recognition performance. Results show that future time-step prediction with vq-wav2vec achieves better performance. The best system achieves an error rate of 13.22 on the ZeroSpeech 2019 ABX phoneme discrimination challenge.

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2010.1423

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.85)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.69)

Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator

Paulus, Max B., Maddison, Chris J., Krause, Andreas

arXiv.org Machine Learning, Oct-9-2020

Gradient estimation in models with discrete latent variables is a challenging problem, because the simplest unbiased estimators tend to have high variance. To counteract this, modern estimators either introduce bias, rely on multiple function evaluations, or use learned, input-dependent baselines. Thus, there is a need for estimators that require minimal tuning, are computationally cheap, and have low mean squared error. In this paper, we show that the variance of the straight-through variant of the popular Gumbel-Softmax estimator can be reduced through Rao-Blackwellization without increasing the number of function evaluations. This provably reduces the mean squared error. We empirically demonstrate that this leads to variance reduction, faster convergence, and generally improved performance in two unsupervised latent variable models.
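For orientation, the straight-through Gumbel-Softmax sample that the abstract builds on can be sketched as follows. This is a plain-Python illustration under assumptions: the function name `st_gumbel_softmax` and the temperature `tau` are hypothetical, no autodiff is involved, and the Rao-Blackwellization step proposed in the paper is not implemented. In an autodiff framework, the forward pass would use the hard one-hot sample while gradients flow through the soft relaxed sample.

```python
import math
import random

def st_gumbel_softmax(logits, tau, rng):
    # Perturb each logit with Gumbel(0, 1) noise: g = -log(-log(U)).
    noisy = []
    for l in logits:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        noisy.append(l - math.log(-math.log(u)))

    # Soft sample: softmax of the perturbed logits at temperature tau.
    m = max(noisy)
    exps = [math.exp((v - m) / tau) for v in noisy]
    z = sum(exps)
    soft = [e / z for e in exps]

    # Hard sample: one-hot argmax of the perturbed logits (Gumbel-max trick).
    idx = max(range(len(noisy)), key=lambda k: noisy[k])
    hard = [1.0 if k == idx else 0.0 for k in range(len(noisy))]
    return hard, soft

rng = random.Random(0)
logits = [math.log(p) for p in (0.2, 0.3, 0.5)]
hard, soft = st_gumbel_softmax(logits, tau=0.5, rng=rng)
print("hard (forward pass):", hard)
print("soft (gradient path):", soft)
```

The hard sample is an exact draw from the categorical distribution defined by the logits. The bias of the straight-through estimator comes from differentiating through `soft` instead.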

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

2010.04838

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models

Tucker, George, Mnih, Andriy, Maddison, Chris J., Lawson, John, Sohl-Dickstein, Jascha

Neural Information Processing Systems, Feb-14-2020, 10:57:29 GMT

Learning in models with discrete latent variables is challenging due to high variance gradient estimators. Generally, approaches have relied on control variates to reduce the variance of the REINFORCE estimator. Recent work (Jang et al., 2016; Maddison et al., 2016) has taken a different approach, introducing a continuous relaxation of discrete variables to produce low-variance, but biased, gradient estimates. In this work, we combine the two approaches through a novel control variate that produces low-variance, *unbiased* gradient estimates. Then, we introduce a modification to the continuous relaxation and show that the tightness of the relaxation can be adapted online, removing it as a hyperparameter.

discrete latent variable model, gradient estimate, unbiased gradient estimate, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.40)

Theory and Experiments on Vector Quantized Autoencoders

Roy, Aurko, Vaswani, Ashish, Neelakantan, Arvind, Parmar, Niki

arXiv.org Machine Learning, May-28-2018

Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks. There has been a surge in interest in discrete latent variable models; however, despite several recent improvements, the training of discrete latent variable models has remained challenging and their performance has mostly failed to match their continuous counterparts. Recent work on vector quantized autoencoders (VQ-VAE) has made substantial progress in this direction, with its perplexity almost matching that of a VAE on datasets such as CIFAR-10. In this work, we investigate an alternate training technique for VQ-VAE, inspired by its connection to the Expectation Maximization (EM) algorithm. Training the discrete bottleneck with EM helps us achieve better image generation results on CIFAR-10, and together with knowledge distillation, allows us to develop a non-autoregressive machine translation model whose accuracy almost matches a strong greedy autoregressive baseline Transformer, while being 3.3 times faster at inference.
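The connection between a vector-quantized bottleneck and EM can be sketched roughly as below. This is an illustration under assumptions, not the authors' training procedure: the encoder outputs are replaced by fixed toy points, the E-step uses a softmax over negative squared distances, and the helper names `quantize` and `em_step` are hypothetical.

```python
import math

def sq_dist(x, e):
    # Squared Euclidean distance between a point and a codebook vector.
    return sum((a - b) ** 2 for a, b in zip(x, e))

def quantize(x, codebook):
    # Hard assignment: index of the nearest codebook vector.
    return min(range(len(codebook)), key=lambda k: sq_dist(x, codebook[k]))

def em_step(points, codebook):
    K, D = len(codebook), len(codebook[0])
    weight = [0.0] * K
    total = [[0.0] * D for _ in range(K)]
    for x in points:
        # E-step: soft responsibilities, softmax of negative squared distances.
        d = [sq_dist(x, e) for e in codebook]
        m = min(d)
        r = [math.exp(-(dk - m)) for dk in d]
        z = sum(r)
        for k in range(K):
            rk = r[k] / z
            weight[k] += rk
            for j in range(D):
                total[k][j] += rk * x[j]
    # M-step: each codebook vector becomes the responsibility-weighted mean.
    # A vector that received no weight is left unchanged.
    return [
        [total[k][j] / weight[k] for j in range(D)] if weight[k] > 0 else list(codebook[k])
        for k in range(K)
    ]

# Toy "encoder outputs" forming two well-separated clusters (assumed data).
points = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
codebook = [[1.0, 1.0], [4.0, 4.0]]
for _ in range(10):
    codebook = em_step(points, codebook)
codes = [quantize(x, codebook) for x in points]
print("codebook:", codebook)
print("codes:", codes)
```

With hard assignments in place of the soft responsibilities, the same update reduces to a k-means step.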

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Machine Learning

1805.11063

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
