Reparametrization trick


A Discrete Variational Recurrent Topic Model without the Reparametrization Trick

Neural Information Processing Systems

We show how to learn a neural topic model with discrete random variables---one that explicitly models each word's assigned topic---using neural variational inference that does not rely on stochastic backpropagation to handle the discrete variables. The model we utilize combines the expressive power of neural methods for representing sequences of text with the topic model's ability to capture global, thematic coherence. Using neural variational inference, we show improved perplexity and document understanding across multiple corpora. We examine the effect of prior parameters both on the model and variational parameters, and demonstrate how our approach can compete with and surpass a popular topic model implementation on an automatic measure of topic quality.
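The key point of the abstract---handling a discrete latent without stochastic backpropagation---can be illustrated generically: when the topic assignment is a K-way discrete variable, the expectation over it is a finite sum that can be computed and differentiated exactly, with no sampling and no reparametrization trick. A minimal numpy sketch (the three-topic example and variable names are invented for illustration, not taken from the paper):

```python
import numpy as np

def expected_log_likelihood(q, log_p):
    """E_q[log p(w | z)] for a K-way discrete topic assignment z.

    Because z is discrete, the expectation is a finite sum over the K
    topics, so it can be computed (and differentiated) exactly --
    no sampling and no reparametrization trick needed.
    """
    return float(np.dot(q, log_p))

q = np.array([0.7, 0.2, 0.1])               # variational topic probabilities
log_p = np.log(np.array([0.5, 0.1, 0.05]))  # per-topic likelihood of the word
ell = expected_log_likelihood(q, log_p)
```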


How reparametrization trick broke differentially-private text representation learning

Habernal, Ivan

arXiv.org Artificial Intelligence

As privacy gains traction in the NLP community, researchers have started adopting various approaches to privacy-preserving methods. One of the favorite privacy frameworks, differential privacy (DP), is perhaps the most compelling thanks to its fundamental theoretical guarantees. Despite the apparent simplicity of the general concept of differential privacy, it seems non-trivial to get it right when applying it to NLP. In this short paper, we formally analyze several recent NLP papers proposing text representation learning using DPText (Beigi et al., 2019a,b; Alnasser et al., 2021; Beigi et al., 2021) and reveal their false claims of being differentially private. Furthermore, we also show a simple yet general empirical sanity check to determine whether a given implementation of a DP mechanism almost certainly violates the privacy loss guarantees. Our main goal is to raise awareness and help the community understand potential pitfalls of applying differential privacy to text representation learning.
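The kind of empirical sanity check the authors describe can be sketched generically: run the mechanism many times on two neighboring inputs, histogram the outputs, and compare the largest bin-wise probability ratio against exp(eps). The mechanisms, thresholds, and bin counts below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def max_histogram_ratio(mechanism, d1, d2, n=200_000, bins=50, seed=0):
    """Empirical sanity check for a scalar DP mechanism.

    Run the mechanism n times on two neighboring inputs, histogram the
    outputs on a shared grid, and return the largest probability ratio
    over well-populated bins. For a correct eps-DP mechanism this ratio
    should stay close to exp(eps); values far above it mean the
    implementation almost certainly violates the claimed guarantee.
    """
    rng = np.random.default_rng(seed)
    s1 = mechanism(d1, n, rng)
    s2 = mechanism(d2, n, rng)
    lo = min(s1.min(), s2.min())
    hi = max(s1.max(), s2.max())
    h1, _ = np.histogram(s1, bins=bins, range=(lo, hi))
    h2, _ = np.histogram(s2, bins=bins, range=(lo, hi))
    mask = (h1 > 100) & (h2 > 100)  # ignore sparsely populated bins
    return float(np.max(h1[mask] / h2[mask]))

EPS = 1.0  # claimed privacy budget
# Correct Laplace mechanism for a count query with sensitivity 1 ...
correct = lambda x, n, rng: x + rng.laplace(scale=1.0 / EPS, size=n)
# ... and a buggy variant whose noise is 10x too small.
buggy = lambda x, n, rng: x + rng.laplace(scale=0.1 / EPS, size=n)

r_ok = max_histogram_ratio(correct, 10, 11)   # stays near exp(EPS)
r_bad = max_histogram_ratio(buggy, 10, 11)    # blows far past exp(EPS)
```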


Review for NeurIPS paper: A Discrete Variational Recurrent Topic Model without the Reparametrization Trick

Neural Information Processing Systems

Summary and Contributions: In this paper, the authors use neural variational inference to construct a neural topic model with discrete random variables and propose one model, namely VRTM, which combines RNNs and topic models. 1. The exploration of combining RNNs and topic models is interesting and significant: it can help topic models handle sequential text and capture more information than the bag-of-words representation prevalently utilized in LDA-based topic models. Specifically, for thematic words, VRTM uses both the RNN and topic-model predictions to generate the next word; for syntactic words, only the output of the RNN is utilized to predict the next word. In particular, during the generative process, a discrete topic assignment is attached to each thematic word, which is beneficial for interpretability. To be specific, the authors first design a reasonable generative model that applies different strategies for generating thematic and syntactic words with different inputs, i.e., a mixture of LDA and RNN predictions or just the output of the RNN.


Review for NeurIPS paper: A Discrete Variational Recurrent Topic Model without the Reparametrization Trick

Neural Information Processing Systems

Reviews are all on the accept side: one in the top 50% of accepted papers and three marginally above threshold. Only R4 (strong accept) intervened in the discussion. As the main reason for calling this paper borderline was limited novelty compared to [7], I proceeded to a detailed comparative rereading of this paper against [7]. In my opinion, this approach is very different from [7]. While the authors presented it as introducing only a small modeling difference from [7], that difference has a huge impact on everything, in particular the resulting DNN architecture and the inference process.



spred: Solving $L_1$ Penalty with SGD

Ziyin, Liu, Wang, Zihao

arXiv.org Artificial Intelligence

We propose to minimize a generic differentiable objective with an $L_1$ constraint using a simple reparametrization and straightforward stochastic gradient descent. Our proposal directly generalizes previous ideas that the $L_1$ penalty may be equivalent to a differentiable reparametrization with weight decay. We prove that the proposed method, spred, is an exact differentiable solver of $L_1$ and that the reparametrization trick is completely "benign" for a generic nonconvex function. Practically, we demonstrate the usefulness of the method in (1) training sparse neural networks to perform gene selection tasks, which involves finding relevant features in a very high-dimensional space, and (2) the neural network compression task, where previous attempts at applying the $L_1$ penalty have been unsuccessful. Conceptually, our result bridges the gap between sparsity in deep learning and conventional statistical learning.
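The equivalence the abstract alludes to rests on the identity that $|w|$ equals the minimum weight-decay cost over factorizations $w = u \cdot v$, attained at $u = \sqrt{|w|}$. A minimal numerical check of that identity (an illustration, not the paper's code; the grid search stands in for what SGD with weight decay would find):

```python
import numpy as np

def weight_decay_cost(w, u):
    """Weight-decay cost (u^2 + v^2) / 2 of factoring w = u * v, with v = w / u."""
    return 0.5 * (u**2 + (w / u) ** 2)

w = 1.7
us = np.linspace(0.1, 5.0, 10_000)      # grid search over the factor u
best = weight_decay_cost(w, us).min()
# best is numerically close to |w| = 1.7: the minimal weight decay over
# factorizations w = u * v equals the L1 penalty on w.
```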


Hardware-aware Training Techniques for Improving Robustness of Ex-Situ Neural Network Transfer onto Passive TiO2 ReRAM Crossbars

Drolet, Philippe, Dawant, Raphaël, Yon, Victor, Mouny, Pierre-Antoine, Valdenaire, Matthieu, Zapata, Javier Arias, Gliech, Pierre, Wood, Sean U. N., Ecoffey, Serge, Alibart, Fabien, Beilliard, Yann, Drouin, Dominique

arXiv.org Artificial Intelligence

Passive resistive random access memory (ReRAM) crossbar arrays, a promising emerging technology used for analog matrix-vector multiplications, are far superior to their active (1T1R) counterparts in terms of the integration density. However, current transfers of neural network weights into the conductance state of the memory devices in the crossbar architecture are accompanied by significant losses in precision due to hardware variabilities such as sneak path currents, biasing scheme effects and conductance tuning imprecision. In this work, training approaches that adapt techniques such as dropout, the reparametrization trick and regularization to TiO2 crossbar variabilities are proposed in order to generate models that are better adapted to their hardware transfers. The viability of this approach is demonstrated by comparing the outputs and precision of the proposed hardware-aware network with those of a regular fully connected network over a few thousand weight transfers using the half moons dataset in a simulation based on experimental data. For the neural network trained using the proposed hardware-aware method, 79.5% of the test set's data points can be classified with an accuracy of 95% or higher, while only 18.5% of the test set's data points can be classified with this accuracy by the regularly trained neural network.
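The role of the reparametrization trick here---training through simulated conductance imprecision---can be sketched as a linear layer with Gaussian weight noise; the noise model and layer shape below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def noisy_forward(x, w, sigma, rng):
    """Linear layer with device variability injected into the weights.

    Programming imprecision on the crossbar is modeled as a Gaussian
    perturbation of each conductance, sampled via the reparametrization
    trick (w_noisy = w + sigma * eps) so that training can backpropagate
    through the noisy forward pass.
    """
    eps = rng.standard_normal(w.shape)
    return x @ (w + sigma * eps)

x = np.ones((2, 3))
w = np.ones((3, 2))
y = noisy_forward(x, w, 0.0, np.random.default_rng(0))  # sigma = 0 recovers x @ w
```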


Disentangling Variational Autoencoders

Pastrana, Rafael

arXiv.org Artificial Intelligence

A variational autoencoder (VAE) is a probabilistic machine learning framework for posterior inference that projects an input set of high-dimensional data to a lower-dimensional, latent space. The latent space learned with a VAE offers exciting opportunities to develop new data-driven design processes in creative disciplines, in particular, to automate the generation of multiple novel designs that are aesthetically reminiscent of the input data but that were unseen during training. However, the learned latent space is typically disorganized and entangled: traversing the latent space along a single dimension does not result in changes to single visual attributes of the data. The lack of latent structure impedes designers from deliberately controlling the visual attributes of new designs generated from the latent space. This paper presents an experimental study that investigates latent space disentanglement. We implement three different VAE models from the literature and train them on a publicly available dataset of 60,000 images of hand-written digits. We perform a sensitivity analysis to find a small number of latent dimensions necessary to maximize a lower bound to the log marginal likelihood of the data. Furthermore, we investigate the trade-offs between the quality of the reconstruction of the decoded images and the level of disentanglement of the latent space. We are able to automatically align three latent dimensions with three interpretable visual properties of the digits: line weight, tilt and width. Our experiments suggest that i) increasing the contribution of the Kullback-Leibler divergence between the prior over the latents and the variational distribution to the evidence lower bound, and ii) conditioning on the input image class, enhance the learning of a disentangled latent space with a VAE.
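Point i) corresponds to a beta-VAE-style objective, where the KL term of the ELBO is up-weighted by a factor beta > 1. A minimal sketch with a diagonal-Gaussian posterior and a standard-normal prior (a generic formulation, not the authors' implementation):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def beta_vae_loss(recon_nll, mu, logvar, beta=4.0):
    """Negative beta-ELBO: reconstruction term plus an up-weighted KL term."""
    return recon_nll + beta * gaussian_kl(mu, logvar)

# At the prior (mu = 0, logvar = 0) the KL term vanishes, so the loss
# reduces to the reconstruction term regardless of beta.
loss = beta_vae_loss(2.0, np.zeros(3), np.zeros(3), beta=4.0)
```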


Understanding AutoEncoders with an Example: A Step-by-Step Tutorial

#artificialintelligence

This is the second (and last) article of the "Understanding AutoEncoders with an example" series. In the first article, we generated a synthetic dataset and built a vanilla autoencoder to reconstruct images of circles. We'll be using the same dataset once again, so please check the section "An MNIST-like Dataset of Circles" for a refresher, if needed. We'll also understand what the famous reparametrization trick is, and the role of the Kullback-Leibler divergence/loss. You're invited to read this series of articles while running its accompanying notebook, available on my GitHub's "Accompanying Notebooks" repository, using Google Colab. Moreover, I built a Table of Contents to help you navigate the topics across the two articles, should you use it as a mini-course and work your way through the content one topic at a time.
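The trick the tutorial covers can be sketched in a few lines of numpy (a generic illustration, not the notebook's code):

```python
import numpy as np

def reparametrize(mu, logvar, rng):
    """Draw z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, 1).

    All the randomness lives in eps, so gradients can flow through mu
    and logvar -- this is the reparametrization trick used in VAEs.
    """
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * logvar) * eps

rng = np.random.default_rng(0)
z = reparametrize(np.zeros(4), np.zeros(4), rng)  # four draws from N(0, 1)
```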


Diffusion Variational Autoencoders

Rey, Luis A. Pérez, Menkovski, Vlado, Portegies, Jacobus W.

arXiv.org Machine Learning

A standard Variational Autoencoder, with a Euclidean latent space, is structurally incapable of capturing topological properties of certain datasets. To remove topological obstructions, we introduce Diffusion Variational Autoencoders with arbitrary manifolds as a latent space. A Diffusion Variational Autoencoder uses transition kernels of Brownian motion on the manifold. In particular, it uses properties of the Brownian motion to implement the reparametrization trick and fast approximations to the KL divergence. We show that the Diffusion Variational Autoencoder is capable of capturing topological properties of synthetic datasets. Additionally, we train on MNIST with spheres, tori, projective spaces, SO(3), and a torus embedded in R^3 as latent spaces. Although a natural dataset like MNIST does not have latent variables with a clear-cut topological structure, training it on a manifold can still highlight topological and geometrical properties.
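A crude way to sample such a transition kernel on, say, the unit sphere is a projected Gaussian step whose randomness is a separate standard normal, in the spirit of the reparametrization trick (an illustrative approximation, not the paper's exact kernel):

```python
import numpy as np

def brownian_step_on_sphere(z, t, rng):
    """One crude Euler step of Brownian motion on the unit sphere.

    Add an ambient Gaussian increment with variance t, then project back
    onto the sphere. The randomness is an independent standard normal,
    so the step is differentiable in z, as in the reparametrization
    trick. This is an illustrative approximation of the transition
    kernel, not the paper's implementation.
    """
    step = z + np.sqrt(t) * rng.standard_normal(np.shape(z))
    return step / np.linalg.norm(step, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
z0 = np.array([1.0, 0.0, 0.0])
z1 = brownian_step_on_sphere(z0, 0.1, rng)  # result stays on the unit sphere
```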