Latent Template Induction with Gumbel-CRFs
Learning to control the structure of sentences is a challenging problem in text generation. Existing work either relies on simple deterministic approaches or RL-based hard structures. We explore the use of structured variational autoencoders to infer latent templates for sentence generation using a soft, continuous relaxation in order to utilize reparameterization for training. Specifically, we propose a Gumbel-CRF, a continuous relaxation of the CRF sampling algorithm using a relaxed Forward-Filtering Backward-Sampling (FFBS) approach. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than score-function based estimators. As a structured inference network, we show that it learns interpretable templates during training, which allows us to control the decoder during testing. We demonstrate the effectiveness of our methods with experiments on data-to-text generation and unsupervised paraphrase generation.
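The relaxed FFBS idea described in the abstract can be sketched in plain Python: forward filtering computes the standard CRF forward messages, and backward sampling replaces each exact categorical draw with a Gumbel-Softmax relaxation so the sampled template stays differentiable in the potentials. This is a minimal illustrative sketch under assumed conventions, not the paper's implementation; the function names, the pure-Python setup, and the temperature value are all assumptions.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def log_softmax(xs):
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def gumbel_softmax(logits, tau, rng):
    # Perturb logits with Gumbel(0, 1) noise, then anneal with temperature tau.
    # As tau -> 0 this approaches a hard one-hot categorical sample.
    g = [-math.log(-math.log(rng.random())) for _ in logits]
    return softmax([(l + gi) / tau for l, gi in zip(logits, g)])

def relaxed_ffbs(emissions, transition, tau=0.5, seed=0):
    """Relaxed forward-filtering backward-sampling for a K-state linear-chain CRF.

    emissions:  T x K log-potentials; transition: K x K log-potentials.
    Returns a list of T soft (relaxed one-hot) state vectors.
    """
    rng = random.Random(seed)
    T, K = len(emissions), len(emissions[0])
    # Forward filtering: alpha[t][k] = log-sum of all prefixes ending in state k.
    alpha = [list(emissions[0])]
    for t in range(1, T):
        prev, row = alpha[-1], []
        for k in range(K):
            scores = [prev[j] + transition[j][k] for j in range(K)]
            m = max(scores)
            row.append(emissions[t][k] + m
                       + math.log(sum(math.exp(s - m) for s in scores)))
        alpha.append(row)
    # Backward sampling, with each categorical draw relaxed via Gumbel-Softmax.
    z = [None] * T
    z[T - 1] = gumbel_softmax(log_softmax(alpha[T - 1]), tau, rng)
    for t in range(T - 2, -1, -1):
        # Condition on the soft next state: mix transition columns by z[t+1].
        logits = [alpha[t][j]
                  + sum(z[t + 1][k] * transition[j][k] for k in range(K))
                  for j in range(K)]
        z[t] = gumbel_softmax(log_softmax(logits), tau, rng)
    return z
```

Because every step is built from softmaxes of noisy logits, the whole sample is a differentiable function of the emission and transition potentials, which is what makes the reparameterization trick applicable.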
Author Response. Clarifying Technical Contributions (R3 / R4): Gradient Estimation
We thank all reviewers for their detailed, constructive feedback and suggestions. Table B (below) demonstrates this empirically: Gumbel-CRF achieves this with significantly less training time and resource consumption. These experiments show that, when trained with Gumbel-CRF, the AR decoder outperforms REINFORCE. We will clarify this in the paper.
Review for NeurIPS paper: Latent Template Induction with Gumbel-CRFs
Summary and Contributions: Neural template models [1, 2] are appealing for their interpretability and controllability. Such models can be trained in a VAE framework with a CRF as the posterior. Because discrete templates are non-differentiable, this paper investigates Gumbel-Softmax as the gradient estimator for the posterior distribution, compared against estimators such as REINFORCE and PM-MRF used by previous work. Empirically, the Gumbel estimator demonstrates lower variance and better performance on unsupervised paraphrasing and data-to-text generation than comparable baselines.

------------ After Rebuttal ------------
Thank you for the response!
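For contrast with the reparameterized estimator, here is a minimal sketch of the score-function (REINFORCE) estimator for a categorical latent variable, the baseline the review refers to. The function name and setup are illustrative assumptions, not from the paper; REINFORCE is unbiased but typically exhibits higher variance than pathwise estimators, which is the motivation for the Gumbel-CRF relaxation.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def reinforce_grad(logits, f, n_samples=1000, seed=0):
    """Score-function (REINFORCE) estimate of d/dtheta E_{z ~ softmax(theta)}[f(z)].

    For categorical z with p = softmax(theta), the score is
    d log p(z = k) / d theta_j = 1[j == k] - p_j,
    so each sample contributes f(k) * (1[j == k] - p_j) to grad[j].
    """
    rng = random.Random(seed)
    p = softmax(logits)
    K = len(logits)
    grad = [0.0] * K
    for _ in range(n_samples):
        # Sample z ~ Categorical(p) by inverse-CDF.
        u, k, acc = rng.random(), 0, p[0]
        while u > acc and k < K - 1:
            k += 1
            acc += p[k]
        fk = f(k)
        for j in range(K):
            grad[j] += fk * ((1.0 if j == k else 0.0) - p[j])
    return [g / n_samples for g in grad]
```

Note that the estimator only ever observes the scalar reward f(k); it never differentiates through f or through the sample itself, which is why its variance grows with the spread of the rewards, whereas a reparameterized estimator propagates exact gradients through the relaxed sample.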
Meta-Review for NeurIPS paper: Latent Template Induction with Gumbel-CRFs
This paper considers the text generation task in a VAE framework where the latent variables of a CRF serve as the template for generation. The paper uses Gumbel-Softmax as the gradient estimator for the posterior distribution. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than alternatives such as REINFORCE and PM-MRF. The proposed method is tested on a variety of text modelling tasks. Reviewers agree this is an important and difficult problem.
Authors: Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, Alexander M. Rush