Latent Template Induction with Gumbel-CRFs
Learning to control the structure of sentences is a challenging problem in text generation. Existing work either relies on simple deterministic approaches or RL-based hard structures. We explore the use of structured variational autoencoders to infer latent templates for sentence generation using a soft, continuous relaxation in order to utilize reparameterization for training. Specifically, we propose a Gumbel-CRF, a continuous relaxation of the CRF sampling algorithm using a relaxed Forward-Filtering Backward-Sampling (FFBS) approach. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than score-function based estimators. As a structured inference network, we show that it learns interpretable templates during training, which allows us to control the decoder during testing. We demonstrate the effectiveness of our methods with experiments on data-to-text generation and unsupervised paraphrase generation.
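The relaxed FFBS idea described in the abstract can be sketched in plain Python: forward filtering computes the standard CRF forward messages, and backward sampling replaces each exact categorical draw with a Gumbel-Softmax relaxation so the sampled template stays differentiable in the potentials. This is a minimal illustrative sketch under assumed conventions, not the paper's implementation; the function names, the pure-Python setup, and the temperature value are all assumptions.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def log_softmax(xs):
    m = max(xs)
    lse = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - lse for x in xs]

def gumbel_softmax(logits, tau, rng):
    # Perturb logits with Gumbel(0, 1) noise, then anneal with temperature tau.
    # As tau -> 0 this approaches a hard one-hot categorical sample.
    g = [-math.log(-math.log(rng.random())) for _ in logits]
    return softmax([(l + gi) / tau for l, gi in zip(logits, g)])

def relaxed_ffbs(emissions, transition, tau=0.5, seed=0):
    """Relaxed forward-filtering backward-sampling for a K-state linear-chain CRF.

    emissions:  T x K log-potentials; transition: K x K log-potentials.
    Returns a list of T soft (relaxed one-hot) state vectors.
    """
    rng = random.Random(seed)
    T, K = len(emissions), len(emissions[0])
    # Forward filtering: alpha[t][k] = log-sum of all prefixes ending in state k.
    alpha = [list(emissions[0])]
    for t in range(1, T):
        prev, row = alpha[-1], []
        for k in range(K):
            scores = [prev[j] + transition[j][k] for j in range(K)]
            m = max(scores)
            row.append(emissions[t][k] + m
                       + math.log(sum(math.exp(s - m) for s in scores)))
        alpha.append(row)
    # Backward sampling, with each categorical draw relaxed via Gumbel-Softmax.
    z = [None] * T
    z[T - 1] = gumbel_softmax(log_softmax(alpha[T - 1]), tau, rng)
    for t in range(T - 2, -1, -1):
        # Condition on the soft next state: mix transition columns by z[t+1].
        logits = [alpha[t][j]
                  + sum(z[t + 1][k] * transition[j][k] for k in range(K))
                  for j in range(K)]
        z[t] = gumbel_softmax(log_softmax(logits), tau, rng)
    return z
```

Because every step is built from softmaxes of noisy logits, the whole sample is a differentiable function of the emission and transition potentials, which is what makes the reparameterization trick applicable.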
Author Response. Clarifying Technical Contributions (R3 / R4): Gradient Estimation
We thank all reviewers for their detailed, constructive feedback and suggestions. Table B (below) demonstrates this empirically: Gumbel-CRF achieves this with significantly less training time and resource consumption. These experiments show that, when trained with Gumbel-CRF, the AR decoder outperforms REINFORCE. We will clarify this in the paper.
Review for NeurIPS paper: Latent Template Induction with Gumbel-CRFs
Summary and Contributions: Neural template models [1, 2] are appealing for their interpretability and controllability. Such models can be trained in a VAE framework with a CRF as the posterior. Because discrete templates are non-differentiable, this paper investigates Gumbel-Softmax as the gradient estimator for the posterior distribution, compared against estimators such as REINFORCE and PM-MRF used by previous work. Empirically, the Gumbel estimator demonstrates lower variance and better performance on unsupervised paraphrasing and data-to-text generation than comparable baselines.

------------ After Rebuttal ------------
Thank you for the response!
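For contrast with the reparameterized estimator, here is a minimal sketch of the score-function (REINFORCE) estimator for a categorical latent variable, the baseline the review refers to. The function name and setup are illustrative assumptions, not from the paper; REINFORCE is unbiased but typically exhibits higher variance than pathwise estimators, which is the motivation for the Gumbel-CRF relaxation.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def reinforce_grad(logits, f, n_samples=1000, seed=0):
    """Score-function (REINFORCE) estimate of d/dtheta E_{z ~ softmax(theta)}[f(z)].

    For categorical z with p = softmax(theta), the score is
    d log p(z = k) / d theta_j = 1[j == k] - p_j,
    so each sample contributes f(k) * (1[j == k] - p_j) to grad[j].
    """
    rng = random.Random(seed)
    p = softmax(logits)
    K = len(logits)
    grad = [0.0] * K
    for _ in range(n_samples):
        # Sample z ~ Categorical(p) by inverse-CDF.
        u, k, acc = rng.random(), 0, p[0]
        while u > acc and k < K - 1:
            k += 1
            acc += p[k]
        fk = f(k)
        for j in range(K):
            grad[j] += fk * ((1.0 if j == k else 0.0) - p[j])
    return [g / n_samples for g in grad]
```

Note that the estimator only ever observes the scalar reward f(k); it never differentiates through f or through the sample itself, which is why its variance grows with the spread of the rewards, whereas a reparameterized estimator propagates exact gradients through the relaxed sample.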
Meta-Review for NeurIPS paper: Latent Template Induction with Gumbel-CRFs
This paper considers the text generation task in a VAE framework where the latent variables of a CRF serve as the template for generation. The paper uses Gumbel-Softmax as the gradient estimator for the posterior distribution. As a reparameterized gradient estimator, the Gumbel-CRF gives more stable gradients than alternatives such as REINFORCE and PM-MRF. The proposed method is tested on a variety of text modelling tasks. Reviewers agree this is an important and difficult problem.
Authors: Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, Alexander M. Rush