Review for NeurIPS paper: Learning Sparse Prototypes for Text Generation
Neural Information Processing Systems
This paper builds on the prototype-driven text generation approach of Guu et al. (2018) and makes two major changes. First, it models a sparse distribution over prototypes by placing a Dirichlet prior over a multinomial. Second, it actually learns this sparse distribution. At training time, the paper uses amortized variational inference and further approximates the gradients with REINFORCE to cope with the large number of prototypes. At inference time, the model can keep fewer training examples in memory by retaining only those whose posterior probability exceeds a threshold. As a result, both the memory required to store training examples and the time spent retrieving them are reduced.
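The inference-time filtering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `prune_prototypes`, the example threshold, and the renormalization of the surviving probabilities are all assumptions made for illustration, since the review does not specify them.

```python
from typing import List, Sequence, Tuple


def prune_prototypes(
    prototypes: Sequence[str],
    probs: Sequence[float],
    threshold: float,
) -> Tuple[List[str], List[float]]:
    """Keep prototypes whose probability is strictly above `threshold`.

    Hypothetical helper: `probs` stands in for the learned posterior
    probability of each prototype. Renormalizing the kept probabilities
    is an assumption, not something stated in the review.
    """
    if len(prototypes) != len(probs):
        raise ValueError("prototypes and probs must have the same length")

    # Filter out prototypes with negligible probability mass.
    kept = [(p, w) for p, w in zip(prototypes, probs) if w > threshold]
    total = sum(w for _, w in kept)
    if total == 0:
        return [], []

    kept_prototypes = [p for p, _ in kept]
    # Renormalize so the surviving probabilities sum to one (assumption).
    kept_probs = [w / total for _, w in kept]
    return kept_prototypes, kept_probs


# Toy data: a sparse distribution where most mass sits on a few prototypes.
example_prototypes = ["the food was great", "service was slow", "nice place", "ok"]
example_probs = [0.6, 0.3, 0.0999, 0.0001]
kept_prototypes, kept_probs = prune_prototypes(
    example_prototypes, example_probs, threshold=0.05
)
```

With a sparse distribution, most prototypes fall below the threshold, so the retained set (and therefore the retrieval index) is much smaller than the full training set.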
Jan-27-2025, 10:19:45 GMT