 Machine Translation



Blockwise Parallel Decoding for Deep Autoregressive Models

Neural Information Processing Systems

To overcome this limitation, we propose a novel blockwise parallel decoding scheme in which we make predictions for multiple time steps in parallel, then back off to the longest prefix validated by a scoring model.
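The predict-then-back-off loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `propose_block` and `score_next` are hypothetical callables standing in for the parallel proposal models and the scoring model, and the validation is written sequentially for readability, whereas a real system would score all proposed positions in one parallel pass.

```python
from typing import Callable, List, Sequence


def blockwise_decode(
    propose_block: Callable[[Sequence[int]], Sequence[int]],
    score_next: Callable[[Sequence[int]], int],
    prefix: Sequence[int],
    max_len: int,
    eos: int,
) -> List[int]:
    """Greedy decoding that accepts several tokens per step when possible.

    propose_block(context) -> guesses for the next few tokens (hypothetical).
    score_next(context)    -> the scoring model's greedy next token (hypothetical).
    """
    output = list(prefix)
    while len(output) < max_len:
        accepted = 0
        # Keep the longest prefix of the proposals that the scoring model
        # would also have produced, one token at a time.
        for token in propose_block(output):
            if len(output) >= max_len or score_next(output) != token:
                break
            output.append(token)
            accepted += 1
            if token == eos:
                return output
        # If nothing was validated, fall back to a single token from the
        # scoring model so that decoding always makes progress.
        if accepted == 0 and len(output) < max_len:
            token = score_next(output)
            output.append(token)
            if token == eos:
                return output
    return output
```

Because every accepted token is one the scoring model would have chosen anyway, this sketch returns the same sequence as plain greedy decoding with `score_next`, only in fewer outer iterations when the proposals are good.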



On the Dimensionality of Word Embedding

Neural Information Processing Systems

In this paper, we provide a theoretical understanding of word embedding and its dimensionality. Motivated by the unitary-invariance of word embedding, we propose the Pairwise Inner Product (PIP) loss, a novel metric on the dissimilarity between word embeddings. Using techniques from matrix perturbation theory, we reveal a fundamental bias-variance trade-off in dimensionality selection for word embeddings. This bias-variance trade-off sheds light on many empirical observations which were previously unexplained, for example the existence of an optimal dimensionality. Moreover, new insights and discoveries, like when and how word embeddings are robust to over-fitting, are revealed. By optimizing over the bias-variance trade-off of the PIP loss, we can explicitly answer the open question of dimensionality selection for word embedding.
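As a small illustration of why a loss built on pairwise inner products respects unitary invariance, the sketch below assumes the PIP loss is the Frobenius norm of the difference between the two PIP matrices E E^T. That definition is not stated in the excerpt above, and the helper names are made up for the example.

```python
import math
from typing import List

Matrix = List[List[float]]


def pip_matrix(embedding: Matrix) -> Matrix:
    """Return E E^T: the inner product of every pair of row vectors."""
    return [
        [sum(a * b for a, b in zip(u, v)) for v in embedding]
        for u in embedding
    ]


def pip_loss(e1: Matrix, e2: Matrix) -> float:
    """Frobenius norm of PIP(e1) - PIP(e2) (assumed definition)."""
    p1, p2 = pip_matrix(e1), pip_matrix(e2)
    return math.sqrt(
        sum((x - y) ** 2 for r1, r2 in zip(p1, p2) for x, y in zip(r1, r2))
    )


def rotate_2d(embedding: Matrix, theta: float) -> Matrix:
    """Apply the same 2-D rotation (a unitary map) to every row vector."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * x - s * y, s * x + c * y] for x, y in embedding]


if __name__ == "__main__":
    E = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
    # Rotating all vectors leaves every inner product unchanged,
    # so the loss is zero up to floating-point error.
    print(pip_loss(E, rotate_2d(E, 0.7)))
    # Rescaling the vectors changes the inner products, so the loss is positive.
    print(pip_loss(E, [[2.0 * x for x in row] for row in E]))
```

The two embeddings being compared need the same number of rows (words) but not the same dimensionality, which is what makes such a loss usable for comparing embeddings of different sizes.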





Adaptive Methods for Nonconvex Optimization

Neural Information Processing Systems

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada. [...] is often attributed to the rapid decay of the learning rate when gradients are dense, which is common in machine learning applications.


TETRIS: TilE-matching the TRemendous Irregular Sparsity

Neural Information Processing Systems

Compressing neural networks by pruning small-magnitude weights can significantly reduce computation and storage costs. Although pruning makes the model smaller, it is difficult to obtain a practical speedup on modern computing platforms such as CPUs and GPUs because the resulting sparsity is irregular.
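To make the irregularity concrete, here is a minimal sketch of plain magnitude pruning, the baseline step the excerpt describes rather than the TETRIS method itself. The threshold value and function names are illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]


def prune_by_magnitude(weights: Matrix, threshold: float) -> Matrix:
    """Zero out every weight whose absolute value is below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def sparsity(weights: Matrix) -> float:
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


if __name__ == "__main__":
    W = [[0.5, -0.01, 0.3], [0.02, -0.7, 0.0]]
    pruned = prune_by_magnitude(W, threshold=0.1)
    for row in pruned:
        print(row)
    print(sparsity(pruned))
```

The surviving weights sit at scattered positions that differ from row to row, so a dense matrix kernel still has to touch every entry unless the zeros are regrouped into regular blocks.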


Content preserving text generation with attribute controls

Neural Information Processing Systems

In this work, we address the problem of modifying textual attributes of sentences. Given an input sentence and a set of attribute labels, we attempt to generate sentences that are compatible with the conditioning information.