LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
Xiang Li, Tao Qin, Jian Yang, Tie-Yan Liu
Neural Information Processing Systems
Recurrent neural networks (RNNs) have achieved state-of-the-art performance in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model becomes very large (e.g., possibly beyond the memory capacity of a GPU device) and its training becomes very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use a 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary to a cell of a table, each row of which is associated with a vector and each column of which is associated with another vector.
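The sketch below illustrates the 2C shared-embedding idea described in the abstract: words are placed in a roughly sqrt(|V|)-by-sqrt(|V|) table, and each word is represented by the row vector and column vector of its cell, so words in the same row (or column) share a component and only about 2*sqrt(|V|) vectors are stored. This is a minimal illustration, not the authors' implementation; all names, sizes, and the simple fixed word-to-cell allocation are assumptions (in the paper the allocation is refined during training).

```python
import numpy as np

# Hypothetical sizes for illustration only.
vocab_size = 10000                                # |V|
table_size = int(np.ceil(np.sqrt(vocab_size)))    # rows = columns = ceil(sqrt(|V|))
embed_dim = 64

rng = np.random.default_rng(0)
# Only 2 * table_size embedding vectors in total, instead of |V|.
row_embeddings = rng.normal(scale=0.1, size=(table_size, embed_dim))
col_embeddings = rng.normal(scale=0.1, size=(table_size, embed_dim))

def word_to_cell(word_id: int) -> tuple:
    """Map a word id to a (row, column) cell of the table.

    A simple fixed allocation for illustration; the paper learns/refines
    the word-to-cell allocation during training."""
    return word_id // table_size, word_id % table_size

def embed(word_id: int) -> tuple:
    """Return the two components (row vector, column vector) for a word."""
    r, c = word_to_cell(word_id)
    return row_embeddings[r], col_embeddings[c]

# Words placed in the same row share the same row vector.
x_row, _ = embed(5)
y_row, _ = embed(7)
assert np.allclose(x_row, y_row)   # ids 5 and 7 both fall in row 0 of the table
```

In this toy setting, 10,000 words need only 2 x 100 = 200 stored vectors, which is the memory saving the abstract refers to.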