AITopics | lightrnn

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Neural Information Processing Systems | Mar-17-2026, 11:05:59 GMT

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector.

artificial intelligence, machine learning, natural language, (11 more...)

Neural Information Processing Systems

Genre: Research Report (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Neural Information Processing Systems | Nov-21-2025, 15:21:45 GMT

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector.

computation-efficient recurrent neural network, lightrnn, vector, (8 more...)

Neural Information Processing Systems

Genre: Research Report (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

Reviews: LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Neural Information Processing Systems | Jan-20-2025, 19:06:30 GMT

This work provides a novel and effective way to reduce the number of parameters for models that must handle large vocabularies. The drop in model size, by several orders of magnitude, could allow some large models to be ported to a phone, which may not have been possible previously. I find it really interesting that a single method can reduce both the input and the output parameter size, whereas previous work on softmaxes has only tackled the output side. However, some technical details are lacking and the description is confusing in places. In particular, I find Figure 2 and the unnumbered equation after Eq. 1 confusing.

computation-efficient recurrent neural network, lightrnn, review, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Li, Xiang, Qin, Tao, Yang, Jian

Neural Information Processing Systems | Apr-10-2023, 10:40:53 GMT

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector.

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Li, Xiang, Qin, Tao, Yang, Jian, Liu, Tie-Yan

Neural Information Processing Systems | Feb-14-2020, 15:56:58 GMT

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector.

computation-efficient recurrent neural network, lightrnn, vector, (6 more...)

Neural Information Processing Systems

Genre: Research Report (0.57)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.63)

Slim Embedding Layers for Recurrent Neural Language Models

Li, Zhongliang (Wright State University) | Kulhanek, Raymond (Wright State University) | Wang, Shaojun (SVAIL, Baidu Research) | Zhao, Yunxin (University of Missouri) | Wu, Shuang (Yitu. Inc)

AAAI Conferences | Feb-8-2018

Recurrent neural language models are the state-of-the-art models for language modeling. When the vocabulary size is large, the space taken to store the model parameters becomes the bottleneck for the use of recurrent neural language models. In this paper, we introduce a simple space compression method that randomly shares the structured parameters at both the input and output embedding layers of recurrent neural language models. This significantly reduces the number of model parameters while still compactly representing the original input and output embedding layers. The method is easy to implement and tune. Experiments on several data sets show that the new method achieves similar perplexity and BLEU scores while using only a tiny fraction of the parameters.

language model, output layer, vector, (15 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States > Missouri (0.04)
North America > United States > Ohio (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Li, Xiang, Qin, Tao, Yang, Jian, Liu, Tie-Yan

Neural Information Processing Systems | Dec-31-2016

Recurrent neural networks (RNNs) have achieved state-of-the-art performances in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model will become very big (e.g., possibly beyond the memory capacity of a GPU device) and its training will become very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector. Depending on its position in the table, a word is jointly represented by two components: a row vector and a column vector. Since the words in the same row share the row vector and the words in the same column share the column vector, we only need $2 \sqrt{|V|}$ vectors to represent a vocabulary of $|V|$ unique words, far fewer than the $|V|$ vectors required by existing approaches. Based on the 2-Component shared embedding, we design a new RNN algorithm and evaluate it using the language modeling task on several benchmark datasets. The results show that our algorithm significantly reduces the model size and speeds up the training process without sacrificing accuracy (it achieves similar, if not better, perplexity compared to state-of-the-art language models). Remarkably, on the One-Billion-Word benchmark dataset, our algorithm achieves perplexity comparable to previous language models, while reducing the model size by a factor of 40-100 and speeding up the training process by a factor of 2. We name our proposed algorithm LightRNN to reflect its very small model size and very high training speed.
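
The table layout described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `build_2c_embedding` and `lookup`, the random initialization, and the fixed allocation of word ids to cells (row = id // side, column = id % side) are all assumptions made for illustration, since the abstract does not say how words are assigned to table positions. The sketch only shows why roughly $2 \sqrt{|V|}$ vectors suffice.

```python
import math
import numpy as np


def build_2c_embedding(vocab_size, dim, seed=0):
    """Create row and column vector tables for a 2-component shared embedding.

    Hypothetical helper: the table is square, with the smallest side such that
    side * side >= vocab_size.
    """
    side = math.isqrt(vocab_size - 1) + 1
    rng = np.random.default_rng(seed)
    row_vectors = rng.normal(size=(side, dim))  # one vector per table row
    col_vectors = rng.normal(size=(side, dim))  # one vector per table column
    return side, row_vectors, col_vectors


def lookup(word_id, side, row_vectors, col_vectors):
    """Return the (row vector, column vector) pair that jointly represents a word.

    Assumption: word ids are placed in the table in row-major order.
    """
    row, col = divmod(word_id, side)
    return row_vectors[row], col_vectors[col]


if __name__ == "__main__":
    vocab_size, dim = 10_000, 8
    side, rows, cols = build_2c_embedding(vocab_size, dim)
    shared = rows.shape[0] + cols.shape[0]
    print(f"table side: {side}")
    print(f"shared vectors: {shared} (vs. {vocab_size} for one vector per word)")
    row_vec, col_vec = lookup(4321, side, rows, cols)
    print(row_vec.shape, col_vec.shape)
```

With this layout, two words in the same table row return the same row vector and differ only in their column vector, which is where the parameter saving comes from.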

lightrnn, model size, vector, (12 more...)

Neural Information Processing Systems

Country: