A Neural Probabilistic Language Model
Yoshua Bengio, Réjean Ducharme, Pascal Vincent
Neural Information Processing Systems
A goal of statistical language modeling is to learn the joint probability function of sequences of words. This is intrinsically difficult because of the curse of dimensionality: we propose to fight it with its own weapons. In the proposed approach one learns simultaneously (1) a distributed representation for each word (i.e. a similarity between words) along with (2) the probability function for word sequences, expressed with these representations. Generalization is obtained because a sequence of words that has never been seen before gets high probability if it is made of words that are similar to words forming an already seen sentence. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach very significantly improves on a state-of-the-art trigram model.

1 Introduction

A fundamental problem that makes language modeling and other learning problems difficult is the curse of dimensionality. It is particularly obvious when one wants to model the joint distribution between many discrete random variables (such as words in a sentence, or discrete attributes in a data-mining task).
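To make the idea concrete, the following is a minimal sketch (in Python with PyTorch, an assumption of this illustration; it is not the authors' original implementation) of a model that jointly learns word feature vectors and a next-word probability function: each context word is mapped to a learned embedding, the embeddings are concatenated, passed through a tanh hidden layer, and a softmax over the vocabulary gives the probability of the next word. The class name and all layer sizes below are illustrative, not the paper's.

import torch
import torch.nn as nn

class NeuralProbabilisticLM(nn.Module):
    # Each of the (n-1) context words is mapped to a learned feature vector;
    # the concatenated vectors feed a tanh hidden layer, and a softmax over
    # the vocabulary gives the probability of the next word.
    def __init__(self, vocab_size, embed_dim=60, context_size=3, hidden_dim=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)       # shared word feature vectors
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)
        self.output = nn.Linear(hidden_dim, vocab_size)

    def forward(self, context):                                # context: (batch, context_size) word indices
        x = self.embed(context).flatten(start_dim=1)           # concatenate context embeddings
        h = torch.tanh(self.hidden(x))
        return torch.log_softmax(self.output(h), dim=-1)       # log P(next word | context)

# Illustrative usage: score next-word distributions for a batch of 3-word contexts.
model = NeuralProbabilisticLM(vocab_size=10000)
contexts = torch.randint(0, 10000, (8, 3))
log_probs = model(contexts)                                    # shape (8, 10000)

Because words that occur in similar contexts are driven toward nearby feature vectors, a word sequence never seen in training can still receive a reasonable probability if it is composed of words similar to those in observed sequences, which is the generalization mechanism the abstract describes.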
Dec-31-2001