Learning long-term dependencies is not as difficult with NARX networks

Lin, Tsungnan, Horne, Bill G., Tiño, Peter, Giles, C. Lee

Dec-31-1996–Neural Information Processing Systems

It has recently been shown that gradient descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies. In this paper we explore this problem for a class of architectures called NARX networks, which have powerful representational capabilities. Previous work reported that gradient descent learning is more effective in NARX networks than in recurrent networks with "hidden states". We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on such problems. We present some experimental'results that show that NARX networks can often retain information for two to three times as long as conventional recurrent networks.

dependency, long-term dependency, narx network, (15 more...)

Neural Information Processing Systems

Dec-31-1996

Conferences PDF

Add feedback

Country:
- North America > United States
  - New Jersey > Mercer County
    - Princeton (0.14)
  - Maryland > Prince George's County
    - College Park (0.14)
  - Florida > Orange County
    - Orlando (0.04)
- Europe
  - Slovakia > Bratislava
    - Bratislava (0.04)
  - Germany > North Rhine-Westphalia
    - Upper Bavaria > Munich (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning > Gradient Descent (0.56)
  - Neural Networks > Deep Learning (0.36)

Duplicate Docs Excel Report

Title
Learning long-term dependencies is not as difficult with NARX networks

Similar Docs Excel Report more

Title	Similarity	Source
None found