Preventing Gradient Explosions in Gated Recurrent Units
Sekitoshi Kanai, Yasuhiro Fujiwara, Sotetsu Iwamura
A gated recurrent unit (GRU) is a successful recurrent neural network architecture for time-series data. The GRU is typically trained with a gradient-based method, which is subject to the exploding gradient problem, in which the gradient grows extremely large. This problem is caused by an abrupt change in the dynamics of the GRU due to a small variation in the parameters. In this paper, we identify a condition under which the dynamics of the GRU change drastically and propose a learning method that addresses the exploding gradient problem. Our method constrains the dynamics of the GRU so that they do not change drastically. We evaluated our method in experiments on language modeling and polyphonic music modeling. The experiments showed that our method can prevent the exploding gradient problem and improve modeling accuracy.
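The abstract does not spell out the constraint, but the review below notes that it amounts to bounding the spectral norm of the state-to-state weight matrix, which keeps the dynamics stable near the zero fixed point. A minimal sketch of that idea in PyTorch might look like the following; the power-iteration estimate, the projection-by-rescaling step, and the threshold of 1.0 are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch (not the paper's exact algorithm): estimate the
# spectral norm of a weight matrix by power iteration and, if it
# exceeds a threshold, rescale the matrix back onto the constraint set.
import torch

def spectral_norm(W: torch.Tensor, n_iters: int = 30) -> torch.Tensor:
    """Approximate the largest singular value of W via power iteration."""
    u = torch.randn(W.shape[0], device=W.device)
    for _ in range(n_iters):
        v = W.t() @ u
        v = v / (v.norm() + 1e-12)
        u = W @ v
        u = u / (u.norm() + 1e-12)
    return u @ W @ v  # approximately sigma_max(W)

def project_spectral_(W: torch.Tensor, max_norm: float = 1.0) -> None:
    """Rescale W in place so its spectral norm is at most max_norm."""
    with torch.no_grad():
        sigma = spectral_norm(W)
        if sigma > max_norm:
            W.mul_(max_norm / sigma)

# Example: constrain the state-to-state blocks of a PyTorch GRU.
# weight_hh_l0 stacks the reset/update/candidate blocks row-wise.
gru = torch.nn.GRU(input_size=16, hidden_size=32)
for block in gru.weight_hh_l0.data.chunk(3, dim=0):
    project_spectral_(block, max_norm=1.0)
```

Applying such a projection after every optimizer update, rather than once, is what would keep the iterates inside the constraint set; the training-loop sketch after the review below shows that placement.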
Reviews: Preventing Gradient Explosions in Gated Recurrent Units
Summary: The authors propose a method for optimizing GRU networks that aims to prevent exploding gradients. They motivate the method by showing that a constraint on the spectral norm of the state-to-state matrix keeps the dynamics of the network stable near the fixed point 0. The method is evaluated on language modelling and a music prediction task and leads to stable training in comparison to weight clipping.

Technical quality: The motivation of the method is well developed, and it is nice that the method is evaluated on two different real-world datasets. However, one important issue I have with the evaluation is that the learning rate is not controlled for in the experiments. Unfortunately, this makes it hard to draw strong conclusions from the results.
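To make the spectral-norm idea from the review concrete, here is a hypothetical training loop showing where such a projection step would sit relative to an ordinary gradient update. The model, data, loss, and the unit-norm threshold are all placeholder assumptions for illustration; the weight-clipping baseline the reviewer mentions would instead clamp the weight entries directly.

```python
# Hypothetical placement of a spectral-norm projection in a training
# loop. Model, data, and threshold are placeholders, not the paper's
# exact experimental settings.
import torch

gru = torch.nn.GRU(input_size=16, hidden_size=32)
head = torch.nn.Linear(32, 16)
opt = torch.optim.SGD(list(gru.parameters()) + list(head.parameters()),
                      lr=0.1)

x = torch.randn(20, 4, 16)  # (seq_len, batch, features), dummy data
y = torch.randn(20, 4, 16)

for _ in range(100):
    out, _ = gru(x)
    loss = torch.nn.functional.mse_loss(head(out), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Projection step: keep each stacked state-to-state block inside
    # the unit spectral-norm ball, the property the review says keeps
    # the dynamics stable near the fixed point 0.
    with torch.no_grad():
        for W in gru.weight_hh_l0.data.chunk(3, dim=0):
            sigma = torch.linalg.matrix_norm(W, ord=2)
            if sigma > 1.0:
                W.mul_(1.0 / sigma)
```

Note that this projection modifies the weights themselves, whereas gradient-norm clipping only rescales the update direction before the step is taken.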