Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks

Dey, Rahul, Salem, Fathi M.

arXiv.org Machine Learning 

From Table II and Figures 1 and 2, GRU1 and GRU2 perform almost as well as GRU0 on MNIST pixel-wise generated sequence inputs, whereas GRU3 does not perform as well at this (constant base) learning rate. Figure 3 shows that reducing the (constant base) learning rate to 0.0001 and below enables GRU3 to raise its test accuracy to 59.6% after 100 epochs, with a positive slope indicating that accuracy would increase further with more epochs. Note that in this experiment GRU3 has about 33% of the number of (adaptively computed) parameters of GRU0. Thus, there is a potential tradeoff between higher accuracy and a reduction in the number of parameters.
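
The "about 33%" figure can be checked with a rough parameter count. The sketch below is illustrative only. It assumes the variant definitions from the full paper, which this excerpt does not restate: GRU0 computes each gate from the input, the previous hidden state, and a bias; GRU1 from the previous hidden state and a bias; GRU2 from the previous hidden state only; GRU3 from the bias only. The hidden size of 100 and the input size of 1 (one pixel per time step) are also assumptions, and the function name `gru_param_counts` is hypothetical.

```python
def gru_param_counts(n_hidden, n_input):
    """Count trainable parameters for each GRU variant (assumed definitions)."""
    n, m = n_hidden, n_input

    # The candidate-state block is the same in every variant:
    # input weights (n*m) + recurrent weights (n*n) + bias (n).
    candidate = n * m + n * n + n

    # Parameters in ONE gate; each variant has two gates (update and reset).
    per_gate = {
        "GRU0": n * m + n * n + n,  # input + recurrent + bias
        "GRU1": n * n + n,          # recurrent + bias (assumed definition)
        "GRU2": n * n,              # recurrent only (assumed definition)
        "GRU3": n,                  # bias only (assumed definition)
    }
    return {name: candidate + 2 * g for name, g in per_gate.items()}


# Assumed sizes: 100 hidden units, 1 input feature per time step.
counts = gru_param_counts(n_hidden=100, n_input=1)
for name, total in counts.items():
    share = total / counts["GRU0"]
    print(f"{name}: {total} parameters ({share:.1%} of GRU0)")
```

Under these assumptions GRU3 keeps the full candidate-state block and drops nearly all gate parameters, so its count is close to one third of GRU0's for any reasonably large hidden size. GRU1 and GRU2 save very little when the input dimension is 1, because the input weight matrices they drop are small.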
