Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks
Table II and Figures 1 and 2 show that GRU1 and GRU2 perform almost as well as GRU0 on MNIST sequences generated pixel by pixel, whereas GRU3 does not perform as well at this (constant base) learning rate. Figure 3 shows that reducing the (constant base) learning rate to 0.0001 and below enables GRU3 to raise its test accuracy to 59.6% after 100 epochs, with a positive slope indicating that accuracy would increase further with more epochs. Note that in this experiment GRU3 has about 33% of the number of (adaptively computed) parameters of GRU0. Thus, there is a potential tradeoff between higher accuracy and a reduced number of parameters.
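The roughly 33% parameter ratio can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in this summary: GRU0 is taken to be a standard GRU with three weight blocks (update gate, reset gate, candidate state), GRU3 is assumed to keep only the bias in its two gates, and the hidden size of 100 with a one-dimensional (single pixel) input is a hypothetical choice. The function name `gru_param_counts` is likewise made up for this example.

```python
def gru_param_counts(n, m):
    """Count parameters for a standard GRU and an assumed bias-only-gate variant.

    n: number of hidden units, m: input dimension.
    """
    # One full block: recurrent weights (n*n) + input weights (n*m) + bias (n).
    block = n * n + n * m + n
    # Assumed GRU0: update gate, reset gate, and candidate state each use a full block.
    standard = 3 * block
    # Assumed GRU3: full candidate-state block, while each of the two gates keeps only a bias.
    bias_only_gates = block + 2 * n
    return standard, bias_only_gates


# Hypothetical sizes: 100 hidden units, one pixel per time step.
standard, reduced = gru_param_counts(n=100, m=1)
ratio = reduced / standard
print(standard, reduced, round(ratio, 2))
```

Under these assumed sizes the reduced variant holds about a third of the parameters of the standard unit, which is in line with the figure quoted above; other hidden sizes would shift the ratio slightly.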
Jan-20-2017