A Related Work

Neural Information Processing Systems 

We note that these results concern two of the most commonly used architectural modifications for RNNs. First, the gating mechanism is ubiquitous in RNNs and is usually regarded as a heuristic for smoothing optimization [28]. Second, many effective large-scale RNNs use linear (gated) recurrences and deeper models, which is usually regarded as a heuristic for computational efficiency [5]. Our results suggest that neither is merely a heuristic: both arise from standard ways to approximate ODEs. To be more specific, we show that:

Table 6: A summary of the characteristics of popular RNN methods and their approximation mechanisms for capturing the dynamics $\dot{x}(t) = -x(t) + f(t, x(t))$ (equation (14)). The LSSL entries are for the very specific case with order $N = 1$ and $A = -1$, $B = 1$, $C = 1$, $D = 0$; LSSLs are more general.
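The claim that gating arises from a standard ODE approximation can be checked numerically with a minimal sketch. It assumes the linear special case dx/dt = -x(t) + u(t) of the dynamics, a backward-Euler discretization, and a step size parameterized as dt = exp(z); the function names (`backward_euler_step`, `gated_step`, `max_gap`) are hypothetical and chosen only for illustration. Under these assumptions the backward-Euler update x_new = (x_prev + dt * u) / (1 + dt) should coincide with the sigmoid-gated recurrence x_new = (1 - sigmoid(z)) * x_prev + sigmoid(z) * u, because dt / (1 + dt) = sigmoid(z) when dt = exp(z).

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def backward_euler_step(x_prev: float, u: float, dt: float) -> float:
    """One backward-Euler step of dx/dt = -x + u (assumed linear case).

    Solves the implicit equation x_new = x_prev + dt * (-x_new + u).
    """
    return (x_prev + dt * u) / (1.0 + dt)


def gated_step(x_prev: float, u: float, z: float) -> float:
    """One step of a scalar gated recurrence with gate sigmoid(z)."""
    g = sigmoid(z)
    return (1.0 - g) * x_prev + g * u


def max_gap(inputs, zs, x0: float = 0.0) -> float:
    """Largest absolute difference between the two recurrences.

    Each step uses the step size dt = exp(z) for the ODE discretization
    and the gate sigmoid(z) for the gated recurrence.
    """
    x_ode = x_gate = x0
    gap = 0.0
    for u, z in zip(inputs, zs):
        x_ode = backward_euler_step(x_ode, u, math.exp(z))
        x_gate = gated_step(x_gate, u, z)
        gap = max(gap, abs(x_ode - x_gate))
    return gap


if __name__ == "__main__":
    # Arbitrary illustrative inputs and gate pre-activations.
    print(max_gap([1.0, -0.5, 2.0, 0.3], [0.0, -1.0, 1.5, 0.2]))
```

The two trajectories should agree up to floating-point rounding, which illustrates how a learned gate can be read as a learned step size in this scalar setting. The sketch does not cover the nonlinear term f(t, x(t)) or higher-order state.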
