DeepSITH: Efficient Learning via Decomposition of What and When Across Time Scales
Neural Information Processing Systems
After enough time has elapsed, the events that were presented close in time will gradually blend together, as illustrated in the bottom panel of Figure 1.

We used the parameters presented in that work for the experiment with the adding problem used here.

Table 1: Parameter values used for LSTM networks.

Table 2: Parameter values used for LMU networks.

"Coupled Oscillatory Recurrent Neural Network (coRNN): An Accurate and (Gradient) Stable Architecture for Learning Long Time Dependencies."
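The adding problem mentioned above is the standard long-range-dependency benchmark: each input is a two-channel sequence of random values and a marker channel flagging exactly two positions, and the target is the sum of the two marked values. A minimal data generator is sketched below; the function name `make_adding_problem` is illustrative, not from the paper.

```python
import numpy as np

def make_adding_problem(n_samples, seq_len, seed=None):
    """Generate adding-problem data.

    Returns x of shape (n_samples, seq_len, 2) — channel 0 holds uniform
    random values, channel 1 marks exactly two positions per sequence —
    and y of shape (n_samples,), the sum of the two marked values.
    """
    rng = np.random.default_rng(seed)
    values = rng.uniform(0.0, 1.0, size=(n_samples, seq_len))
    markers = np.zeros((n_samples, seq_len))
    for i in range(n_samples):
        # Pick two distinct positions to mark in this sequence.
        idx = rng.choice(seq_len, size=2, replace=False)
        markers[i, idx] = 1.0
    x = np.stack([values, markers], axis=-1)
    y = (values * markers).sum(axis=1)
    return x, y

x, y = make_adding_problem(n_samples=4, seq_len=100, seed=0)
```

Because the two marked positions can fall anywhere in the sequence, a model must retain information across the full sequence length, which is what makes this task a probe of long time dependencies.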