exponential explosion
Stabilizing RNN Gradients through Pre-training
Herranz-Celotti, Luca, Rouat, Jean
Numerous theories of learning propose to prevent the gradient from growing exponentially with depth or time, to stabilize and improve training. Typically, these analyses are conducted on feed-forward fully-connected neural networks or simple single-layer recurrent neural networks, given their mathematical tractability. In contrast, this study demonstrates that pre-training the network to local stability can be effective whenever the architectures are too complex for an analytical initialization. Furthermore, we extend known stability theories to encompass a broader family of deep recurrent networks, requiring minimal assumptions on data and parameter distribution, a theory we call the Local Stability Condition (LSC). Our investigation reveals that the classical Glorot, He, and Orthogonal initialization schemes satisfy the LSC when applied to feed-forward fully-connected neural networks. However, when analysing deep recurrent networks, we identify a new additive source of exponential explosion that emerges from counting gradient paths in a rectangular grid in depth and time. We propose a new approach to mitigate this issue, which consists of giving a weight of one half to the time and depth contributions to the gradient, instead of the classical weight of one. Our empirical results confirm that pre-training both feed-forward and recurrent networks to fulfill the LSC, for differentiable, neuromorphic and state-space models, often results in improved final performance. This study contributes to the field by providing a means to stabilize networks of any complexity. Our approach can be implemented as an additional step before pre-training on large augmented datasets, and as an alternative to finding stable initializations analytically.
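The path-counting argument in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation: it assumes a scalar model in which every depth or time transition contributes the same factor `weight` to the gradient, and the function name `weighted_path_sum` is hypothetical. Under that assumption, a weight of one makes the total equal to the number of monotone paths through the depth-by-time grid, a binomial coefficient that grows exponentially, while a weight of one half keeps the total at or below one.

```python
from math import comb


def weighted_path_sum(depth: int, time: int, weight: float) -> float:
    """Sum over all monotone paths in a depth x time grid.

    Toy assumption: each step (one layer down or one time step forward)
    multiplies the gradient contribution by the same scalar `weight`.
    """
    grid = [[0.0] * (time + 1) for _ in range(depth + 1)]
    grid[0][0] = 1.0
    for l in range(depth + 1):
        for t in range(time + 1):
            if l == 0 and t == 0:
                continue
            # Contributions arriving from the previous layer and the previous time step.
            from_depth = grid[l - 1][t] if l > 0 else 0.0
            from_time = grid[l][t - 1] if t > 0 else 0.0
            grid[l][t] = weight * (from_depth + from_time)
    return grid[depth][time]


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        unit = weighted_path_sum(n, n, 1.0)   # equals comb(2n, n)
        half = weighted_path_sum(n, n, 0.5)   # equals comb(2n, n) / 2**(2n)
        print(f"depth=time={n}: weight 1 -> {unit:.0f}, weight 1/2 -> {half:.4f}")
```

With a weight of one the sum matches `comb(depth + time, depth)`, so the growth comes purely from the number of paths even when each individual transition is norm-preserving. Halving the weight divides the total by `2 ** (depth + time)`, which is never smaller than that binomial coefficient.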
US will see an 'exponential explosion' in COVID-19 cases if it relaxes lockdown measures early
Ending the US coronavirus lockdown too early could lead to an explosion of new coronavirus cases, according to a study modelling the spread of the virus. Researchers from the Massachusetts Institute of Technology (MIT) created a model showing the spread of the deadly virus using publicly available data from Wuhan, Italy, South Korea and the USA. The authors say that any immediate or near-term relaxation of quarantine measures already in place in the US would lead to an 'exponential explosion' in COVID-19 cases. It comes as President Donald Trump announced a new three-phase plan to reopen the country that 'allows' governors to decide when their states should come out of lockdown measures. The plan provided only a general idea of how and when states would be able to reopen - shying away from specific details or a timeline. 'To preserve the health of our citizens we must also preserve the health and functioning of our economy,' said Trump.