 exponential explosion


Stabilizing RNN Gradients through Pre-training

arXiv.org Artificial Intelligence

Numerous theories of learning propose to prevent the gradient from growing exponentially with depth or time, in order to stabilize and improve training. Typically, these analyses are conducted on feed-forward fully-connected neural networks or simple single-layer recurrent neural networks, given their mathematical tractability. In contrast, this study demonstrates that pre-training the network to local stability can be effective whenever the architectures are too complex for an analytical initialization. Furthermore, we extend known stability theories to encompass a broader family of deep recurrent networks, requiring minimal assumptions on data and parameter distribution, a theory we call the Local Stability Condition (LSC). Our investigation reveals that the classical Glorot, He, and Orthogonal initialization schemes satisfy the LSC when applied to feed-forward fully-connected neural networks. However, when analysing deep recurrent networks, we identify a new additive source of exponential explosion that emerges from counting gradient paths in a rectangular grid in depth and time. We propose a new approach to mitigate this issue, which consists of giving a weight of a half to the time and depth contributions to the gradient, instead of the classical weight of one. Our empirical results confirm that pre-training both feed-forward and recurrent networks to fulfill the LSC, for differentiable, neuromorphic and state-space models, often results in improved final performance. This study contributes to the field by providing a means to stabilize networks of any complexity. Our approach can be implemented as an additional step before pre-training on large augmented datasets, and as an alternative to finding stable initializations analytically.
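The path-counting argument in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's method: it assumes a scalar stand-in in which every step along time or depth multiplies the gradient by one fixed weight, and it sums the contributions of all monotone paths across a time-by-depth grid. The function name `path_weight_sum` and its parameters are hypothetical, introduced only for this illustration.

```python
from math import comb


def path_weight_sum(steps_time: int, steps_depth: int, weight: float) -> float:
    """Sum, over all monotone grid paths, of the product of per-step weights.

    Toy assumption: each step in time or in depth scales the gradient by
    the same scalar `weight` (a stand-in for a transition Jacobian).
    """
    # grid[t][l] holds the accumulated weight reaching cell (t, l) from (0, 0).
    grid = [[0.0] * (steps_depth + 1) for _ in range(steps_time + 1)]
    grid[0][0] = 1.0
    for t in range(steps_time + 1):
        for l in range(steps_depth + 1):
            if t > 0:  # contribution arriving along the time axis
                grid[t][l] += weight * grid[t - 1][l]
            if l > 0:  # contribution arriving along the depth axis
                grid[t][l] += weight * grid[t][l - 1]
    return grid[steps_time][steps_depth]


if __name__ == "__main__":
    for n in (5, 10, 20):
        ones = path_weight_sum(n, n, 1.0)   # equals the number of paths, comb(2n, n)
        half = path_weight_sum(n, n, 0.5)   # equals comb(2n, n) / 2**(2n)
        print(n, ones, half)
```

With a weight of one, the sum equals the number of paths, `comb(T + L, T)`, which grows exponentially along the diagonal of the grid. With a weight of a half, each path contributes `0.5 ** (T + L)`, so the total stays at or below one in this simplified setting.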


US will see an 'exponential explosion' in COVID-19 cases if it relaxes lockdown measures early

Daily Mail - Science & tech

Ending the US coronavirus lockdown too early could lead to an explosion of new coronavirus cases, according to a study modelling the spread of the virus. Researchers from the Massachusetts Institute of Technology (MIT) created a model showing the spread of the deadly virus using publicly available data from Wuhan, Italy, South Korea and the USA. The authors say that any immediate or near-term relaxation of quarantine measures already in place in the US would lead to an 'exponential explosion' in COVID-19 cases. It comes as President Donald Trump announced a new three-phase plan to reopen the country that 'allows' governors to decide when their states should come out of lockdown measures. The plan provided only a general idea of how and when states would be able to reopen - shying away from specific details or a timeline. 'To preserve the health of our citizens we must also preserve the health and functioning of our economy,' said Trump.