Mono-Forward: Backpropagation-Free Algorithm for Efficient Neural Network Training Harnessing Local Errors

Gong, James, Li, Bruce, Abdulla, Waleed

arXiv.org Artificial Intelligence 

Backpropagation (BP) [1], while foundational to training neural networks [2], faces critical limitations from the perspectives of both deep learning and neuroscience, which motivates the exploration of alternative methods. Backward locking is one significant bottleneck inherent to BP: weight updates across the network must await the completion of both the forward and backward passes for each data batch, which hampers the distribution and parallelization of computation across the network [3-5]. A second limitation is that error gradients in BP can vary significantly in magnitude as they are propagated backward through the network's layers, leading to two prominent issues: vanishing and exploding gradients. Vanishing gradients occur when the gradients shrink to values so small that they fail to update the weights of earlier layers effectively, severely hindering the training of deep neural networks. Exploding gradients, conversely, cause disproportionately large weight updates that can destabilize the network.
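The effect of depth on gradient magnitude can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a stack of purely linear layers with random Gaussian weights, and the function name `backprop_grad_norms` and the parameters `scale`, `depth`, and `width` are hypothetical. The code propagates a unit-norm error signal backward through the stack and records its norm after each layer.

```python
import numpy as np


def backprop_grad_norms(scale, depth=50, width=64, seed=0):
    """Norm of the backpropagated error after each of `depth` linear layers.

    Illustrative assumption: every layer is y = W x with
    W ~ scale * N(0, 1/width), and there are no nonlinearities.
    """
    rng = np.random.default_rng(seed)
    grad = rng.standard_normal(width)
    grad /= np.linalg.norm(grad)  # unit-norm error signal at the output
    norms = []
    for _ in range(depth):
        W = scale * rng.standard_normal((width, width)) / np.sqrt(width)
        grad = W.T @ grad  # chain rule through a linear layer y = W x
        norms.append(float(np.linalg.norm(grad)))
    return norms  # norms[-1] is the gradient norm at the earliest layer


if __name__ == "__main__":
    for scale in (0.5, 1.0, 2.0):
        first_layer_norm = backprop_grad_norms(scale)[-1]
        print(f"scale={scale}: gradient norm at first layer = {first_layer_norm:.3e}")
```

Under these assumptions each layer multiplies the gradient norm by roughly `scale`, so a scale below 1 shrinks the signal geometrically with depth (vanishing) and a scale above 1 amplifies it (exploding). Real networks with nonlinearities and normalization behave less cleanly, but the same multiplicative mechanism applies.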