Direct Feedback Alignment with Sparse Connections for Local Learning
Crafton, Brian, Parihar, Abhinav, Gebhardt, Evan, Raychowdhury, Arijit
Recent advances in deep neural networks (DNNs) owe their success to training algorithms that use backpropagation and gradientdescent. Backpropagation, while highly effective on von Neumann architectures, becomes inefficient when scaling to large networks. Commonly referred to as the weight transport problem, each neuron's dependence on the weights and errors located deeper in the network require exhaustive data movement which presents a key problem in enhancing the performance and energy-efficiency of machine-learning hardware. In this work, we propose a bio-plausible alternative to backpropagation drawing from advances in feedback alignment algorithms in which the error computation at a single synapse reduces to the product of three scalar values, satisfying the three factor rule. Using a sparse feedback matrix, we show that a neuron needs only a fraction of the information previously used by the feedback alignment algorithms to yield results which are competitive with backpropagation. Consequently, memory and compute can be partitioned and distributed whichever way produces the most efficient forward pass so long as a single error can be delivered to each neuron. We evaluate our algorithm using standard data sets, including ImageNet, to address the concern of scaling to challenging problems. Our results show orders of magnitude improvement in data movement and 2 improvement in multiply-and-accumulate operations over backpropagation. All the code and results are available under https://github.com/bcrafton/ssdfa. I. INTRODUCTION The demise of Dennard scaling [11] and decline of Moores Law [27] have exposed the fundamental scaling limitations of the von Neumann models of computing. This transition is accompanied by the realization that in a fast evolving, socially interconnected world, we are observing a seismic shift in the amount of unstructured data that need to be processed in real-time [25] which has heralded the third wave of Artificial Intelligence and the exponential growth of Machine Learning in data-analytics, real-time control, computer vision, robotics and so on. We expect that intelligent systems of the future will be limited by the energy growth of data movement rather than compute.
Jan-30-2019