Stanford's TETRIS Clears Blocks for 3D Memory Based Deep Learning
The need for speed in processing neural networks is far less a matter of processor capability and much more a function of memory bandwidth. As compute capability rises, so too does the need to keep the chips fed with data, something that often requires going off chip to memory. That comes not only with a performance penalty but with an efficiency hit as well, which explains why so many efforts are being made either to speed the connection to off-chip memory or, more efficiently, to do as much in memory as possible. The advent of 3D, or stacked, memory opens new doors, especially for those with deep learning workloads. We have already talked about how memory is the next platform for machine learning, and have explored a number of architectures that seek to maximize on-chip memory by making it handle at least some of the compute via accumulation engines.
Mar-8-2017, 12:50:15 GMT