The research and development of neural networks is flourishing thanks to recent advancements in computational power, the discovery of new algorithms, and an increase in labelled data. Before the current explosion of activity in the space, the practical applications of neural networks were limited. Although much of the recent research has allowed for broad application, the heavy computational requirements of machine learning models still keep them from truly entering the mainstream. Now, emerging algorithms are on the cusp of pushing neural networks into more conventional applications through greatly increased efficiency. Neural networks are a prominent focus of current computer science research.
The rapid growth of data size and accessibility in recent years has instigated a shift of philosophy in algorithm design for artificial intelligence. Instead of engineering algorithms by hand, the ability to learn composable systems automatically from massive amounts of data has led to ground-breaking performance in important domains such as computer vision, speech recognition, and natural language processing. The most popular class of techniques used in these domains is called deep learning, and it is seeing significant attention from industry. However, these models require incredible amounts of data and compute power to train, and are limited by the need for better hardware acceleration to accommodate scaling beyond current data and model sizes. While the current solution has been to use clusters of graphics processing units (GPUs) as general-purpose processors (GPGPUs), field-programmable gate arrays (FPGAs) provide an interesting alternative. Current trends in design tools for FPGAs have made them more compatible with the high-level software practices typically used in the deep learning community, making FPGAs more accessible to those who build and deploy models. Since FPGA architectures are flexible, this could also allow researchers to explore model-level optimizations beyond what is possible on fixed architectures such as GPUs. In addition, FPGAs tend to provide high performance per watt of power consumption, which is of particular importance for application scientists interested in large-scale server-based deployment or resource-limited embedded applications. This review takes a look at deep learning and FPGAs from a hardware acceleration perspective, identifying trends and innovations that make these technologies a natural fit, and motivates a discussion on how FPGAs may best serve the needs of the deep learning community moving forward.
It is no secret that artificial intelligence (AI) and machine learning have advanced radically over the last decade, yet somewhere between better algorithms and faster processors lies the increasingly important task of engineering systems for maximum performance and better results. The problem for now, says Nidhi Chappell, director of machine learning in the Datacenter Group at Intel, is that "AI experts spend far too much time preprocessing code and data, iterating on models and parameters, waiting for training to converge, and experimenting with deployment models. Each step along the way is either too labor- and/or compute-intensive." The research and development community, spearheaded by companies such as Nvidia, Microsoft, Baidu, Google, Facebook, Amazon, and Intel, is now taking direct aim at the challenge. Teams are experimenting with, developing, and even implementing new chip designs, interconnects, and systems to boldly go where AI, deep learning, and machine learning have not gone before.