Datacenters, especially the really big players known as the Super 7 (Alibaba, Amazon, Baidu, Facebook, Google, Microsoft, and Tencent), are experiencing significant growth in key workloads that require more performance than can be squeezed out of even the fastest CPUs. Applications such as deep neural networks (DNNs) for artificial intelligence (AI), complex data analytics, 4K live-streaming video, and advanced networking and security features are increasingly being offloaded to super-fast accelerators that can deliver 10X or more the performance of a CPU. NVIDIA GPUs in particular have benefited enormously from the training portion of machine learning; the company reported 193% year-over-year growth last quarter in its datacenter segment, which is now approaching a $1B run-rate business. Microsoft has recently announced that field-programmable gate array (FPGA) accelerators have become pervasive in its datacenters. Soon after, it was announced that Baidu is using FPGAs to accelerate machine learning applied to speech processing and autonomous vehicles.
This article is a summary of a three-hour discussion at Stanford University in September 2019 among the authors. It has been written with combined experiences at and with organizations such as Zilog, Altera, Xilinx, Achronix, Intel, IBM, Stanford, MIT, Berkeley, University of Wisconsin, the Technion, Fairchild, Bell Labs, Bigstream, Google, DIGITAL (DEC), SUN, Nokia, SRI, Hitachi, Silicom, Maxeler Technologies, VMware, Xerox PARC, Cisco, and many others. These organizations are not responsible for the content, but may have inspired the authors in some ways to arrive at the colorful ride through FPGA space described here. Field-programmable gate arrays (FPGAs) have been hitting a nerve in the ASIC community since their inception. In the mid-1980s, Ross Freeman and his colleagues bought the technology from Zilog and started Xilinx, targeting the ASIC emulation and education markets.
FPGAs from Intel and Xilinx have been steadily carving out niches in datacenter applications where low power, high performance, and configurability may trump programming challenges. Xilinx and Amazon Web Services (AWS) have been working with solution providers to create shrink-wrapped applications and tools that use AWS F1 FPGA instances, and Microsoft has recently announced some pretty stellar results in its Project BrainWave AI program using Intel (Altera) FPGAs. Almost a year ago I covered the initial AWS offerings. At the time, I felt that AWS needed to go from 3 solutions to 30 to convince me and the market that there is real demand, and then to 100 to have a material impact on the market. I recently noticed that AWS is now at 20 Amazon Machine Images (AMIs), so it seemed like a good time to check back in.
Microsoft has been using field-programmable gate arrays (FPGAs) to improve the performance and efficiency of Bing and Azure for the last few years. But next year, Microsoft plans to make this kind of FPGA processing power available to developers, who will be able to use it to run their own tasks, including intensive artificial-intelligence ones such as deep neural networks (DNNs). At its Build developers conference this spring, Azure CTO Mark Russinovich outlined Microsoft's big-picture plans for delivering "Hardware Microservices" via the Azure cloud. Russinovich told attendees that once Microsoft solves some lingering security and other issues, "we will have what we consider to be a fully configurable cloud." "This is the core of an AI cloud," Russinovich said, and "a major step toward democratizing AI with the power of FPGA." (A good recap of Russinovich's remarks can be found in this TheNewStack article.)
When it comes to speeding up computationally intensive workloads, GPUs are not the only game in town. FPGAs (field-programmable gate arrays) are also gaining traction in data centers. These programmable logic devices, which can be reconfigured "in the field" for different tasks after manufacturing, have long been used in telecom gear, industrial systems, automotive, and military and aerospace applications.
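To give a feel for what "reconfigured in the field" means, here is a minimal conceptual sketch in Python (not real hardware or any vendor's toolchain): FPGAs build logic out of small lookup tables (LUTs), and reprogramming the device essentially amounts to loading new truth tables into those LUTs. The `make_lut` helper and the truth tables below are illustrative assumptions, not part of any FPGA API.

```python
# Conceptual model of an FPGA lookup table (LUT): a k-input LUT stores
# 2**k configuration bits, and "reconfiguring" the chip means loading a
# different truth table -- the wiring stays the same, the function changes.

def make_lut(truth_table):
    """Return a boolean function defined entirely by its truth table."""
    def lut(*inputs):
        index = 0
        for bit in inputs:            # pack the input bits into a row index
            index = (index << 1) | int(bit)
        return truth_table[index]
    return lut

# "Configure" a 2-input LUT as AND, then reconfigure the same LUT as XOR.
AND = make_lut([0, 0, 0, 1])
XOR = make_lut([0, 1, 1, 0])

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert XOR(1, 0) == 1 and XOR(1, 1) == 0
```

In a real device, thousands to millions of such LUTs plus programmable routing are configured at once from a bitstream, which is what lets the same silicon serve telecom one day and machine-learning inference the next.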