NVIDIA's meteoric growth in the datacenter, where its business now generates some $1.6B annually, has been driven largely by demand for training deep neural networks for Machine Learning (ML) and Artificial Intelligence (AI), an area where the computational requirements are simply mind-boggling. Much of this business comes from the largest datacenter operators in the US, including Amazon, Google, Facebook, IBM, and Microsoft. Recently, at its annual Beijing GTC event, NVIDIA announced new technology and customer initiatives to help drive revenue in the inference market for Machine Learning and to solidify the company's position in the huge Chinese AI market. For those unfamiliar, inference is the stage where a trained neural network is used to classify sample data and make predictions from it. The inference market will likely end up larger than the training market in terms of chip unit volumes; after all, once you train a neural network, you probably intend to use it, and use it a lot.
At this year's GPU Technology Conference (GTC), Nvidia's premier conference for technical computing with graphics processors, the company reserved the top keynote for its CEO, Jensen Huang. Over the years, GTC has grown from a segment in a larger, mostly gaming-oriented and somewhat scattershot conference called "nVision" into one of the key conferences mixing academic and commercial high-performance computing. Jensen's message was that GPU-accelerated machine learning is growing to touch every aspect of computing. While neural nets are becoming easier to use, the technology still has a way to go before it reaches a broader audience. It's a hard problem, but Nvidia likes to tackle hard problems.
I started out writing a single blog post on the coming year's expected AI chips and how NVIDIA might respond to the challenges, but I quickly realized it was going to be much longer than expected. Since there is so much ground to cover, I've decided to structure this as three, hopefully more consumable, articles. I've included links to previous missives for those wanting to dig a little deeper. In the last five years, NVIDIA grew its data center business into a multi-billion-dollar juggernaut without once facing a credible competitor. To my recollection, that is an amazing fact, and one unparalleled in today's technology world.
Nvidia on Thursday announced a bevy of new products and company updates in a virtual GPU Technology Conference (GTC) keynote address from founder and CEO Jensen Huang. Key updates include the launch of the A100, Nvidia's 8th-generation GPU design and its first based on the Ampere architecture. Nvidia said the A100 represents the biggest generational leap ever for one of its GPUs. Designed for data centers, the multi-instance GPU is optimized for HPC and inference; Nvidia claims speed improvements of up to 20x over Volta, and the chip packs more than 54 billion transistors and third-generation Tensor Cores. With the A100, Nvidia said, machines will be capable of processing massive amounts of data very quickly, and servers will be more flexible.
Embedded AI can transform a tabletop speaker into a personal assistant, give a robot brains and dexterity, and turn a smartphone into a smart camera, music player, or game console. Traditional processors, however, lack the computational power to support many of these intelligent features. Chipmakers, startups, and investors are seizing this market opportunity. According to a Gartner report, the chip market's total revenue hit US$400 billion in 2017 and is expected to exceed US$459 billion in 2018. Traditional chipmakers are putting an increasing focus on AI chip development, venture capital is pumping significant investment into the market, and AI chip startups are emerging.