NVIDIA's meteoric growth in the datacenter, where its business is now generating some $1.6B annually, has been largely driven by the demand to train deep neural networks for Machine Learning (ML) and Artificial Intelligence (AI) -- an area where the computational requirements are simply mind-boggling. First, and perhaps most importantly, Huang announced new TensorRT3 software that optimizes trained neural networks for inference processing on NVIDIA GPUs. In addition to the TensorRT3 deployments, Huang announced that the largest Chinese Cloud Service Providers -- Alibaba, Baidu, and Tencent -- are all offering the company's newest Tesla V100 GPUs to their customers for scientific and deep learning applications. Beyond those Chinese deployment wins, Huang provided some compelling benchmarks to demonstrate the company's prowess in accelerating Machine Learning inference, both in the datacenter and at the edge.
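Inference optimizers like TensorRT speed up deployment by, among other things, fusing layers and running networks at reduced precision such as INT8 on supported GPUs. As a rough illustration of the reduced-precision idea -- not TensorRT's actual API; the function name and sample weights below are my own -- here is a minimal sketch of symmetric INT8 weight quantization:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric INT8 quantization: map float weights onto [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0        # one scale factor per tensor
    q = np.round(w / scale).astype(np.int8)  # integer weights for fast math
    return q, scale

w = np.array([0.5, -1.27, 0.02, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale         # dequantized approximation of w
```

The int8 weights take a quarter of the memory of float32 and map onto fast integer math units; the dequantized values stay close to the originals, which is why inference accuracy typically survives the conversion.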
The story behind the story: a finely tuned generative adversarial network (GAN) that sampled 8,000 great works of art -- a tiny sample size in the data-intensive world of deep learning -- and in just 14 hours of training on an NVIDIA DGX system created an application that takes human input and turns it into something stunning. Building on thousands of hours of research undertaken by Cambridge Consultants' AI research lab, the Digital Greenhouse, a team of five built the Vincent demo in just two months. After Huang's keynote, GTC attendees had the opportunity to pick up the stylus for themselves, selecting from one of seven different styles to sketch everything from portraits to landscapes to, of course, cats. While traditional deep learning algorithms have achieved stunning results by ingesting vast quantities of data, GANs get by with much smaller sample sizes by training one neural network (the generator) to imitate the data it's fed, and another (the discriminator) to spot the fakes.
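That two-network tug-of-war can be sketched end to end in a few dozen lines. The toy below -- pure NumPy, every name and number my own, and nothing like the network behind Vincent -- pits a one-parameter-pair linear generator against a logistic-regression discriminator on 1-D data, just to make the alternating training loop concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
REAL_MEAN, REAL_STD = 4.0, 1.25      # the distribution the generator must imitate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_gan(steps=4000, batch=64, lr=0.02):
    a, b = 1.0, 0.0                  # generator: g(z) = a*z + b
    w, c = 0.0, 0.0                  # discriminator: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        real = rng.normal(REAL_MEAN, REAL_STD, batch)
        z = rng.normal(0.0, 1.0, batch)
        fake = a * z + b

        # Discriminator step: ascend log D(real) + log(1 - D(fake))
        d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
        w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
        c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

        # Generator step: ascend log D(fake) (the non-saturating GAN loss)
        d_fake = sigmoid(w * fake + c)
        dx = (1 - d_fake) * w        # gradient w.r.t. each fake sample
        a += lr * np.mean(dx * z)
        b += lr * np.mean(dx)
    return a, b

a, b = train_gan()
samples = a * rng.normal(0.0, 1.0, 10_000) + b   # generator's learned distribution
</n```

After training, the generator's samples cluster near the real data's mean even though it never sees the data directly -- only the discriminator's opinion of its fakes, which is the trick that lets GANs stretch small datasets.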
Nvidia wants to make it easier for automotive companies to build self-driving cars, so it's releasing a brand new supercomputer designed to drive them. The chipmaker claims its new supercomputer is the world's first artificial intelligence computer designed for "Level 5" autonomy, meaning vehicles that can operate themselves without any human intervention. The new computer will be part of Nvidia's existing Drive PX platform, which the GPU maker offers to automotive companies to provide the processing power for their self-driving car systems. Huang also announced that Nvidia will soon release a new software development kit (SDK), Drive IX, that will help developers build AI-partner programs to improve the in-car experience.
By the middle of 2018, Nvidia believes it will have a system capable of Level 5 autonomy in the hands of the auto industry, enabling fully self-driving vehicles. Pegasus is rated at 320 trillion operations per second, which the company claims is a thirteen-fold increase over the previous generation. In May, Nvidia took the wraps off its Tesla V100 accelerator aimed at deep learning. The company said the V100 has 1.5 times the general-purpose FLOPS of Pascal, a 12-fold improvement for deep learning training, and six times the performance for deep learning inference.
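A quick sanity check on that claim (my arithmetic, not an NVIDIA figure): 320 TOPS at a thirteen-fold speedup implies a previous generation of roughly 24.6 TOPS, which lines up with the widely quoted ~24 deep learning TOPS of the earlier Drive PX 2:

```python
pegasus_tops = 320                  # Drive PX Pegasus, trillions of ops/second
claimed_speedup = 13
prior_gen_tops = pegasus_tops / claimed_speedup
print(round(prior_gen_tops, 1))     # about 24.6 TOPS, close to Drive PX 2's ~24
```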
And although the pace may have slowed, the number of transistors that can fit per square inch did continue to increase, doubling not every year but roughly every 18 months. Dennard scaling -- named after Robert H. Dennard, who co-authored the concept -- states that even as transistors become smaller, power density remains constant: a transistor's power consumption stays proportional to its area. On NVIDIA's end, Huang assures that the company's venture into artificial intelligence and deep learning will keep it ahead even with the death of Moore's Law. That's not to say, though, that NVIDIA will stop making its GPUs more powerful.
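Both scaling rules can be checked with a few lines of arithmetic. Under classic Dennard scaling, shrinking every linear dimension by a factor k cuts area, voltage, and current in lockstep, so power per unit area stays put; the 18-month doubling is plain exponential growth. A minimal sketch (my own toy model of the textbook scaling rules):

```python
def dennard_power_density(k):
    """Power density after shrinking linear dimensions by k (>1)."""
    area = 1.0 / k**2          # transistor area falls as 1/k^2
    voltage = 1.0 / k          # supply voltage scales down with feature size
    current = 1.0 / k          # so does drive current
    power = voltage * current  # P = V * I, falls as 1/k^2
    return power / area        # power density stays constant (= 1.0)

def transistor_count(years, start=1.0):
    """Moore's-Law-style growth with a doubling every 18 months."""
    return start * 2 ** (years * 12 / 18)

print(dennard_power_density(2.0))  # 1.0 -- density unchanged by the shrink
print(transistor_count(3))         # 4.0 -- two doublings in three years
```

The end of Dennard scaling is exactly the breakdown of that constant-density guarantee: voltages stopped falling with feature size, so denser chips now run hotter per square millimeter.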
These simulators, most recently announced by Nvidia as a project called Isaac's Lab but also pioneered by Alphabet's DeepMind and Elon Musk's OpenAI, are 3D spaces that have physics just like reality, with virtual objects that act the same way as their physical counterparts. "We imagine that one of these days, we'll be able to go into the Holodeck, design a product, design the factory that's going to make the product, and design the robots that's going to make the factory that makes the products," Huang said. Alphabet's DeepMind has had similar ideas: the AI research lab is best known for applying its AI to games -- notably AlphaGo, which continues to beat human world champions at Go -- but it has also built AI that beats video games like Atari titles and StarCraft. While Nvidia's Isaac's Lab is meant to help build robots and products that do specific tasks in the real world, DeepMind's Lab is geared more towards research: finding ways to build AI that can learn about its surroundings with little input.
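The training loop inside such a simulator boils down to trial and error: act, observe a reward, update, repeat. The toy below -- tabular Q-learning on a six-cell corridor, entirely my own illustrative stand-in for a physics simulator -- shows that loop in miniature:

```python
import numpy as np

rng = np.random.default_rng(42)
N_STATES, GOAL = 6, 5          # toy 1-D corridor; reaching cell 5 pays reward 1
ACTIONS = (-1, +1)             # action 0 = step left, action 1 = step right
Q = np.zeros((N_STATES, 2))    # value table: Q[state, action]
alpha, gamma, eps = 0.5, 0.9, 0.2

for _ in range(500):           # 500 "simulated" episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action choice, breaking Q-value ties randomly
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(2))
        else:
            a = int(np.argmax(Q[s]))
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        # one-step Q-learning update toward reward plus discounted future value
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

policy = np.argmax(Q, axis=1)  # greedy action per state after training
```

After a few hundred simulated episodes the greedy policy walks straight to the goal from every cell. The appeal of virtual worlds is exactly this: millions of such trial-and-error episodes cost nothing and break nothing, unlike crashes with a physical robot.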
The other components of the strategy revolve around showcasing cutting-edge AI applications across sectors, building partnerships with other companies, nurturing technology start-ups, and helping build an AI ecosystem. TensorFlow is Google's deep learning framework, while CUDA (Compute Unified Device Architecture) is a free parallel computing platform provided by Nvidia that enables users to program GPUs. Further, public cloud service providers such as Alibaba Group Holdings Ltd, Amazon Web Services, Baidu Inc., Facebook, Google, IBM, Microsoft and Tencent Holdings Ltd use Nvidia GPUs in their data centres, prompting Nvidia to launch its GPU Cloud platform, which integrates deep learning frameworks, software libraries, drivers and the operating system. Nvidia also worked with SAP SE to develop a product called Brand Impact -- a fully automated and scalable video analytics service for brands, media agencies and media production companies.
The new chip has 21 billion transistors, and it is an order of magnitude more powerful than the 15-billion-transistor Pascal-based processor that Nvidia announced a year ago; Volta delivers 12 times the Tensor FLOPS for deep learning training compared to that Pascal part. Huang noted that deep learning neural network research started to pay off about five years ago, when researchers began using graphics processing units (GPUs) to process data in parallel and train neural networks quickly. And this year, Nvidia plans to train 100,000 developers to use deep learning.
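Those Tensor FLOPS come from Volta's new Tensor Cores, each of which performs a small matrix multiply-accumulate with half-precision (FP16) inputs and single-precision (FP32) accumulation. The NumPy snippet below merely emulates one such D = A×B + C operation on the CPU to show the data types involved -- it is an illustration of the arithmetic, not CUDA code:

```python
import numpy as np

rng = np.random.default_rng(7)
# One Tensor Core-style op: D = A @ B + C with 4x4 half-precision inputs
A = rng.random((4, 4)).astype(np.float16)
B = rng.random((4, 4)).astype(np.float16)
C = np.zeros((4, 4), dtype=np.float32)
# Promote to float32 for the multiply-accumulate, as the hardware accumulates
D = A.astype(np.float32) @ B.astype(np.float32) + C
```

Accumulating in FP32 is what keeps long chains of FP16 products from drifting -- the reason mixed-precision training can match full-precision accuracy while moving half the data.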
One data center provider that specializes in hosting infrastructure for Deep Learning told us most of their customers hadn't yet deployed their AI applications in production. If your on-premises Deep Learning infrastructure will do a lot of training -- the computationally intensive work of teaching neural networks things like speech and image recognition -- prepare for power-hungry servers with lots of GPUs on every motherboard. Inferencing servers are not particularly difficult to handle on-premises, but one big question for the data center manager is how close they have to be to where input data originates. If your corporate data centers are in Ashburn, Virginia, but your Machine Learning application has to provide real-time suggestions to users in Dallas or Portland, chances are you'll need some inferencing servers in or near Dallas and Portland to make it actually feel real-time.
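A back-of-the-envelope number shows why. Light in optical fiber travels at roughly 200,000 km/s, and Ashburn to Dallas is on the order of 1,900 km great-circle (my estimate), so physics alone puts a floor under the round trip before any routing or queuing delay:

```python
def fiber_rtt_ms(distance_km, fiber_speed_km_s=200_000):
    """Lower-bound round-trip time over a straight fiber run, in ms."""
    return 2 * distance_km / fiber_speed_km_s * 1000

ashburn_dallas = fiber_rtt_ms(1900)   # ~19 ms floor; real paths add more
print(round(ashburn_dallas, 1))
```

Real routes are rarely straight, so 30-40 ms round trips are more realistic -- a noticeable bite out of a ~100 ms "feels instant" budget once model execution time is added, which is what pushes inference servers toward the users.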
"We invented a computing model called GPU accelerated computing and we introduced it almost slightly over 10 years ago," Huang said, noting that while AI is only recently dominating tech news headlines, the company was working on the foundation long before that. Nvidia's tech now resides in many of the world's most powerful supercomputers, and the applications include fields that were once considered beyond the realm of modern computing capabilities. Now, Nvidia's graphics hardware occupies a more pivotal role, according to Huang – and the company's long list of high-profile partners, including Microsoft, Facebook and others, bears him out. GTC, in other words, has evolved into arguably the biggest developer event focused on artificial intelligence in the world.