Nvidia's GPU-powered platform for developing and running conversational AI that understands and responds to natural language requests has hit key milestones and broken records, with big implications for anyone building on the company's tech -- which includes companies large and small, as much of the code used to achieve these advancements is open source, written in PyTorch and easy to run. The biggest achievements Nvidia announced today include breaking the hour mark in training BERT, one of the world's most advanced AI language models and a state-of-the-art model widely considered a good standard for natural language processing. Nvidia's AI platform trained the model in a record-breaking 53 minutes, and the trained model could then successfully run inference. Nvidia's breakthroughs aren't just cause for bragging rights -- these advances scale and provide real-world benefits for anyone working with its NLP conversational AI and GPU hardware. Nvidia achieved its record-setting training time on one of its SuperPOD systems, which is made up of 92 Nvidia DGX-2H systems running 1,472 V100 GPUs, and managed the inference on Nvidia T4 GPUs running Nvidia TensorRT -- which beat the performance of even highly optimized CPUs by many orders of magnitude.
The GPU maker says its AI platform now has the fastest training record, the fastest inference, and largest training model of its kind to date. Nvidia is touting advancements to its artificial intelligence (AI) technology for language understanding that it said set new performance records for conversational AI. By adding key optimizations to its AI platform and GPUs, Nvidia is aiming to become the premier provider of conversational AI services, which it says have been limited up to this point by a broad inability to deploy large AI models in real time. Unlike the much simpler transactional AI, conversational AI uses context and nuance, and its responses are instantaneous, explained Nvidia's vice president of applied deep learning research, Bryan Catanzaro, in a press briefing.
Nvidia Corp. is upping its artificial intelligence game with the release of a new version of its TensorRT software platform for high-performance deep learning inference. TensorRT is a platform that combines a high-performance deep learning inference optimizer with a runtime that delivers low-latency, high-throughput inference for AI applications. Inference is an important aspect of AI. Whereas AI training relates to the development of an algorithm's ability to understand a data set, inference refers to its ability to act on that data to infer answers to specific queries. The latest version brings with it some dramatic improvements on the performance side.
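The training/inference distinction described above can be made concrete with a toy sketch. The Python below is purely illustrative and has nothing to do with TensorRT or Nvidia's code: the `train` and `infer` functions, the sample data, and the learning rate are all assumptions invented for this example.

```python
# Toy illustration of training vs. inference (hypothetical; not Nvidia's code or TensorRT).

def train(samples, epochs=2000, lr=0.05):
    """Training: fit a weight w and bias b to (x, y) pairs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(model, x):
    """Inference: apply the already-learned parameters to a new input."""
    w, b = model
    return w * x + b


# Made-up training data that follows y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

model = train(data)              # expensive step, done once
prediction = infer(model, 10.0)  # cheap step, repeated for every query
print(round(prediction, 3))
```

Training loops over the data many times to adjust the parameters, while inference is a single cheap evaluation with those parameters held fixed. That is why the two workloads are optimized separately.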
NVIDIA just announced the Jetson TX2 embedded AI supercomputer, based on the latest NVIDIA Pascal microarchitecture. It promises to offer twice the performance of the previous-generation Jetson TX1, in the same package. In this tech report, we will share with you the full details of the new Pascal-based NVIDIA Jetson TX2! Artificial intelligence is the new frontier in GPU compute technology. Whether the work is training or inference, AI research has benefited greatly from the massive amounts of compute power in modern GPUs. The market is led by NVIDIA with its Tesla accelerators, which run on the company's proprietary CUDA platform.
NVIDIA's meteoric growth in the datacenter, where its business is now generating some $1.6B annually, has been largely driven by the demand to train deep neural networks for Machine Learning (ML) and Artificial Intelligence (AI) -- an area where the computational requirements are simply mind-boggling. Much of this business is coming from the largest datacenters in the US, including Amazon, Google, Facebook, IBM, and Microsoft. Recently, NVIDIA announced new technology and customer initiatives at its annual Beijing GTC event to help drive revenue in the inference market for Machine Learning, as well as solidify the company's position in the huge Chinese AI market. For those unfamiliar, inference is where the trained neural network is used to predict and classify sample data. It is likely that the inference market will eventually be larger, in terms of chip unit volumes, than the training market; after all, once you train a neural network, you probably intend to use it, and use it a lot.