NVIDIA's meteoric growth in the datacenter, where its business is now generating some $1.6B annually, has been largely driven by the demand to train deep neural networks for Machine Learning (ML) and Artificial Intelligence (AI)--an area where the computational requirements are simply mind-boggling. First, and perhaps most importantly, Huang announced new TensorRT3 software that optimizes trained neural networks for inference processing on NVIDIA GPUs. In addition to the TensorRT3 deployments, Huang announced that the largest Chinese Cloud Service Providers, Alibaba, Baidu, and Tencent, are all offering the company's newest Tesla V100 GPUs to their customers for scientific and deep learning applications. In addition to announcing the Chinese deployment wins, Huang provided some pretty compelling benchmarks to demonstrate the company's prowess in accelerating Machine Learning inference operations, in the datacenter and at the edge.
Founded by Massachusetts General Hospital and later joined by Brigham & Women's Hospital, CCDS today announced it has received what it calls a purpose-built AI supercomputer from the portfolio of Nvidia DGX systems with Volta, said by Nvidia to be the biggest GPU on the market. Later this month, CCDS will also receive a DGX Station, which Nvidia calls "a personal AI supercomputer," that the organization will use to develop new training algorithms "and bring the power of AI directly to doctors" in the form of a desk-side system. With 640 Tensor Cores (8 per SM), the Tesla V100 delivers 120 teraflops of deep learning performance, providing 6-12 times higher peak teraflops for Tensor operations compared with previous-generation silicon, according to Nvidia. Nvidia said the new DGX-1 with Volta delivers AI computing performance three times faster than the prior DGX generation, providing the performance of up to 800 CPUs in a single system.
On Tuesday, NVIDIA unveiled the world's first artificial intelligence (AI) computer designed to drive fully autonomous vehicles by mid-2018. The new system, named Pegasus, extends the NVIDIA Drive PX AI computing platform to operate vehicles with Level 5 autonomy--without steering wheels, pedals, or mirrors. New types of cars will be invented, resembling offices, living rooms or hotel rooms on wheels. "The company hasn't claimed to have developed all the software, hardware, and data needed for automated driving; it's merely announced that it plans to market a chip that in theory could support the hardware and software envisioned for such a system," Walker Smith said.
September 28, 2017 -- Cirrascale Cloud Services, a premier provider of multi-GPU deep learning cloud solutions, today announced it will begin offering NVIDIA Tesla V100 GPU accelerators as part of its dedicated, multi-GPU deep learning cloud service offerings. The Tesla V100 specifications are impressive with 16GB of HBM2 stacked memory, 5,120 CUDA cores and 640 Tensor Cores, providing 7.8 TFlops double-precision performance, 15.7 TFlops single-precision performance, and 125 TFlops mixed-precision deep learning performance. "Deploying the new NVIDIA Tesla V100 GPU accelerators within the Cirrascale Cloud Services platform will enable their customers to accelerate deep learning and HPC applications using the world's most advanced data center GPUs." To learn more about Cirrascale Cloud Services and its unique dedicated, multi-GPU cloud solutions, please visit http://www.cirrascale.cloud.
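The peak figures quoted for the Tesla V100 can be compared with a little arithmetic. The sketch below is only illustrative: the dictionary and function names are made up for this example, and the ratios are quotients of vendor-quoted peak rates, not measured application speedups.

```python
# Peak throughput figures quoted in the announcement above (TFLOPS).
# The names used here are illustrative, not part of any NVIDIA API.
V100_PEAK_TFLOPS = {
    "double": 7.8,          # double precision
    "single": 15.7,         # single precision
    "mixed_tensor": 125.0,  # mixed precision on Tensor Cores
}


def peak_ratio(faster: str, slower: str) -> float:
    """Ratio of two quoted peak rates; a paper figure, not a benchmark."""
    return V100_PEAK_TFLOPS[faster] / V100_PEAK_TFLOPS[slower]


single_vs_double = peak_ratio("single", "double")
tensor_vs_single = peak_ratio("mixed_tensor", "single")

print(f"single vs double precision: {single_vs_double:.2f}x")
print(f"Tensor Core mixed vs single precision: {tensor_vs_single:.2f}x")
```

On these quoted numbers, single precision is roughly twice the double-precision rate, and the Tensor Core mixed-precision rate is roughly eight times the single-precision rate.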
I immediately pulled a container and started work on a CNTK NCCL project, and the next day pulled another container to work on a TF biomedical project. By running Nvidia OptiX 5.0 on a DGX Station, content creators can significantly accelerate training, inference and rendering (meaning both AI and graphics tasks). Flexibility to do AI work at the desk, data center, or edge. The Fastest Personal Supercomputer for Researchers and Data Scientists: www.nvidia.com/dgx-station. "However, for our current projects we need a compute server that we have exclusive access to."
Running OptiX 5.0 on the NVIDIA DGX Station -- the company's recently introduced deskside AI workstation -- will give designers, artists and other content-creation professionals the rendering capability of 150 standard CPU-based servers. OptiX 5.0's new ray tracing capabilities will speed up the process required to visualize designs or characters, dramatically increasing a creative professional's ability to interact with their content. By running NVIDIA OptiX 5.0 on a DGX Station, content creators can significantly accelerate training, inference and rendering. To achieve equivalent rendering performance of a DGX Station, content creators would need access to a render farm with more than 150 servers that require some 200 kilowatts of power, compared with 1.5 kilowatts for a DGX Station.
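The power comparison in the OptiX announcement reduces to two divisions. The sketch below uses only the figures quoted there (150 servers, some 200 kilowatts, 1.5 kilowatts); the variable names are illustrative, and because the source says "more than 150 servers", the per-server figure is an upper estimate.

```python
# Figures quoted in the OptiX 5.0 announcement above.
RENDER_FARM_KW = 200.0      # "some 200 kilowatts" for the CPU render farm
RENDER_FARM_SERVERS = 150   # "more than 150 servers", so a lower bound
DGX_STATION_KW = 1.5        # quoted power draw of one DGX Station

# How many times more power the render farm draws for equivalent rendering.
power_ratio = RENDER_FARM_KW / DGX_STATION_KW

# Implied average draw per CPU server (upper estimate, see note above).
kw_per_server = RENDER_FARM_KW / RENDER_FARM_SERVERS

print(f"render farm draws about {power_ratio:.0f}x the power of a DGX Station")
print(f"implied draw per CPU server: about {kw_per_server:.2f} kW")
```

On the quoted numbers the render farm draws roughly 133 times the power, which works out to about 1.3 kW per CPU server.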
With a good, solid GPU, one can quickly iterate over deep learning networks, and run experiments in days instead of months, hours instead of days, minutes instead of hours. Later I ventured further down the road and I developed a new 8-bit compression technique which enables you to parallelize dense or fully connected layers much more efficiently with model parallelism compared to 32-bit methods. For example, if you have differently sized fully connected layers or dropout layers, the Xeon Phi is slower than the CPU. GPUs excel at problems that involve large amounts of memory due to their memory bandwidth.
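The excerpt does not describe how the 8-bit compression technique works, so the sketch below is not that method. It shows plain linear quantization with one shared scale factor, as an assumed stand-in, to illustrate why sending 8-bit codes instead of 32-bit floats cuts the data exchanged between devices by a factor of four. The function names and sample values are hypothetical.

```python
from typing import List, Tuple


def quantize_8bit(values: List[float]) -> Tuple[bytes, float]:
    """Map floats to one byte each using a single shared scale factor.

    Generic linear quantization, used here as an illustrative stand-in;
    it is not the compression technique mentioned in the text.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step in [-127, 127], then shift by 128 so that
    # every code fits in one unsigned byte.
    codes = bytes(int(round(v / scale)) + 128 for v in values)
    return codes, scale


def dequantize_8bit(codes: bytes, scale: float) -> List[float]:
    """Recover approximate floats from the byte codes."""
    return [(c - 128) * scale for c in codes]


# Hypothetical values standing in for data exchanged between devices.
gradients = [0.5, -0.25, 0.0, 0.125]
codes, scale = quantize_8bit(gradients)
restored = dequantize_8bit(codes, scale)

bytes_as_float32 = 4 * len(gradients)  # 32-bit floats use 4 bytes each
print(f"{bytes_as_float32} bytes as float32 vs {len(codes)} bytes as 8-bit codes")
print("max round-trip error:", max(abs(a - b) for a, b in zip(gradients, restored)))
```

The trade-off is precision: each value is recovered only to within half a quantization step, so a real scheme has to keep that error small enough not to hurt training.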
It's great to see the two leading teams in AI computing race while we collaborate deeply across the board – tuning TensorFlow performance, and accelerating the Google cloud with NVIDIA CUDA GPUs. Dennard scaling, whereby reducing transistor size and voltage allowed designers to increase transistor density and speed while maintaining power density, is now limited by device physics. Such leaps in performance have drawn innovators from every industry, with the number of startups building GPU-driven AI services growing more than 4x over the past year to 1,300. Just as convolutional neural networks gave us the computer vision breakthrough needed to tackle self-driving cars, reinforcement learning and imitation learning may be the breakthroughs we need to tackle robotics.
Intel's Nervana-based chip will likely clock in at 30 teraflops by mid-2017. In late-breaking news, AMD has revealed its new AMD Instinct line of Deep Learning accelerators. In the old days, we had monolithic DL systems with single analytic objective functions. Deep Learning systems and Unsupervised Learning systems are likely these new kinds of things that we have never encountered before.