Depending on your point of view, the last two years have either gone by very slowly, or very quickly. While the COVID pandemic never seemed to end – and technically still hasn't – the last two years have whizzed by for the tech industry, and especially for NVIDIA. The company launched its Ampere GPU architecture just two years ago at GTC 2020, and after selling more of its chips than ever before, now in 2022 it's already time to introduce the next architecture. So without further ado, let's talk about the Hopper architecture, which will underpin the next generation of NVIDIA server GPUs. As has become a ritual now for NVIDIA, the company is using its Spring GTC event to launch its next generation GPU architecture. Introduced just two years ago, Ampere has been NVIDIA's most successful server GPU architecture to date, with over $10B in data center sales in just the last year.
Nvidia's latest game-ready driver includes a tool that could let you improve the image quality of games that your graphics card can easily run, alongside optimizations for the new God of War PC port. The tech is called Deep Learning Dynamic Super Resolution, or DLDSR, and Nvidia says you can use it to make "most games" look sharper by running them at a higher resolution than your monitor natively supports. DLDSR builds on Nvidia's Dynamic Super Resolution tech, which has been around for years. Essentially, regular old DSR renders a game at a higher resolution than your monitor can handle and then downscales it to your monitor's native resolution. This leads to an image with better sharpness but usually comes with a dip in performance (you are asking your GPU to do more work, after all). So, for instance, if you had a graphics card capable of running a game at 4K but only had a 1440p monitor, you could use DSR to get a boost in clarity.
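To put a number on the 4K-on-1440p example above, here is a minimal sketch of the pixel arithmetic. The function name `dsr_factor` is hypothetical and is not part of any Nvidia tool, and pixel count is only a rough proxy for how much extra work the GPU takes on.

```python
# Illustrative arithmetic only; dsr_factor is a hypothetical helper, not an Nvidia API.

def dsr_factor(native, render):
    """Return how many times more pixels are rendered than the monitor displays."""
    native_pixels = native[0] * native[1]
    render_pixels = render[0] * render[1]
    return render_pixels / native_pixels

native_1440p = (2560, 1440)   # monitor's native resolution
render_4k = (3840, 2160)      # resolution the game is rendered at before downscaling

factor = dsr_factor(native_1440p, render_4k)
print(f"Rendering {factor}x the pixels of the native resolution")
```

In this case the GPU renders 2.25 times as many pixels as the monitor displays, which is why a performance dip is the usual trade-off.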
NetApp and Nvidia have introduced a combined AI reference architecture system to rival the Pure Storage-Nvidia AIRI system. It is aimed at deep learning and, unlike FlexPod (Cisco and NetApp's converged infrastructure), has no brand name. Also unlike AIRI, it does not have its own enclosure. A NetApp and Nvidia technical whitepaper – Scalable AI Infrastructure Designing For Real-World Deep Learning Use Cases (PDF) – defines a reference architecture (RA) for a NetApp A800 all-flash storage array and Nvidia DGX-1 GPU server system. There is also a slower and less expensive RA based on the A700 array.
Seven long months after the next-generation "Volta" graphics architecture debuted in the Tesla V100 for data centers, the Nvidia Titan V finally brings the bleeding-edge tech to PCs in traditional graphics card form. But make no mistake: This golden-clad monster targets data scientists, with a tensor core-laden hardware configuration designed to optimize deep learning tasks. You won't want to buy this $3,000 GPU to play Destiny 2.
If you've been waiting for NVIDIA to finally take the lid off of Volta, the next generation of its GPU technology, your day has finally come. Today at its GPU Technology Conference, the company announced the NVIDIA Tesla V100 data center GPU, the first processor to use its seventh-generation architecture. Like the Tesla P100 it replaces, the Volta-powered GPU is designed specifically to power artificial intelligence and deep learning so, naturally, it's flush with power. Built on a 12nm process, the V100 boasts 5,120 CUDA Cores, 16GB of HBM2 memory and an updated NVLink 2.0 interface, and is capable of a staggering 15 teraflops of computational power. Naturally, it's also the GPU that drives the company's updated DGX-1 supercomputer.
Super Micro Computer, Inc. (SMCI), a global leader in compute, storage, networking technologies and green computing, today announced the general availability of its SuperServer solutions optimized for NVIDIA Tesla P100 accelerators with the new Pascal GPU architecture. Supermicro's innovative and GPU optimized single root complex PCI-E design is proven to dramatically improve GPU peer-to-peer communication efficiency over QPI and PCI-E links, with up to 21% higher QPI throughput and 60% lower latency compared to previous generation products. "Our high-performance computing solutions enable deep learning, engineering, and scientific fields to scale out their compute clusters to accelerate their most demanding workloads and achieve fastest time-to-results with maximum performance-per-watt, per-square-foot, and per-dollar," said Charles Liang, President and CEO of Supermicro. "With our latest innovations incorporating the new NVIDIA P100 GPUs, our customers can accelerate their applications and innovations to solve the most complex real world problems." "Supermicro's new high-density servers are optimized to fully leverage the new NVIDIA Tesla P100 accelerators to provide enterprise and HPC customers with an entirely new level of computing horsepower," said Ian Buck, General Manager of the Accelerated Computing Group at NVIDIA.
It's a conundrum: You've got deep learning software, which benefits greatly from GPU acceleration, wrapped up in a Docker container and ready to go across thousands of nodes. But wait -- apps in Docker containers can't access the GPU because they're, well, containerized. Nvidia, developer of the CUDA standard for GPU-accelerated programming, is releasing a plugin for the Docker ecosystem that makes GPU-accelerated computing possible in containers. With the plugin, applications running in a Docker container get controlled access to the GPU on the underlying hardware via Docker's own plugin system. As Nvidia notes in a blog post, one of the early ways developers tried to work around the problem was to install Nvidia's GPU drivers inside the container and map them to the drivers on the outside.
The revolution in GPU computing started with games, and spread to the HPC centers of the world eight years ago with the first "Fermi" Tesla accelerators from Nvidia. But hyperscalers and their deep learning algorithms are driving the architecture of the "Pascal" GPUs and the Tesla accelerators that Nvidia unveiled today at the GPU Technology Conference in its hometown of San Jose. Not only did the hyperscalers and their AI efforts help drive the Pascal architecture, but they will be the first companies to get their hands on all of the Tesla P100 accelerators based on the Pascal GP100 GPU that Nvidia can manufacture, long before they become generally available in early 2017 through server partners who make hybrid CPU-GPU systems. As was the case with the prior generations of GPU compute engines, Nvidia will eventually offer multiple versions of the Pascal GPU for specific workloads and use cases, but Nvidia has made a big bet in creating its high-end GP100 variant of Pascal while making other big bets at the same time, such as moving to a 16 nanometer FinFET process from chip fab partner Taiwan Semiconductor Manufacturing Corp and adding High Bandwidth Memory from memory partner Samsung. Jen-Hsun Huang, co-founder and CEO at Nvidia, said during his opening keynote that Nvidia has a rule about how many big bets it can make.
Each of those SMs also contains 32 FP64 CUDA cores, giving us the 1/2 rate for FP64, and new to the Pascal architecture is the ability to pack 2 FP16 operations inside a single FP32 CUDA core under the right conditions. Nvidia has just announced a new GPU platform called the Tesla P100. NVIDIA NVLink for maximum application scalability – The NVIDIA NVLink high-speed GPU interconnect scales applications across multiple GPUs, delivering a 5x acceleration in bandwidth compared to today's best-in-class solution. Specifically, the DGX-1 can pump out 170 teraflops – that's 170 trillion floating-point operations per second – with its eight 16GB Tesla P100 graphics chips. The hype for the upcoming next generation NVIDIA GeForce Pascal graphics processing units is now at an all-time high, as the majority of reports claim that the GPUs might be making their way into the market in the coming months. The GPU is meant for data centers, scientific and technical research, or churning statistics. It features NVIDIA's new Pascal GPU architecture, the latest memory and semiconductor process, and packaging technology – all to create the densest compute platform to date. Other technologies employed in the DGX-1 include 16nm FinFET fabrication technology, for improved energy efficiency; Chip on Wafer on Substrate with HBM2, for maximizing big data workloads; and new half-precision instructions to deliver more than 21 teraflops of peak performance for deep learning. Nvidia's new GPU is the first to be based on its Pascal architecture. It provides the throughput of 250 CPU-based servers, networking, cables and racks – all in a single box.
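As a quick sanity check on the DGX-1 figures quoted above, the sketch below converts 170 teraflops into raw operations per second and divides it across the eight Tesla P100 chips. The variable and function names are illustrative, and the even split across GPUs is an assumption about how the headline number was derived, not something the reports state.

```python
# Back-of-the-envelope check; names are hypothetical and the even split is an assumption.

TERA = 10**12  # one teraflop = 10^12 floating-point operations per second

def tflops_to_flops(tflops):
    """Convert teraflops to floating-point operations per second."""
    return tflops * TERA

system_tflops = 170   # quoted DGX-1 peak throughput
gpu_count = 8         # quoted number of Tesla P100 chips

total_flops = tflops_to_flops(system_tflops)
per_gpu_tflops = system_tflops / gpu_count

print(f"{total_flops:,} floating-point operations per second in total")
print(f"{per_gpu_tflops} teraflops per GPU")
```

The per-GPU result of 21.25 teraflops lines up with the "more than 21 teraflops" of half-precision peak performance quoted for deep learning, which suggests the 170-teraflop figure refers to half-precision throughput.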