Deploying Deep Neural Networks with NVIDIA TensorRT

#artificialintelligence 

NVIDIA TensorRT is a high-performance deep learning inference library for production environments. Power efficiency and speed of response are two key metrics for deployed deep learning applications, because they directly affect the user experience and the cost of the service provided. Tensor RT automatically optimizes trained neural networks for run-time performance, delivering up to 16x higher energy efficiency (performance per watt) on a Tesla P100 GPU compared to common CPU-only deep learning inference systems (see Figure 1). Figure 2 shows the performance of NVIDIA Tesla P100 and K80 running inference using TensorRT with the relatively complex GoogLenet neural network architecture. In this post we will show you how you can use Tensor RT to get the best efficiency and performance out of your trained deep neural network on a GPU-based deployment platform.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found