How to Speed Up Deep Learning Inference Using TensorRT (NVIDIA Developer Blog)

Nov-13-2018, 02:50:53 GMT–#artificialintelligence

Welcome to this introduction to TensorRT, our platform for deep learning inference. You will learn how to deploy a deep learning application onto a GPU, increasing throughput and reducing latency during inference. TensorRT provides APIs and parsers to import trained models from all major deep learning frameworks. It then generates optimized runtime engines deployable in the datacenter as well as in automotive and embedded environments. Applications deployed on GPUs with TensorRT perform up to 40x faster than CPU-only platforms. This tutorial uses a C++ example to walk you through importing an ONNX model into TensorRT, applying optimizations, and generating a high-performance runtime engine for the datacenter environment.

Genre:
- Instructional Material > Course Syllabus & Notes (0.68)

Industry:
- Information Technology > Hardware (0.42)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
