Speed up model inference with Vertex AI Predictions' optimized TensorFlow runtime


From product recommendations to fraud detection to route optimization, low-latency predictions are vital for numerous machine learning tasks. That's why we're excited to announce a public preview of a new runtime that optimizes serving TensorFlow models on the Vertex AI Prediction service. This optimized TensorFlow runtime leverages technologies and model optimization techniques that are used internally at Google, and it can be incorporated into your serving workflows without any changes to your training or model-saving code. The result is faster predictions at a lower cost compared to the open source-based pre-built TensorFlow Serving containers. This post gives a high-level overview of the optimized TensorFlow runtime: it reviews some of its features, explains how to use it, and provides benchmark data that demonstrates how it performs.
