Microsoft AI Releases 'DeepSpeed Compression': A Python-based Composable Library for Extreme Compression and Zero-Cost Quantization to Make Deep Learning Model Size Smaller and Inference Speed Faster

Jul-26-2022, 20:46:41 GMT–#artificialintelligence

Large-scale models are revolutionizing deep learning and AI research, driving significant advances in areas such as multilingual translation, creative text generation, and language understanding. Despite their impressive capabilities, the models' sheer size brings latency and cost constraints that make it difficult to deploy applications on top of them. To address these deployment challenges, the DeepSpeed team at Microsoft AI has been investigating advances in system optimization and model compression. The researchers previously released the DeepSpeed inference system as part of Microsoft's AI at Scale initiative. It applies a variety of optimizations to speed up model inference, including highly optimized CUDA kernels and inference-adapted parallelism.

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
