Microsoft AI Releases 'DeepSpeed Compression': A Python-based Composable Library for Extreme Compression and Zero-Cost Quantization to Make Deep Learning Model Size Smaller and Inference Speed Faster

Open in new window