Microsoft's ZeRO-Infinity Library Trains 32 Trillion Parameter AI Model

Jun-25-2021, 03:10:47 GMT–#artificialintelligence

Microsoft recently announced ZeRO-Infinity, an addition to their open-source DeepSpeed AI training library that optimizes memory use for training very large deep-learning models. Using ZeRO-Infinity, Microsoft trained a model with 32 trillion parameters on a cluster of 32 GPUs, and demonstrated fine-tuning of a 1 trillion parameter model on a single GPU. The DeepSpeed team described the new features in a recent blog post. ZeRO-Infinity is the latest iteration of the Zero Redundancy Optimizer (ZeRO) family of memory optimization techniques. ZeRO-Infinity introduces several new strategies for addressing memory and bandwidth constraints when training large deep-learning models, including: a new offload engine for exploiting CPU and Non-Volatile Memory express (NVMe) memory, memory-centric tiling to handle large operators without model-parallelism, bandwidth-centric partitioning for reducing bandwidth costs, and an overlap-centric design for scheduling data communication.

trillion parameter ai model, zero-infinity, zero-infinity library train 32, (10 more...)

#artificialintelligence

Jun-25-2021, 03:10:47 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found