Goto

Collaborating Authors

 trillion parameter ai model


Google Built A Trillion Parameter AI Model. 7 Things You Should Know

#artificialintelligence

One of the exciting things about Artificial Intelligence is the steady stream of new accomplishments that we see in the news. Every week, some research institution or company accomplishes something amazing with AI, whether it is translating a long lost language, or building a massive model, the scale of which has never been done before. But what does it all mean? If I am a business CEO, what impact if any does this have on my business? Is there any way I can leverage it?


Microsoft's ZeRO-Infinity Library Trains 32 Trillion Parameter AI Model

#artificialintelligence

Microsoft recently announced ZeRO-Infinity, an addition to their open-source DeepSpeed AI training library that optimizes memory use for training very large deep-learning models. Using ZeRO-Infinity, Microsoft trained a model with 32 trillion parameters on a cluster of 32 GPUs, and demonstrated fine-tuning of a 1 trillion parameter model on a single GPU. The DeepSpeed team described the new features in a recent blog post. ZeRO-Infinity is the latest iteration of the Zero Redundancy Optimizer (ZeRO) family of memory optimization techniques. ZeRO-Infinity introduces several new strategies for addressing memory and bandwidth constraints when training large deep-learning models, including: a new offload engine for exploiting CPU and Non-Volatile Memory express (NVMe) memory, memory-centric tiling to handle large operators without model-parallelism, bandwidth-centric partitioning for reducing bandwidth costs, and an overlap-centric design for scheduling data communication.