AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness
Neural Information Processing Systems
Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks. However, training large models requires strong distributed-systems expertise to carefully design model-parallel execution strategies that suit the model architecture and cluster setup.
Oct-3-2025, 05:42:27 GMT