Mesh-TensorFlow: Deep Learning for Supercomputers

Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

Nov-20-2025, 15:48:31 GMT–Neural Information Processing Systems

All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and implement, particularly on large clusters.

artificial intelligence, dimension, machine learning


