FlexShard: Flexible Sharding for Industry-Scale Sequence Recommendation Models

Sethi, Geet, Bhattacharya, Pallab, Choudhary, Dhruv, Wu, Carole-Jean, Kozyrakis, Christos

Jan-7-2023–arXiv.org Artificial Intelligence

Sequence-based deep learning recommendation models (DLRMs) are an emerging class of DLRMs that show large improvements over their earlier sum-pooling-based counterparts at capturing users' long-term interests. These improvements come at immense system cost, however: sequence-based DLRMs require substantial amounts of data to be dynamically materialized and communicated by each accelerator during a single iteration. To address this rapidly growing bottleneck, we present FlexShard, a new tiered sequence embedding table sharding algorithm that operates at per-row granularity by exploiting the insight that not every row is equal. Through precise replication of embedding rows based on their underlying probability distribution, along with a new sharding strategy adapted to the heterogeneous, skewed performance of real-world cluster network topologies, FlexShard significantly reduces communication demand while using no additional memory compared to the prior state of the art. When evaluated on production-scale sequence DLRMs, FlexShard reduced overall global all-to-all communication traffic by over 85%, resulting in end-to-end training communication latency improvements of nearly 6x over the prior state-of-the-art approach.
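The intuition that "not every row is equal" can be illustrated with a toy model. The sketch below is not the FlexShard algorithm: it assumes a hypothetical Zipf-like access distribution, an explicit per-device replication budget, and uniform row-wise placement of the remaining rows, and it ignores the tiered network topology the abstract mentions. All function names and parameters are illustrative. It only estimates how the expected fraction of remote embedding lookups might fall as the most frequently accessed rows are replicated on every device.

```python
from typing import Iterable, List, Set


def zipf_probabilities(num_rows: int, skew: float = 1.0) -> List[float]:
    """Assumed access distribution: p(rank r) is proportional to 1 / r**skew."""
    weights = [1.0 / (rank ** skew) for rank in range(1, num_rows + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def choose_replicated_rows(probs: List[float], budget: int) -> Set[int]:
    """Pick the `budget` most frequently accessed rows to copy onto every device."""
    ranked = sorted(range(len(probs)), key=lambda row: probs[row], reverse=True)
    return set(ranked[:budget])


def expected_remote_fraction(
    probs: List[float], replicated: Iterable[int], num_devices: int
) -> float:
    """Expected fraction of lookups that must be served by another device.

    Replicated rows are always local. Every other row is assumed to live on
    exactly one device chosen uniformly, so a lookup for it is local with
    probability 1 / num_devices.
    """
    replicated = set(replicated)
    cold_mass = sum(p for row, p in enumerate(probs) if row not in replicated)
    return cold_mass * (num_devices - 1) / num_devices


if __name__ == "__main__":
    probs = zipf_probabilities(num_rows=100_000, skew=1.0)
    for budget in (0, 100, 1_000, 10_000):
        hot_rows = choose_replicated_rows(probs, budget)
        remote = expected_remote_fraction(probs, hot_rows, num_devices=8)
        print(f"replicated rows: {budget:>6}  expected remote fraction: {remote:.3f}")
```

Under these assumptions, the more skewed the access distribution, the larger the share of lookups a small set of replicated rows can absorb, which is the kind of effect a per-row replication scheme would exploit.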



Country:
- North America > United States
  - California (0.46)
  - Minnesota (0.28)

Genre:
- Research Report > Promising Solution (0.34)

Industry:
- Information Technology > Services (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.68)
    - Natural Language (1.00)
    - Representation & Reasoning (1.00)
  - Communications (1.00)
  - Information Management (0.93)
