Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM

Jun-16-2026, 18:56:17 GMT–Neural Information Processing Systems

Training Long-Context Large Language Models (LLMs) is challenging, as hybrid training with long-context and short-context data often leads to workload imbalances. Existing works mainly use data packing to alleviate this issue, but fail to consider imbalanced attention computation and wasted communication overhead. This paper proposes Hierarchical Balance Packing (HBP), which designs a novel batch-construction method and training recipe to address those inefficiencies.
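The imbalance the abstract mentions can be illustrated with a minimal sketch of plain length-based packing. This is not the HBP method itself, only a generic first-fit-decreasing packer. The function names, the sample lengths, and the sum-of-squares attention proxy (which assumes attention is masked so each sequence attends only to itself) are all assumptions made for illustration.

```python
from typing import List


def pack_first_fit_decreasing(lengths: List[int], max_tokens: int) -> List[List[int]]:
    """Greedily pack sequence lengths into packs holding at most max_tokens each."""
    packs: List[List[int]] = []
    free: List[int] = []  # remaining token capacity of each pack
    for n in sorted(lengths, reverse=True):
        if n > max_tokens:
            raise ValueError(f"sequence of length {n} exceeds max_tokens={max_tokens}")
        for i, capacity in enumerate(free):
            if n <= capacity:
                packs[i].append(n)
                free[i] -= n
                break
        else:
            # No existing pack has room, so open a new one.
            packs.append([n])
            free.append(max_tokens - n)
    return packs


def attention_cost(pack: List[int]) -> int:
    """Rough proxy for attention work: sum of squared sequence lengths (assumption)."""
    return sum(n * n for n in pack)


# Hypothetical mix of one long sequence and several shorter ones.
lengths = [8000, 500, 500, 500, 3000, 3000, 2000, 500]
packs = pack_first_fit_decreasing(lengths, max_tokens=8000)

for pack in packs:
    print(pack, "tokens:", sum(pack), "attention proxy:", attention_cost(pack))
```

Under these assumptions, two packs can hold the same number of tokens while differing several-fold in the attention proxy, which is the kind of imbalance that token-count packing alone does not address.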

arxiv preprint arxiv, large language model, machine learning, (16 more...)

Country:
- Asia > China (0.28)
- North America
  - Mexico (0.28)
  - United States (0.28)

Genre:
- Research Report > Experimental Study (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.31)
