AITopics | Chu, Weiwei

Plotting

Chu, Weiwei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training

Wang, Zheng, Cai, Anna, Xie, Xinfeng, Pan, Zaifeng, Guan, Yue, Chu, Weiwei, Wang, Jie, Li, Shikai, Huang, Jianyu, Cai, Chris, Hao, Yuchen, Ding, Yufei

arXiv.org Artificial IntelligenceMar-22-2025

In this work, we present WLB-LLM, a workLoad-balanced 4D parallelism for large language model training. We first thoroughly analyze the workload imbalance issue in LLM training and identify two primary sources of imbalance at the pipeline parallelism and context parallelism levels. Then, to address the imbalance issue, at the pipeline parallelism level, WLB-LLM incorporates a workload-aware variable-length document packing method to balance the computation and communication workload across micro-batches. Additionally, at the context parallelism level, WLB-LLM introduces a novel fine-grained per-document sharding strategy, ensuring each worker within a context parallelism group has an identical workload. Comprehensive experiments under different model scales demonstrate that WLB-LLM significantly mitigates the workload imbalance during 4D parallelism LLM training and achieves an average speedup of 1.23x when applying WLB-LLM in our internal LLM training framework.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.17924

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

Wen, Wei, Liu, Kuang-Hung, Fedorov, Igor, Zhang, Xin, Yin, Hang, Chu, Weiwei, Hassani, Kaveh, Sun, Mengying, Liu, Jiang, Wang, Xu, Jiang, Lin, Chen, Yuxin, Zhang, Buyun, Liu, Xi, Cheng, Dehua, Chen, Zhengxing, Zhao, Guang, Han, Fangqiu, Yang, Jiyan, Hao, Yuchen, Xiong, Liang, Chen, Wen-Yen

arXiv.org Artificial IntelligenceNov-13-2023

Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry system, such as ranking system in Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1) scale - Meta ranking systems serve billions of users, (2) strong baselines - the baselines are production models optimized by hundreds to thousands of world-class engineers for years since the rise of deep learning, (3) dynamic baselines - engineers may have established new and stronger baselines during NAS search, and (4) efficiency - the search pipeline must yield results quickly in alignment with the productionization life cycle. In this paper, we present Rankitect, a NAS software framework for ranking systems at Meta. Rankitect seeks to build brand new architectures by composing low level building blocks from scratch. Rankitect implements and improves state-of-the-art (SOTA) NAS methods for comprehensive and fair comparison under the same search space, including sampling-based NAS, one-shot NAS, and Differentiable NAS (DNAS). We evaluate Rankitect by comparing to multiple production ranking models at Meta. We find that Rankitect can discover new models from scratch achieving competitive tradeoff between Normalized Entropy loss and FLOPs. When utilizing search space designed by engineers, Rankitect can generate better models than engineers, achieving positive offline evaluation and online A/B test at Meta scale.

artificial intelligence, machine learning, supernet, (14 more...)

arXiv.org Artificial Intelligence

2311.0843

Country: North America > United States > Virginia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback