xGen-small Technical Report

Erik Nijkamp, Bo Pang, Egor Pakhomov, Akash Gokul, Jin Qu, Silvio Savarese, Yingbo Zhou, Caiming Xiong

arXiv.org Artificial Intelligence 

We introduce xGen-small, a family of 4B and 9B Transformer decoder models optimized for long-context applications. Our vertically integrated pipeline unites domain-balanced, frequency-aware data curation; multi-stage pre-training with quality annealing and length extension to 128k tokens; and targeted post-training via supervised fine-tuning, preference learning, and online reinforcement learning. xGen-small delivers strong performance across a range of tasks, especially in the math and coding domains, while excelling at long-context benchmarks.
