Technical Report of TeleChat2, TeleChat2.5 and T1
Wang, Zihan; Liu, Xinzhang; Yao, Yitong; Wang, Chao; Zhao, Yu; Yang, Zhihao; Deng, Wenmin; Jia, Kaipeng; Peng, Jiaxin; Huang, Yuyao; Xiong, Sishi; Jiang, Zhuo; Yu, Kaidong; Hu, Xiaohui; Yao, Fubei; Fang, Ruiyu; Jiang, Zhuoru; Song, Ruiting; Xie, Qiyi; Xue, Rui; He, Xuewei; Xue, Yanlei; Yuan, Zhu; Zhang, Zhaoxi; Huang, Zilu; Wang, Shiquan; Wang, Xin; Wu, Hanming; Wang, Mingyuan; Zhan, Xufeng; Sun, Yuhan; Xing, Zhaohu; Jiang, Yuhao; Yang, Bingkai; Song, Shuangyong; Li, Yongxiang; He, Zhongjiang; Li, Xuelong
arXiv.org Artificial Intelligence
We introduce the latest series of TeleChat models: TeleChat2, TeleChat2.5, and T1, a significant upgrade over their predecessor, TeleChat. Despite minimal changes to the model architecture, the new series achieves substantial performance gains through enhanced training strategies in both the pre-training and post-training stages. The series begins with TeleChat2, which is pretrained on 10 trillion high-quality and diverse tokens, followed by Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to further enhance its capabilities. TeleChat2.5 and T1 extend the pipeline with a continual pretraining phase on domain-specific datasets, combined with reinforcement learning (RL) to improve performance in code generation and mathematical reasoning tasks. The T1 variant is designed for complex reasoning, supporting long Chain-of-Thought (CoT) reasoning and demonstrating substantial improvements in mathematics and coding. In contrast, TeleChat2.5 prioritizes speed, delivering rapid inference. The flagship models of both T1 and TeleChat2.5 are dense Transformer-based architectures with 115B parameters, showing significant advances in reasoning and general task performance over the original TeleChat. Notably, T1-115B outperforms proprietary models such as OpenAI's o1-mini and GPT-4o. We publicly release TeleChat2, TeleChat2.5, and T1, including post-trained versions with 35B and 115B parameters, to empower developers and researchers with state-of-the-art language models tailored for diverse applications.
Jul-30-2025