Technical Report of TeleChat2, TeleChat2.5 and T1

Wang, Zihan, Liu, Xinzhang, Yao, Yitong, Wang, Chao, Zhao, Yu, Yang, Zhihao, Deng, Wenmin, Jia, Kaipeng, Peng, Jiaxin, Huang, Yuyao, Xiong, Sishi, Jiang, Zhuo, Yu, Kaidong, Hu, Xiaohui, Yao, Fubei, Fang, Ruiyu, Jiang, Zhuoru, Song, Ruiting, Xie, Qiyi, Xue, Rui, He, Xuewei, Xue, Yanlei, Yuan, Zhu, Zhang, Zhaoxi, Huang, Zilu, Wang, Shiquan, Wang, Xin, Wu, Hanming, Wang, Mingyuan, Zhan, Xufeng, Sun, Yuhan, Xing, Zhaohu, Jiang, Yuhao, Yang, Bingkai, Song, Shuangyong, Li, Yongxiang, He, Zhongjiang, Li, Xuelong

Jul-30-2025–arXiv.org Artificial Intelligence

We introduce the latest series of TeleChat models: TeleChat2, TeleChat2.5, and T1, a significant upgrade over their predecessor, TeleChat. Despite minimal changes to the model architecture, the new series achieves substantial performance gains through enhanced training strategies in both the pre-training and post-training stages. The series begins with TeleChat2, which is pre-trained on 10 trillion high-quality and diverse tokens, followed by Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to further enhance its capabilities. TeleChat2.5 and T1 extend the pipeline with a continual pre-training phase on domain-specific datasets, combined with reinforcement learning (RL) to improve performance in code generation and mathematical reasoning tasks. The T1 variant is designed for complex reasoning, supporting long Chain-of-Thought (CoT) reasoning and demonstrating substantial improvements in mathematics and coding. In contrast, TeleChat2.5 prioritizes speed, delivering rapid inference. The flagship models of both T1 and TeleChat2.5 are dense Transformer-based architectures with 115B parameters, showing significant advances in reasoning and general task performance over the original TeleChat. Notably, T1-115B outperforms proprietary models such as OpenAI's o1-mini and GPT-4o. We publicly release TeleChat2, TeleChat2.5, and T1, including post-trained versions with 35B and 115B parameters, to empower developers and researchers with state-of-the-art language models tailored for diverse applications.
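As a rough illustration of the Direct Preference Optimization step named in the abstract, the sketch below computes the standard DPO loss for a single preference pair. It is not taken from the report: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for illustration, and the inputs are assumed to be summed sequence log-probabilities from the trained policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    All names and the default beta are illustrative assumptions,
    not settings reported for TeleChat2.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and the reference assign identical log-probabilities, the loss equals log 2; it falls below that as the policy shifts probability toward the chosen response relative to the reference, and rises above it in the opposite case.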

large language model, machine learning, natural language

Genre:
- Research Report (0.66)
- Instructional Material (0.65)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
