Introducing LongCat-Flash-Thinking: A Technical Report

Meituan LongCat Team: Anchun Gui, Bei Li, Bingyang Tao, Bole Zhou, Borun Chen, Chao Zhang, Chao Zhang, Chengcheng Han, Chenhui Yang, Chi Zhang, Chong Peng, Chuyu Zhang, Cong Chen, Fengcun Li, Gang Xu, Guoyuan Lin, Hao Jiang, Hao Liang, Haomin Fu, Haoxiang Ma, Hong Liu, Hongyan Hao, Hongyin Tang, Hongyu Zang, Hongzhi Ni, Hui Su, Jiahao Liu, Jiahuan Li, Jialin Liu, Jianfei Zhang, Jianhao Xu, Jianing Wang, Jiaqi Sun, Jiaqi Zhang, Jiarong Shi, Jiawei Yang, Jingang Wang, Jinrui Ding, Jun Kuang, Jun Xu, Ke He, Kefeng Zhang, Keheng Wang, Keqing He, Li Wei, Liang Shi, Lin Qiu, Lingbin Kong, Lingchuan Liu, Linsen Guo, Longfei An, Mai Xia, Meng Zhou, Mengshen Zhu, Peng Pei, Pengcheng Jia, Qi Gu, Qi Guo, Qiong Huang, Quan Chen, Quanchi Weng, Rongxiang Weng, Ruichen Shao, Rumei Li, Shanglin Lei, Shuai Du, Shuaikang Liu, Shuang Zhou, Shuhao Hu, Siyu Xu, Songshan Gong, Tao Liang, Tianhao Hu, Wei He, Wei Shi, Wei Wang, Wei Wu, Wei Zhuo, Weifeng Tang, Wenjie Shi, Wenlong Zhu, Xi Su, Xiangcheng Liu, Xiangyu Xi, Xiangzhou Huang, Xiao Liu, Xiaochen Jiang, Xiaowei Shi, Xiaowen Shi, Xiaoyu Li, Xin Chen, Xinyue Zhao, Xuan Huang, Xuemiao Zhang, Xuezhi Cao, Xunliang Cai, Yajie Zhang, Yang Chen, Yang Liu, Yang Liu, Yang Zheng, Yaoming Wang, Yaqi Huo, Yerui Sun, Yifan Lu, Yiyang Li, Youshao Xiao, Yuanzhe Lei, Yuchen Xie, Yueqing Sun, Yufei Zhang, Yuhuai Wei, Yulei Qian, Yunke Zhao, Yuqing Ding, Yuwei Jiang, Zhaohua Yang, Zhengyu Chen, Zhijian Liu, Zhikang Xia, Zhongda Su, Ziran Li, Ziwen Wang, Ziyuan Zhuang, Zongyu Wang, Zunyuan Yang

arXiv.org Artificial Intelligence 

We present LongCat-Flash-Thinking, an efficient 560-billion-parameter open-source Mixture-of-Experts (MoE) reasoning model. Its advanced capabilities are cultivated through a meticulously crafted training process, beginning with a long Chain-of-Thought (CoT) cold start and culminating in large-scale Reinforcement Learning (RL). We first employ a well-designed cold-start training strategy that significantly enhances the model's reasoning potential and equips it with specialized skills in both formal and agentic reasoning. A core innovation is our domain-parallel training scheme, which decouples optimization across distinct domains (e.g., STEM, Code, Agentic) and subsequently fuses the resulting expert models into a single, nearly Pareto-optimal model. This entire process is powered by our Dynamic ORchestration for Asynchronous rollout (DORA) system, a large-scale RL framework that delivers a more than threefold training speedup over synchronous methods on tens of thousands of accelerators. As a result, LongCat-Flash-Thinking achieves state-of-the-art performance among open-source models on a suite of complex reasoning tasks. The model is also exceptionally token-efficient in agentic reasoning, reducing average token consumption on AIME-25 by 64.5% (from 19,653 to 6,965) without degrading task accuracy. We release LongCat-Flash-Thinking to promote further advances in reasoning systems and agentic AI research.
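The domain-parallel scheme described above trains separate expert models per domain and then fuses them into one model. As a minimal sketch of the general idea, the snippet below implements one common model-merging baseline, weighted parameter averaging, assuming each checkpoint is exposed as a name-to-parameter dict; the function name and uniform weighting are illustrative, and the report's actual fusion recipe may differ.

```python
def fuse_expert_models(expert_states, weights=None):
    """Fuse per-domain expert checkpoints into a single state dict by
    weighted parameter averaging (a common merging baseline; hypothetical
    stand-in for the paper's fusion step).

    expert_states: list of dicts mapping parameter name -> value.
    weights: optional per-expert mixing weights summing to 1.0.
    """
    if weights is None:
        # Default to a uniform average over the domain experts.
        weights = [1.0 / len(expert_states)] * len(expert_states)
    assert abs(sum(weights) - 1.0) < 1e-8, "mixing weights must sum to 1"

    fused = {}
    for name in expert_states[0]:
        # Every expert shares the same architecture, so the same
        # parameter name exists in each state dict.
        fused[name] = sum(w * state[name]
                          for w, state in zip(weights, expert_states))
    return fused


# Toy usage: two "experts" (e.g., a STEM and a Code checkpoint) with
# scalar parameters stand in for full tensors.
stem_expert = {"layer.w": 1.0, "layer.b": 0.0}
code_expert = {"layer.w": 3.0, "layer.b": 2.0}
fused = fuse_expert_models([stem_expert, code_expert])
```

In practice the mixing weights would be tuned per domain (or per layer) to trade off the experts' strengths, which is what makes the fused model approach the Pareto front rather than a naive average.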