Chen, Xiaosha
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI, Guo, Daya, Yang, Dejian, Zhang, Haowei, Song, Junxiao, Zhang, Ruoyu, Xu, Runxin, Zhu, Qihao, Ma, Shirong, Wang, Peiyi, Bi, Xiao, Zhang, Xiaokang, Yu, Xingkai, Wu, Yu, Wu, Z. F., Gou, Zhibin, Shao, Zhihong, Li, Zhuoshu, Gao, Ziyi, Liu, Aixin, Xue, Bing, Wang, Bingxuan, Wu, Bochao, Feng, Bei, Lu, Chengda, Zhao, Chenggang, Deng, Chengqi, Zhang, Chenyu, Ruan, Chong, Dai, Damai, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Dai, Fucong, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Bao, Han, Xu, Hanwei, Wang, Haocheng, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Qu, Hui, Li, Hui, Guo, Jianzhong, Li, Jiashi, Wang, Jiawei, Chen, Jingchang, Yuan, Jingyang, Qiu, Junjie, Li, Junlong, Cai, J. L., Ni, Jiaqi, Liang, Jian, Chen, Jin, Dong, Kai, Hu, Kai, Gao, Kaige, Guan, Kang, Huang, Kexin, Yu, Kuai, Wang, Lean, Zhang, Lecong, Zhao, Liang, Wang, Litong, Zhang, Liyue, Xu, Lei, Xia, Leyi, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Meng, Wang, Miaojun, Li, Mingming, Tian, Ning, Huang, Panpan, Zhang, Peng, Wang, Qiancheng, Chen, Qinyu, Du, Qiushi, Ge, Ruiqi, Zhang, Ruisong, Pan, Ruizhe, Wang, Runji, Chen, R. J., Jin, R. L., Chen, Ruyi, Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Ye, Shengfeng, Wang, Shiyu, Yu, Shuiping, Zhou, Shunfeng, Pan, Shuting, Li, S. S., Zhou, Shuang, Wu, Shaoqing, Yun, Tao, Pei, Tian, Sun, Tianyu, Wang, T., Zeng, Wangding, Zhao, Wanjia, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Yu, Wenqin, Zhang, Wentao, Xiao, W. L., An, Wei, Liu, Xiaodong, Wang, Xiaohan, Chen, Xiaokang, Nie, Xiaotao, Cheng, Xin, Liu, Xin, Xie, Xin, Liu, Xingchao, Yang, Xinyu, Li, Xinyuan, Su, Xuecheng, Lin, Xuheng, Li, X. Q., Jin, Xiangyue, Shen, Xiaojin, Chen, Xiaosha, Sun, Xiaowen, Wang, Xiaoxiang, Song, Xinnan, Zhou, Xinyi, Wang, Xianzu, Shan, Xinxia, Li, Y. K., Wang, Y. Q., Wei, Y. X., Zhang, Yang, Xu, Yanhong, Li, Yao, Zhao, Yao, Sun, Yaofeng, Wang, Yaohui, Yu, Yi, Zhang, Yichao, Shi, Yifan, Xiong, Yiliang, He, Ying, Piao, Yishi, Wang, Yisong, Tan, Yixuan, Ma, Yiyang, Liu, Yiyuan, Guo, Yongqiang, Ou, Yuan, Wang, Yuduan, Gong, Yue, Zou, Yuheng, He, Yujia, Xiong, Yunfan, Luo, Yuxiang, You, Yuxiang, Liu, Yuxuan, Zhou, Yuyang, Zhu, Y. X., Huang, Yanping, Li, Yaohui, Zheng, Yi, Zhu, Yuchen, Ma, Yunxian, Tang, Ying, Zha, Yukun, Yan, Yuting, Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Xu, Zhean, Xie, Zhenda, Zhang, Zhengyan, Hao, Zhewen, Ma, Zhicheng, Yan, Zhigang, Wu, Zhiyu, Gu, Zihui, Zhu, Zijia, Liu, Zijun, Li, Zilin, Xie, Ziwei, Song, Ziyang, Pan, Zizheng, Huang, Zhen, Xu, Zhipeng, Zhang, Zhongyu, Zhang, Zhen
We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally develops numerous powerful and intriguing reasoning behaviors. However, it encounters challenges such as poor readability and language mixing. To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. To support the research community, we open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama.
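The abstract does not spell out the RL recipe, but a rough, hypothetical sketch can illustrate how reasoning is incentivized without SFT: a rule-based reward scores each sampled completion on whether its reasoning is wrapped in think-tags and whether its final answer matches a verifiable reference. The tag names, weights, and exact-match check below are illustrative assumptions, not details taken from the paper.

import re

def reasoning_reward(completion: str, reference_answer: str) -> float:
    """Hypothetical rule-based reward for RL on verifiable reasoning tasks.

    Combines a format reward (reasoning enclosed in <think>...</think>) with an
    accuracy reward (final answer matches the reference). Weights are illustrative.
    """
    # Format reward: the completion should expose its chain of thought in tags.
    format_ok = bool(re.search(r"<think>.*?</think>", completion, flags=re.DOTALL))

    # Accuracy reward: compare the text after the think block with the reference.
    answer_part = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL).strip()
    accuracy_ok = answer_part == reference_answer.strip()

    return 0.2 * format_ok + 1.0 * accuracy_ok

# Example: a completion that reasons in tags and answers correctly gets full reward.
sample = "<think>2 + 2 equals 4.</think> 4"
print(reasoning_reward(sample, "4"))  # 1.2

In an RL loop, a scalar of this kind would be fed to a policy-gradient style update over sampled completions; the policy-optimization details are not given in the abstract.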
DeepSeek-V3 Technical Report
DeepSeek-AI, Liu, Aixin, Feng, Bei, Xue, Bing, Wang, Bingxuan, Wu, Bochao, Lu, Chengda, Zhao, Chenggang, Deng, Chengqi, Zhang, Chenyu, Ruan, Chong, Dai, Damai, Guo, Daya, Yang, Dejian, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Dai, Fucong, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Bao, Han, Xu, Hanwei, Wang, Haocheng, Zhang, Haowei, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Li, Hui, Qu, Hui, Cai, J. L., Liang, Jian, Guo, Jianzhong, Ni, Jiaqi, Li, Jiashi, Wang, Jiawei, Chen, Jin, Chen, Jingchang, Yuan, Jingyang, Qiu, Junjie, Li, Junlong, Song, Junxiao, Dong, Kai, Hu, Kai, Gao, Kaige, Guan, Kang, Huang, Kexin, Yu, Kuai, Wang, Lean, Zhang, Lecong, Xu, Lei, Xia, Leyi, Zhao, Liang, Wang, Litong, Zhang, Liyue, Li, Meng, Wang, Miaojun, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Mingming, Tian, Ning, Huang, Panpan, Wang, Peiyi, Zhang, Peng, Wang, Qiancheng, Zhu, Qihao, Chen, Qinyu, Du, Qiushi, Chen, R. J., Jin, R. L., Ge, Ruiqi, Zhang, Ruisong, Pan, Ruizhe, Wang, Runji, Xu, Runxin, Zhang, Ruoyu, Chen, Ruyi, Li, S. S., Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Wu, Shaoqing, Ye, Shengfeng, Ma, Shirong, Wang, Shiyu, Zhou, Shuang, Yu, Shuiping, Zhou, Shunfeng, Pan, Shuting, Wang, T., Yun, Tao, Pei, Tian, Sun, Tianyu, Xiao, W. L., Zeng, Wangding, Zhao, Wanjia, An, Wei, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Yu, Wenqin, Zhang, Wentao, Li, X. Q., Jin, Xiangyue, Wang, Xianzu, Bi, Xiao, Liu, Xiaodong, Wang, Xiaohan, Shen, Xiaojin, Chen, Xiaokang, Zhang, Xiaokang, Chen, Xiaosha, Nie, Xiaotao, Sun, Xiaowen, Wang, Xiaoxiang, Cheng, Xin, Liu, Xin, Xie, Xin, Liu, Xingchao, Yu, Xingkai, Song, Xinnan, Shan, Xinxia, Zhou, Xinyi, Yang, Xinyu, Li, Xinyuan, Su, Xuecheng, Lin, Xuheng, Li, Y. K., Wang, Y. Q., Wei, Y. X., Zhu, Y. X., Zhang, Yang, Xu, Yanhong, Huang, Yanping, Li, Yao, Zhao, Yao, Sun, Yaofeng, Li, Yaohui, Wang, Yaohui, Yu, Yi, Zheng, Yi, Zhang, Yichao, Shi, Yifan, Xiong, Yiliang, He, Ying, Tang, Ying, Piao, Yishi, Wang, Yisong, Tan, Yixuan, Ma, Yiyang, Liu, Yiyuan, Guo, Yongqiang, Wu, Yu, Ou, Yuan, Zhu, Yuchen, Wang, Yuduan, Gong, Yue, Zou, Yuheng, He, Yujia, Zha, Yukun, Xiong, Yunfan, Ma, Yunxian, Yan, Yuting, Luo, Yuxiang, You, Yuxiang, Liu, Yuxuan, Zhou, Yuyang, Wu, Z. F., Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Xu, Zhean, Huang, Zhen, Zhang, Zhen, Xie, Zhenda, Zhang, Zhengyan, Hao, Zhewen, Gou, Zhibin, Ma, Zhicheng, Yan, Zhigang, Shao, Zhihong, Xu, Zhipeng, Wu, Zhiyu, Zhang, Zhongyu, Li, Zhuoshu, Gu, Zihui, Zhu, Zijia, Liu, Zijun, Li, Zilin, Xie, Ziwei, Song, Ziyang, Gao, Ziyi, Pan, Zizheng
We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters, of which 37B are activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for load balancing and employs a multi-token prediction training objective for stronger performance. We pre-train DeepSeek-V3 on 14.8 trillion diverse and high-quality tokens, followed by Supervised Fine-Tuning and Reinforcement Learning stages to fully harness its capabilities. Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. Despite its excellent performance, DeepSeek-V3 requires only 2.788M H800 GPU hours for its full training. In addition, its training process is remarkably stable: throughout the entire training process, we did not experience any irrecoverable loss spikes or perform any rollbacks.
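As a rough illustration of the auxiliary-loss-free load-balancing idea (a minimal sketch under assumptions, not DeepSeek-V3's actual implementation): each expert carries a bias that is added to its routing score only when selecting the top-k experts, and the bias is nudged up for under-loaded experts and down for over-loaded ones, so balance is maintained without an auxiliary loss term. Function names, the update step size, and the sign-based rule below are illustrative.

import numpy as np

def select_experts(scores: np.ndarray, bias: np.ndarray, k: int) -> np.ndarray:
    """Pick top-k experts per token from bias-adjusted scores (gate values would use the raw scores)."""
    biased = scores + bias                        # bias steers routing only
    return np.argsort(-biased, axis=-1)[:, :k]    # indices of chosen experts per token

def update_bias(bias: np.ndarray, expert_load: np.ndarray, gamma: float = 1e-3) -> np.ndarray:
    """Nudge under-loaded experts up and over-loaded experts down; no auxiliary loss is added."""
    mean_load = expert_load.mean()
    return bias + gamma * np.sign(mean_load - expert_load)

# Toy routing step: 4 tokens, 8 experts, top-2 routing.
rng = np.random.default_rng(0)
scores = rng.random((4, 8))
bias = np.zeros(8)
chosen = select_experts(scores, bias, k=2)
load = np.bincount(chosen.ravel(), minlength=8)   # tokens routed to each expert
bias = update_bias(bias, load.astype(float))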
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek-AI, Liu, Aixin, Feng, Bei, Wang, Bin, Wang, Bingxuan, Liu, Bo, Zhao, Chenggang, Deng, Chengqi, Ruan, Chong, Dai, Damai, Guo, Daya, Yang, Dejian, Chen, Deli, Ji, Dongjie, Li, Erhang, Lin, Fangyun, Luo, Fuli, Hao, Guangbo, Chen, Guanting, Li, Guowei, Zhang, H., Xu, Hanwei, Yang, Hao, Zhang, Haowei, Ding, Honghui, Xin, Huajian, Gao, Huazuo, Li, Hui, Qu, Hui, Cai, J. L., Liang, Jian, Guo, Jianzhong, Ni, Jiaqi, Li, Jiashi, Chen, Jin, Yuan, Jingyang, Qiu, Junjie, Song, Junxiao, Dong, Kai, Gao, Kaige, Guan, Kang, Wang, Lean, Zhang, Lecong, Xu, Lei, Xia, Leyi, Zhao, Liang, Zhang, Liyue, Li, Meng, Wang, Miaojun, Zhang, Mingchuan, Zhang, Minghua, Tang, Minghui, Li, Mingming, Tian, Ning, Huang, Panpan, Wang, Peiyi, Zhang, Peng, Zhu, Qihao, Chen, Qinyu, Du, Qiushi, Chen, R. J., Jin, R. L., Ge, Ruiqi, Pan, Ruizhe, Xu, Runxin, Chen, Ruyi, Li, S. S., Lu, Shanghao, Zhou, Shangyan, Chen, Shanhuang, Wu, Shaoqing, Ye, Shengfeng, Ma, Shirong, Wang, Shiyu, Zhou, Shuang, Yu, Shuiping, Zhou, Shunfeng, Zheng, Size, Wang, T., Pei, Tian, Yuan, Tian, Sun, Tianyu, Xiao, W. L., Zeng, Wangding, An, Wei, Liu, Wen, Liang, Wenfeng, Gao, Wenjun, Zhang, Wentao, Li, X. Q., Jin, Xiangyue, Wang, Xianzu, Bi, Xiao, Liu, Xiaodong, Wang, Xiaohan, Shen, Xiaojin, Chen, Xiaokang, Chen, Xiaosha, Nie, Xiaotao, Sun, Xiaowen, Wang, Xiaoxiang, Liu, Xin, Xie, Xin, Yu, Xingkai, Song, Xinnan, Zhou, Xinyi, Yang, Xinyu, Lu, Xuan, Su, Xuecheng, Wu, Y., Li, Y. K., Wei, Y. X., Zhu, Y. X., Xu, Yanhong, Huang, Yanping, Li, Yao, Zhao, Yao, Sun, Yaofeng, Li, Yaohui, Wang, Yaohui, Zheng, Yi, Zhang, Yichao, Xiong, Yiliang, Zhao, Yilong, He, Ying, Tang, Ying, Piao, Yishi, Dong, Yixin, Tan, Yixuan, Liu, Yiyuan, Wang, Yongji, Guo, Yongqiang, Zhu, Yuchen, Wang, Yuduan, Zou, Yuheng, Zha, Yukun, Ma, Yunxian, Yan, Yuting, You, Yuxiang, Liu, Yuxuan, Ren, Z. Z., Ren, Zehui, Sha, Zhangli, Fu, Zhe, Huang, Zhen, Zhang, Zhen, Xie, Zhenda, Hao, Zhewen, Shao, Zhihong, Wen, Zhiniu, Xu, Zhipeng, Zhang, Zhongyu, Li, Zhuoshu, Wang, Zihan, Gu, Zihui, Li, Zilin, Xie, Ziwei
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance while saving 42.5% of training costs, reducing the KV cache by 93.3%, and boosting the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality, multi-source corpus of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
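The KV-cache compression that MLA relies on can be sketched roughly as follows: each token's hidden state is down-projected to a small latent vector, only that latent is cached, and keys and values are reconstructed from it when attention is computed. The dimensions and projection names below are illustrative assumptions, and the decoupled rotary position embedding used in the actual MLA design is omitted.

import numpy as np

d_model, d_latent, d_head = 1024, 64, 128
rng = np.random.default_rng(0)

# Illustrative projection matrices (learned parameters in a real model).
W_down = rng.standard_normal((d_model, d_latent)) * 0.02   # hidden state -> compressed latent
W_up_k = rng.standard_normal((d_latent, d_head)) * 0.02    # latent -> key
W_up_v = rng.standard_normal((d_latent, d_head)) * 0.02    # latent -> value

def compress(hidden_states: np.ndarray) -> np.ndarray:
    """Cache only a small latent per token instead of full per-head keys and values."""
    return hidden_states @ W_down                 # (n_tokens, d_latent)

def expand_kv(latent_cache: np.ndarray) -> tuple:
    """Reconstruct keys and values from the cached latents when attention is computed."""
    return latent_cache @ W_up_k, latent_cache @ W_up_v

hidden = rng.standard_normal((16, d_model))       # 16 previously generated tokens
latent_cache = compress(hidden)                   # stored cache: 16 x 64
keys, values = expand_kv(latent_cache)            # rebuilt on the fly: each 16 x 128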
Deep Learning Based Intelligent Inter-Vehicle Distance Control for 6G Enabled Cooperative Autonomous Driving
Chen, Xiaosha, Leng, Supeng, He, Jianhua, Zhou, Longyu
Research on the sixth-generation cellular networks (6G) is gaining huge momentum to achieve ubiquitous wireless connectivity. Connected autonomous driving (CAV) is a critical vertical envisioned for 6G, holding great potential to improve road safety as well as road and energy efficiency. However, the stringent requirements of CAV applications on reliability, latency, and high-speed communications will present major challenges to 6G networks. New channel access algorithms and intelligent control schemes for connected vehicles are needed for 6G-supported CAV. In this paper, we investigate 6G-supported cooperative driving, an advanced driving mode based on information sharing and driving coordination. First, we quantify the delay upper bounds of 6G vehicle-to-vehicle (V2V) communications with hybrid communication and channel access technologies. A deep learning neural network is developed and trained for fast computation of the delay bounds in real-time operation. Then, an intelligent strategy is designed to control the inter-vehicle distance for cooperative autonomous driving. Furthermore, we propose a Markov chain based algorithm to predict the parameters of the system states, together with a safe-distance mapping method that enables smooth vehicular speed changes. The proposed algorithms are implemented in the AirSim autonomous driving platform. Simulation results show that the proposed algorithms are effective and robust, delivering safe and stable cooperative autonomous driving and greatly improving road safety, capacity, and efficiency.
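As a generic illustration of the Markov chain based prediction mentioned above (the paper's actual state space and estimator are not given here), the sketch below fits a transition matrix from an observed sequence of discretized system states and predicts the most likely next state. The state labels and example data are purely illustrative.

import numpy as np

def fit_transition_matrix(states: list, n_states: int) -> np.ndarray:
    """Estimate a row-stochastic transition matrix from an observed state sequence."""
    counts = np.zeros((n_states, n_states))
    for current, nxt in zip(states[:-1], states[1:]):
        counts[current, nxt] += 1
    rows = counts.sum(axis=1, keepdims=True)
    rows[rows == 0] = 1                           # avoid division by zero for unseen states
    return counts / rows

def predict_next_state(transition: np.ndarray, current_state: int) -> int:
    """Return the most probable successor of the current state."""
    return int(np.argmax(transition[current_state]))

# Toy example: three channel-load states observed over time (0 = low, 1 = medium, 2 = high).
observed = [0, 0, 1, 1, 2, 1, 0, 1, 2, 2, 1]
P = fit_transition_matrix(observed, n_states=3)
print(predict_next_state(P, current_state=1))

A prediction of this kind could feed a distance controller that maps the expected state to a safe inter-vehicle gap, in the spirit of the safe-distance mapping described in the abstract.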