X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
Zheng, Jinliang, Li, Jianxiong, Wang, Zhihao, Liu, Dongxiu, Kang, Xirui, Feng, Yuchun, Zheng, Yinan, Zou, Jiayin, Chen, Yilun, Zeng, Jia, Zhang, Ya-Qin, Pang, Jiangmiao, Liu, Jingjing, Wang, Tai, Zhan, Xianyuan
Successful generalist Vision-Language-Action (VLA) models rely on effective training across diverse robotic platforms with large-scale, cross-embodiment, heterogeneous datasets. To facilitate and leverage the heterogeneity in rich, diverse robotic data sources, we propose a novel Soft Prompt approach with minimally added parameters, by infusing prompt-learning concepts into cross-embodiment robot learning and introducing separate sets of learnable embeddings for each distinct data source. These embeddings serve as embodiment-specific prompts, which together empower VLA models to effectively exploit varying cross-embodiment features. Our new X-VLA, a neat flow-matching-based VLA architecture, relies exclusively on soft-prompted standard Transformer encoders, enjoying both scalability and simplicity. Evaluated across 6 simulations as well as 3 real-world robots, our 0.9B instantiation, X-VLA-0.9B, simultaneously achieves SOTA performance across a sweep of benchmarks, demonstrating superior results on a wide range of capabilities, from flexible dexterity to quick adaptation across embodiments, environments, and tasks. Website: https://thu-air-dream.github.io/X-VLA/
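The embodiment-specific soft-prompt idea described in the abstract can be sketched in a few lines: keep one learnable prompt table per data source and prepend the matching prompt to each sample's token sequence before a standard Transformer encoder. A minimal PyTorch sketch follows; all names, sizes, and the encoder configuration are illustrative assumptions, not taken from the X-VLA codebase.

```python
import torch
import torch.nn as nn

class SoftPromptedEncoder(nn.Module):
    """Illustrative sketch: one learnable soft-prompt per embodiment,
    prepended to the input tokens of a vanilla Transformer encoder.
    Names and hyperparameters are assumptions, not the paper's code."""

    def __init__(self, num_embodiments, prompt_len, d_model, nhead=4, num_layers=2):
        super().__init__()
        # Separate learnable embeddings for each distinct data source.
        self.prompts = nn.Parameter(
            torch.randn(num_embodiments, prompt_len, d_model) * 0.02
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, tokens, embodiment_id):
        # tokens: (B, T, d_model); embodiment_id: (B,) long tensor.
        prompt = self.prompts[embodiment_id]        # (B, prompt_len, d_model)
        x = torch.cat([prompt, tokens], dim=1)      # (B, prompt_len + T, d_model)
        return self.encoder(x)

model = SoftPromptedEncoder(num_embodiments=3, prompt_len=4, d_model=32)
out = model(torch.randn(2, 10, 32), torch.tensor([0, 2]))
print(out.shape)  # torch.Size([2, 14, 32])
```

Because only the prompt table is new, the added parameter count is `num_embodiments * prompt_len * d_model`, which is tiny relative to the encoder itself, consistent with the "minimally added parameters" claim.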
A Survey of Reinforcement Learning for Large Reasoning Models
Zhang, Kaiyan, Zuo, Yuxin, He, Bingxiang, Sun, Youbang, Liu, Runze, Jiang, Che, Fan, Yuchen, Tian, Kai, Jia, Guoli, Li, Pengfei, Fu, Yu, Lv, Xingtai, Zhang, Yuchen, Zeng, Sihang, Qu, Shang, Li, Haozhan, Wang, Shijie, Wang, Yuru, Long, Xinwei, Liu, Fangfu, Xu, Xiang, Ma, Jiaze, Zhu, Xuekai, Hua, Ermo, Liu, Yihao, Li, Zonglin, Chen, Huayu, Qu, Xiaoye, Li, Yafu, Chen, Weize, Yuan, Zhenzhao, Gao, Junqi, Li, Dong, Ma, Zhiyuan, Cui, Ganqu, Liu, Zhiyuan, Qi, Biqing, Ding, Ning, Zhou, Bowen
In this paper, we survey recent advances in Reinforcement Learning (RL) for reasoning with Large Language Models (LLMs). RL has achieved remarkable success in advancing the frontier of LLM capabilities, particularly in addressing complex logical tasks such as mathematics and coding. As a result, RL has emerged as a foundational methodology for transforming LLMs into Large Reasoning Models (LRMs). With the rapid progress of the field, further scaling of RL for LRMs now faces foundational challenges not only in computational resources but also in algorithm design, training data, and infrastructure. To this end, it is timely to revisit the development of this domain, reassess its trajectory, and explore strategies to enhance the scalability of RL toward Artificial SuperIntelligence (ASI). In particular, we examine research applying RL to LLMs and LRMs for reasoning abilities, especially since the release of DeepSeek-R1, including foundational components, core problems, training resources, and downstream applications, to identify future opportunities and directions for this rapidly evolving area. We hope this review will promote future research on RL for broader reasoning models. Github: https://github.com/TsinghuaC3I/Awesome-RL-for-LRMs