
SwitchVLA: Execution-Aware Task Switching for Vision-Language-Action Models

Li, Meng, Zhao, Zhen, Che, Zhengping, Liao, Fei, Wu, Kun, Xu, Zhiyuan, Ren, Pei, Jin, Zhao, Liu, Ning, Tang, Jian

arXiv.org Artificial Intelligence

Robots deployed in dynamic environments must be able not only to follow diverse language instructions but also to flexibly adapt when user intent changes mid-execution. While recent Vision-Language-Action (VLA) models have advanced multi-task learning and instruction following, they typically assume static task intent, failing to respond when new instructions arrive during ongoing execution. This limitation hinders natural and robust interaction in dynamic settings, such as retail or household environments, where real-time intent changes are common. We propose SwitchVLA, a unified, execution-aware framework that enables smooth and reactive task switching without external planners or additional switch-specific data. We model task switching as a behavior modulation problem conditioned on execution state and instruction context. Expert demonstrations are segmented into temporally grounded contact phases, allowing the policy to infer task progress and adjust its behavior accordingly. A multi-behavior conditional policy is then trained to generate flexible action chunks under varying behavior modes through conditioned trajectory modeling. Experiments in both simulation and real-world robotic manipulation demonstrate that SwitchVLA enables robust instruction adherence, fluid task switching, and strong generalization, outperforming prior VLA baselines in both task success rate and interaction naturalness.
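To make the conditioning idea concrete, here is a minimal sketch, assuming a PyTorch setup, of a policy head that takes visual features, instruction features, and a discrete behavior mode and emits a short action chunk. This is not the authors' code; all module names, dimensions, and the three behavior modes are hypothetical placeholders.

```python
# Illustrative sketch only (not SwitchVLA's implementation): a policy head
# conditioned on observation features, instruction features, and a discrete
# behavior mode (e.g., advance / rollback / switch-to-new-task).
import torch
import torch.nn as nn

class BehaviorConditionedPolicy(nn.Module):
    def __init__(self, obs_dim=512, instr_dim=512, num_modes=3,
                 action_dim=7, chunk_len=16, hidden=1024):
        super().__init__()
        self.mode_embed = nn.Embedding(num_modes, 64)   # behavior-mode token
        self.backbone = nn.Sequential(
            nn.Linear(obs_dim + instr_dim + 64, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.action_head = nn.Linear(hidden, action_dim * chunk_len)
        self.chunk_len, self.action_dim = chunk_len, action_dim

    def forward(self, obs_feat, instr_feat, mode_id):
        z = torch.cat([obs_feat, instr_feat, self.mode_embed(mode_id)], dim=-1)
        h = self.backbone(z)
        # One forward pass emits a short chunk of future actions.
        return self.action_head(h).view(-1, self.chunk_len, self.action_dim)

# Toy usage: when a new instruction arrives mid-execution, the behavior mode
# (inferred from contact phase / task progress) changes how the chunk is generated.
policy = BehaviorConditionedPolicy()
obs, instr = torch.randn(1, 512), torch.randn(1, 512)
mode = torch.tensor([2])          # hypothetical "switch to new task" mode
chunk = policy(obs, instr, mode)  # shape: (1, 16, 7)
```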


PR2: A Physics- and Photo-realistic Testbed for Embodied AI and Humanoid Robots

Liu, Hangxin, Xie, Qi, Zhang, Zeyu, Yuan, Tao, Leng, Xiaokun, Sun, Lining, Zhu, Song-Chun, Zhang, Jingwen, He, Zhicheng, Su, Yao

arXiv.org Artificial Intelligence

This paper presents the development of a Physics-realistic and Photo-realistic humanoid robot testbed, PR2, to facilitate collaborative research between Embodied Artificial Intelligence (Embodied AI) and robotics. PR2 offers high-quality scene rendering and robot dynamic simulation, enabling (i) the creation of diverse scenes using various digital assets, (ii) the integration of advanced perception or foundation models, and (iii) the implementation of planning and control algorithms for dynamic humanoid robot behaviors based on environmental feedback. The beta version of PR2 has been deployed for the simulation track of a nationwide full-size humanoid robot competition for college students, attracting 137 teams and over 400 participants within four months. This competition covered traditional tasks in bipedal walking, as well as novel challenges in loco-manipulation and language-instruction-based object search, marking a first for public college robotics competitions. A retrospective analysis of the competition suggests that future events should emphasize the integration of locomotion with manipulation and perception. By making the PR2 testbed publicly available at https://github.com/pr2-humanoid/PR2-Platform, we aim to further advance education and training in humanoid robotics.
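The closed loop such a testbed supports (render/observe, perceive, plan, control) can be sketched generically. The snippet below is not the PR2-Platform API; every class and function is a hypothetical stand-in meant only to show the sense-plan-act structure the abstract describes.

```python
# Hypothetical sense-plan-act loop; not the PR2-Platform API.
import random

class DummySim:
    """Stand-in for a physics- and photo-realistic simulator."""
    def observe(self):
        return {"rgb": None, "joint_pos": [random.random() for _ in range(12)]}
    def apply(self, torques):
        pass  # a real testbed would step robot dynamics here

def perceive(obs):
    # In practice: a perception or foundation model consuming rendered images.
    return {"target_found": random.random() > 0.5}

def plan_and_control(obs, percept):
    # In practice: locomotion / loco-manipulation planning from environmental feedback.
    return [0.0] * 12

sim = DummySim()
for step in range(100):
    obs = sim.observe()
    percept = perceive(obs)
    sim.apply(plan_and_control(obs, percept))
```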


Adaptive Robotic Tool-Tip Control Learning Considering Online Changes in Grasping State

Kawaharazuka, Kento, Okada, Kei, Inaba, Masayuki

arXiv.org Artificial Intelligence

Various robotic tool manipulation methods have been developed so far. However, to our knowledge, none of them have taken into account the fact that the grasping state, such as the grasping position and tool angle, can change at any time during tool manipulation. In addition, few studies can handle deformable tools. In this study, we develop a method for estimating the position of a tool-tip, controlling the tool-tip, and adapting online to changes in the relationship between the body and the tool, using a neural network that includes parametric bias. We demonstrate the effectiveness of our method for online changes in grasping state and for deformable tools, in experiments using two different types of robots: the axis-driven robot PR2 and the tendon-driven robot MusashiLarm.
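The parametric-bias idea admits a compact illustration: a small latent vector is concatenated to the network input, the network weights are trained offline and frozen, and only the latent is updated online from prediction error so the model tracks a changed grasp. The sketch below is a hedged, hypothetical PyTorch rendering of that idea, not the authors' implementation; all dimensions are illustrative.

```python
# Hedged illustration of parametric bias with online adaptation (hypothetical code).
import torch
import torch.nn as nn

class ToolTipModel(nn.Module):
    def __init__(self, in_dim=14, pb_dim=4, out_dim=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + pb_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, out_dim),
        )
    def forward(self, x, pb):
        return self.net(torch.cat([x, pb.expand(x.shape[0], -1)], dim=-1))

model = ToolTipModel()
for p in model.parameters():
    p.requires_grad_(False)                       # weights trained offline, frozen online

pb = torch.zeros(1, 4, requires_grad=True)        # parametric bias, adapted online
opt = torch.optim.Adam([pb], lr=1e-2)

def online_adapt(joint_states, observed_tooltip):
    """One online update step from recent sensor data (shapes are illustrative)."""
    pred = model(joint_states, pb)
    loss = torch.nn.functional.mse_loss(pred, observed_tooltip)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random tensors standing in for real measurements.
print(online_adapt(torch.randn(8, 14), torch.randn(8, 3)))
```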


PRP Rebooted: Advancing the State of the Art in FOND Planning

Muise, Christian, McIlraith, Sheila A., Beck, J. Christopher

arXiv.org Artificial Intelligence

Fully Observable Non-Deterministic (FOND) planning is a variant of classical symbolic planning in which actions are nondeterministic, with an action's outcome known only upon execution. It is a popular planning paradigm with applications ranging from robot planning to dialogue-agent design and reactive synthesis. Over the last 20 years, a number of approaches to FOND planning have emerged. In this work, we establish a new state of the art, following in the footsteps of some of the most powerful FOND planners to date. Our planner, PR2, decisively outperforms the four leading FOND planners, at times by a large margin, in 17 of 18 domains that represent a comprehensive benchmark suite. Ablation studies demonstrate the impact of various techniques we introduce, with the largest improvement coming from our novel FOND-aware heuristic.
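The FOND setting itself is easy to illustrate: an action has several possible outcomes, the executor only learns which one occurred after acting, and a solution is a policy mapping states to actions. The toy below shows that structure only; it is not the PR2 planner, and the action names are made up.

```python
# Toy FOND-style execution: nondeterministic outcomes, policy as state -> action.
import random

# Nondeterministic model: (state, action) -> possible successor states.
OUTCOMES = {
    ("at_A", "move_A_B"): ["at_B", "at_A"],       # the move may fail and leave us at A
    ("at_B", "move_B_goal"): ["at_goal", "at_B"],
}

# A strong-cyclic policy: keep retrying until the intended outcome occurs.
POLICY = {"at_A": "move_A_B", "at_B": "move_B_goal"}

state = "at_A"
while state != "at_goal":
    action = POLICY[state]
    state = random.choice(OUTCOMES[(state, action)])  # outcome revealed only at execution
    print(action, "->", state)
```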


N$^2$M$^2$: Learning Navigation for Arbitrary Mobile Manipulation Motions in Unseen and Dynamic Environments

Honerkamp, Daniel, Welschehold, Tim, Valada, Abhinav

arXiv.org Artificial Intelligence

Despite its importance in both industrial and service robotics, mobile manipulation remains a significant challenge as it requires seamless integration of end-effector trajectory generation with navigation skills as well as reasoning over long horizons. Existing methods struggle to control the large configuration space and to navigate dynamic and unknown environments. In previous work, we proposed to decompose mobile manipulation tasks into a simplified motion generator for the end-effector in task space and a trained reinforcement learning agent for the mobile base to account for kinematic feasibility of the motion. In this work, we introduce Neural Navigation for Mobile Manipulation (N$^2$M$^2$), which extends this decomposition to complex obstacle environments and enables it to tackle a broad range of tasks in real-world settings. The resulting approach can perform unseen, long-horizon tasks in unexplored environments while instantly reacting to dynamic obstacles and environmental changes. At the same time, it provides a simple way to define new mobile manipulation tasks. We demonstrate the capabilities of our proposed approach in extensive simulation and real-world experiments on multiple kinematically diverse mobile manipulators. Code and videos are publicly available at http://mobile-rl.cs.uni-freiburg.de.
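A hedged sketch of the decomposition described above: a simple end-effector motion generator proposes the next gripper pose in task space, while a learned base policy moves the base so that pose stays reachable. The names and the trivial base policy below are hypothetical stand-ins, not the released code at mobile-rl.cs.uni-freiburg.de.

```python
# Hypothetical sketch of the task-space EE generator + learned base policy split.
import numpy as np

def ee_motion_generator(ee_pose, ee_goal, step=0.02):
    """Straight-line interpolation toward the end-effector goal (task space)."""
    direction = ee_goal - ee_pose
    dist = np.linalg.norm(direction)
    return ee_goal if dist < step else ee_pose + step * direction / dist

def base_policy(obs):
    """Stand-in for the trained RL agent; returns a (vx, vy, yaw_rate) command."""
    return np.clip(obs[:3] * 0.5, -0.2, 0.2)

ee_pose = np.zeros(3)
ee_goal = np.array([2.0, 0.5, 0.8])
base_pose = np.zeros(3)

for t in range(500):
    ee_pose = ee_motion_generator(ee_pose, ee_goal)
    # Observation: relative end-effector target plus base state (simplified).
    obs = np.concatenate([ee_pose - base_pose, base_pose])
    base_pose = base_pose + 0.05 * base_policy(obs)   # keep the EE target reachable
    if np.linalg.norm(ee_pose - ee_goal) < 1e-3:
        break
```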


Is There a Future for Laundry-Folding Robots?

IEEE Spectrum Robotics

The promising thing about laundry-folding robots is that they target a job that everybody does frequently, and nobody really likes. But to be successful in robotics, especially in consumer robotics, you have to be both affordable and reliable, and robots are, still, generally awful at those things. Laundroid, a robotic system that could ingest wads of laundry and somehow spit out neatly folded clothes, put on a few demos at CES over the past few years, but the Japanese company behind it just announced bankruptcy--probably because the robot didn't work all the time, and would likely have been absurdly expensive. Laundroid may not have been a success, but does that mean that other laundry-folding robots, most notably Foldimate, are doomed as well? The original Laundroid concept was to combine washing clothes, drying clothes, ironing clothes, and folding clothes into one single (magical?)


Home Robot Control for People With Disabilities

IEEE Spectrum Robotics

Robots offer an opportunity to enable people to live safely and comfortably in their homes as they grow older. In the near future (we're all hoping), robots will be able to help us by cooking, cleaning, doing chores, and generally taking care of us, but they're not yet at the point where they can do those sorts of things autonomously. Putting a human in the loop can help robots be useful more quickly, which is especially important for the people who would benefit the most from this technology--specifically, folks with disabilities that make them more reliant on care. Ideally, the people who need things done would be the people in the loop telling the robot what to do, but that can be particularly challenging for those with disabilities that limit how mobile they are. If you can't move your arms or hands, for example, how are you going to control a robot?


Robot Learns to Sort and Organize After Watching a Human Do It Only Once

#artificialintelligence

Having a robotic butler hand you a steaming cup of coffee and the newspaper in the morning is something science fiction has made us yearn for and modern robotics has brought into the realm of possibility. Yet roboticists are still having trouble teaching machines how to complete tasks that even children are capable of. That's why two researchers at the University of California, Berkeley have begun teaching a robot as if it were a five-year-old in the hopes of turning it into the taskmaster robot of the silver screen. "We're teaching robots how to pull off sorting and organizational tasks by simply watching a human do them once," Tianhe Yu, co-author of the study, tells Inverse. "Today's robots are able to perform a few specific tasks well, but they still don't come close to what a human is capable of. We hope that by teaching robots through demonstration we can enable them to carry out more general tasks."


The Importance of Teaching Robots to Hug

IEEE Spectrum Robotics

Hugs make us feel warm and safe and comforted and loved. If we need a hug and another human isn't available, we can sometimes get a little bit of satisfaction from hugging inanimate objects like stuffed animals, but it seems like robots (that can hypothetically hug us back) might be somewhat more fulfilling. While we've seen robots that are actively huggable before, and even a few that can hug you back, it's not clear exactly how a robot hug compares to a human hug, and whether hugging a robot can confer any of the benefits that we get from hugging people. At the ACM/IEEE International Conference on Human Robot Interaction (HRI) earlier this year, Alexis E. Block and Katherine J. Kuchenbecker from the Haptic Intelligence Department at the Max Planck Institute for Intelligent Systems in Stuttgart, Germany, presented a paper on "Emotionally Supporting Humans Through Robot Hugs." Their work explores how robots can be more effectively designed and taught to give the kinds of hugs that humans will love.