Learning to Design Soft Hands using Reward Models

Xueqian Bai, Nicklas Hansen, Adabhav Singh, Michael T. Tolley, Yan Duan, Pieter Abbeel, Xiaolong Wang, Sha Yi

arXiv.org Artificial Intelligence

Amazon FAR (Frontier AI & Robotics)

Figure 1: We present a Cross-Entropy Method with Reward Model (CEM-RM) framework that optimizes block-wise, finger-wise, and tendon-routing design distributions of a soft robotic hand using pre-collected teleoperation data. Hardware experiments demonstrate that CEM-RM achieves effective design optimization with significantly fewer samples than pure optimization, enabling robust grasping of challenging objects.

Abstract -- Soft robotic hands promise compliant and safe interaction with objects and environments. However, designing soft hands that are both compliant and functional across diverse use cases remains challenging. Although co-design of hardware and control better couples morphology to behavior [1], the resulting search space is high-dimensional, and even simulation-based evaluation is computationally expensive. In this paper, we propose a Cross-Entropy Method with Reward Model (CEM-RM) framework that efficiently optimizes tendon-driven soft robotic hands based on a teleoperation control policy, reducing design evaluations by more than half compared to pure optimization while learning a distribution of optimized hand designs from pre-collected teleoperation data. We derive a design space for a soft robotic hand composed of flexural soft fingers and implement parallelized training in simulation.
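The abstract describes combining the Cross-Entropy Method with a learned reward model to cut down expensive simulation evaluations. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's implementation: the reward model is a toy surrogate standing in for one trained on teleoperation data, and `evaluate_in_sim` stands in for an expensive simulated policy rollout. All function names and parameters are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward_model(designs):
    # Hypothetical stand-in for a learned reward model trained on
    # pre-collected teleoperation data; here a toy quadratic surrogate
    # peaked at a fictitious "good" design vector.
    target = np.full(designs.shape[1], 0.7)
    return -np.sum((designs - target) ** 2, axis=1)

def evaluate_in_sim(designs):
    # Hypothetical stand-in for an expensive simulation rollout of the
    # control policy; here the same toy objective plus noise.
    return reward_model(designs) + 0.01 * rng.standard_normal(len(designs))

def cem_rm(dim=8, iters=20, pop=64, sim_budget=16, elite_frac=0.25):
    """CEM loop where a reward model pre-screens the population so only
    the top `sim_budget` candidates pay for simulation."""
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, size=(pop, dim))
        # Cheap ranking of the full population by the reward model ...
        keep = np.argsort(reward_model(samples))[-sim_budget:]
        screened = samples[keep]
        # ... and only the screened subset is simulated.
        scores = evaluate_in_sim(screened)
        n_elite = max(1, int(elite_frac * sim_budget))
        elites = screened[np.argsort(scores)[-n_elite:]]
        # Refit the sampling distribution to the elite designs.
        mu = elites.mean(axis=0)
        sigma = elites.std(axis=0) + 1e-6
    return mu
```

Under this toy objective, the loop concentrates the design distribution near the surrogate's optimum while simulating only a quarter of each population, which is the sample-efficiency argument the abstract makes.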