Proprioception


Underactuated Robotic Hand with Grasp State Estimation Using Tendon-Based Proprioception

Lee, Jae-Hyun, Park, Jonghoo, Cho, Kyu-Jin

arXiv.org Artificial Intelligence

Abstract--Anthropomorphic underactuated hands are valued for their structural simplicity and inherent adaptability. However, the uncertainty arising from interdependent joint motions makes it challenging to capture various grasp states during hand-object interaction without increasing structural complexity through multiple embedded sensors. This motivates the need for an approach that can extract rich grasp-state information from a single sensing source while preserving the simplicity of underactuation. This study proposes an anthropomorphic underactuated hand that achieves comprehensive grasp state estimation using only tendon-based proprioception provided by series elastic actuators (SEAs). Our approach is enabled by the design of a compact SEA with high accuracy and reliability that can be seamlessly integrated into sensorless fingers. By coupling accurate proprioceptive measurements with potential energy-based modeling, the system estimates multiple key grasp state variables, including contact timing, joint angles, relative object stiffness, and external disturbances. Finger-level experimental validations and extensive hand-level grasp functionality demonstrations confirmed the effectiveness of the proposed approach. Anthropomorphic robotic hands have been widely adopted to replicate the functionality of the human hand. Among various actuation strategies, underactuated hands are extensively employed due to their structural simplicity and adaptability to diverse object geometries [1], [2].
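The core sensing idea here, tendon tension recovered from the deflection of a series elastic element, can be sketched in a few lines. This is a minimal illustration assuming a linear spring and a single tendon pulley; the function names, constants, and the threshold-based contact test are hypothetical, not taken from the paper.

```python
def spring_deflection(theta_motor, theta_output, r_pulley):
    # Linear tendon displacement stored in the series spring:
    # the difference between motor and output pulley angles, scaled
    # by the pulley radius (assumed linear spring, single pulley).
    return r_pulley * (theta_motor - theta_output)

def tendon_tension(theta_motor, theta_output, k_spring=2000.0, r_pulley=0.01):
    # Hooke's law: tension is proportional to spring deflection.
    return k_spring * spring_deflection(theta_motor, theta_output, r_pulley)

def detect_contact(tensions, free_motion_tensions, threshold=0.5):
    # Contact timing inferred when the measured tension departs from
    # the free-motion (no-object) tension profile by more than a threshold.
    for i, (t, t_free) in enumerate(zip(tensions, free_motion_tensions)):
        if abs(t - t_free) > threshold:
            return i
    return None
```

In this toy version, two encoder readings per actuator yield a tension estimate, and a departure from the expected free-motion profile marks the contact instant; the paper's full method additionally uses potential energy-based modeling to recover joint angles, object stiffness, and disturbances from the same signal.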


Beyond Egocentric Limits: Multi-View Depth-Based Learning for Robust Quadrupedal Locomotion

Rahem, Rémy, Suleiman, Wael

arXiv.org Artificial Intelligence

Recent progress in legged locomotion has enabled highly dynamic, parkour-like behaviors for robots, similar to their biological counterparts. Yet, these methods mostly rely on egocentric (first-person) perception, limiting their performance, especially when the viewpoint of the robot is occluded. A promising solution is to enhance the robot's environmental awareness with complementary viewpoints, such as multiple actors exchanging perceptual information. Inspired by this idea, this work proposes a multi-view depth-based locomotion framework that combines egocentric and exocentric observations to provide richer environmental context during agile locomotion. Using a teacher-student distillation approach, the student policy learns to fuse proprioception with dual depth streams while remaining robust to real-world sensing imperfections. To further improve robustness, we introduce extensive domain randomization, including stochastic remote-camera dropouts and 3D positional perturbations that emulate aerial-ground cooperative sensing. Simulation results show that multi-viewpoint policies outperform a single-viewpoint baseline in gap crossing, step descent, and other dynamic maneuvers, while maintaining stability when the exocentric camera is partially or completely unavailable. Additional experiments show that moderate viewpoint misalignment is well tolerated when incorporated during training. This study demonstrates that heterogeneous visual feedback improves robustness and agility in quadrupedal locomotion. Furthermore, to support reproducibility, the implementation accompanying this work is publicly available at https://anonymous.4open.science/r/multiview-parkour-6FB8
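The two randomizations named in the abstract, stochastic remote-camera dropout and 3D positional perturbation, follow a common pattern that can be sketched without any simulator. This is an illustrative, dependency-free version; the function name, probability, and noise magnitude are assumptions, not values from the paper.

```python
import random

def randomize_exocentric(depth_image, p_dropout=0.2, pos_noise=0.05, rng=None):
    """Domain randomization for a remote (exocentric) depth stream.

    With probability p_dropout the stream is zeroed out, emulating a
    lost camera; otherwise the camera position is perturbed by uniform
    3D noise, emulating imperfect aerial-ground pose alignment.
    Returns (image, position_offset, dropped_flag).
    """
    rng = rng or random.Random()
    if rng.random() < p_dropout:
        # Camera loss: the policy must fall back on egocentric sensing.
        dropped = [[0.0] * len(row) for row in depth_image]
        return dropped, (0.0, 0.0, 0.0), True
    # Camera alive: jitter its assumed 3D position within +/- pos_noise.
    offset = tuple(rng.uniform(-pos_noise, pos_noise) for _ in range(3))
    return depth_image, offset, False
```

Training against such perturbations is what lets the distilled student keep walking when the exocentric view degrades at deployment time, as the abstract's robustness results suggest.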


Semantic Glitch: Agency and Artistry in an Autonomous Pixel Cloud

Zhang, Qing, Huang, Jing, Xu, Mingyang, Rekimoto, Jun

arXiv.org Artificial Intelligence

While mainstream robotics pursues metric precision and flawless performance, this paper explores the creative potential of a deliberately "lo-fi" approach. We present the "Semantic Glitch," a soft flying robotic art installation whose physical form, a 3D pixel-style cloud, is a "physical glitch" derived from digital archaeology. We detail a novel autonomous pipeline that rejects conventional sensors like LiDAR and SLAM, relying solely on the qualitative, semantic understanding of a Multimodal Large Language Model to navigate. By authoring a bio-inspired personality for the robot through a natural language prompt, we create a "narrative mind" that complements the "weak," historically loaded body. Our analysis begins with a 13-minute autonomous flight log, and a follow-up study statistically validates the framework's robustness for authoring quantifiably distinct personas. The combined analysis reveals emergent behaviors, from landmark-based navigation to a compelling "plan to execution" gap, and a character whose unpredictable yet plausible behavior stems from a lack of precise proprioception. This demonstrates a lo-fi framework for creating imperfect companions whose success is measured in character over efficiency.


Tailored robotic training improves hand function and proprioceptive processing in stroke survivors with proprioceptive deficits: A randomized controlled trial

Farrens, Andria J., Garcia-Fernandez, Luis, Rojas, Raymond Diaz, Estrada, Jillian Obeso, Reinsdorf, Dylan, Chan, Vicky, Gupta, Disha, Perry, Joel, Wolbrecht, Eric, Do, An, Cramer, Steven C., Reinkensmeyer, David J.

arXiv.org Artificial Intelligence

Precision rehabilitation aims to tailor movement training to improve outcomes. We tested whether proprioceptively-tailored robotic training improves hand function and neural processing in stroke survivors. Using a robotic finger exoskeleton, we tested two proprioceptively-tailored approaches: Propriopixel Training, which uses robot-facilitated, gamified movements to enhance proprioceptive processing, and Virtual Assistance Training, which reduces robotic aid to increase reliance on self-generated feedback. In a randomized controlled trial, forty-six chronic stroke survivors completed nine 2-hour sessions of Standard, Propriopixel, or Virtual training. Among participants with proprioceptive deficits, Propriopixel (Box and Block Test: 7 +/- 4.2 blocks, p=0.002) and Virtual Assistance (4.5 +/- 4.4 blocks, p=0.068) training yielded greater gains in hand function than Standard training (0.8 +/- 2.3 blocks). Proprioceptive gains correlated with improvements in hand function. Tailored training enhanced neural sensitivity to proprioceptive cues, evidenced by a novel EEG biomarker, the proprioceptive Contingent Negative Variation. These findings support proprioceptively-tailored training as a pathway to precision neurorehabilitation.


An Enhanced Proprioceptive Method for Soft Robots Integrating Bend Sensors and IMUs

Han, Dong Heon, Mehta, Mayank, Zuo, Runze, Wanger, Zachary, Bruder, Daniel

arXiv.org Artificial Intelligence

Abstract--This study presents an enhanced proprioceptive method for accurate shape estimation of soft robots using only off-the-shelf sensors, ensuring cost-effectiveness and easy applicability. By integrating inertial measurement units (IMUs) with complementary bend sensors, IMU drift is mitigated, enabling reliable long-term proprioception. A piecewise constant curvature model estimates the tip location from the fused orientation data and reconstructs the robot's deformation. Experiments under no load, external forces, and passive obstacle interactions during 45 minutes of continuous operation showed a root mean square error of 16.96 mm (2.91% of total length), a 56% reduction compared to IMU-only benchmarks. These results demonstrate that our approach not only enables long-duration proprioception in soft robots but also maintains high accuracy and robustness across these diverse conditions. Soft robots possess intrinsic compliance and virtually infinite degrees of freedom, enabling continuous deformation [1].
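The piecewise constant curvature (PCC) model mentioned above has a compact closed form: each segment is a circular arc, so the tip is found by chaining per-segment arc displacements while accumulating orientation. Below is a minimal planar sketch for intuition; the real system works in 3D with fused IMU and bend-sensor orientations, and the function name and interface here are illustrative only.

```python
import math

def pcc_tip_position(segment_lengths, bend_angles):
    """Planar piecewise-constant-curvature forward kinematics.

    Each segment of length L bends along a constant-curvature arc by
    angle phi (which sensor fusion would supply). Returns tip (x, y)
    in the base frame.
    """
    x = y = heading = 0.0
    for L, phi in zip(segment_lengths, bend_angles):
        if abs(phi) < 1e-9:
            dx, dy = L, 0.0                       # straight segment limit
        else:
            r = L / phi                           # arc radius = L / bend angle
            dx = r * math.sin(phi)                # displacement in segment frame
            dy = r * (1.0 - math.cos(phi))
        # Rotate the segment-frame displacement into the base frame.
        x += dx * math.cos(heading) - dy * math.sin(heading)
        y += dx * math.sin(heading) + dy * math.cos(heading)
        heading += phi                            # orientation accumulates along the arc
    return x, y
```

For example, a single unit-length segment bent by pi forms a half circle, placing the tip at (0, 2/pi); with zero bend the tip is at (1, 0). Chaining several such segments reconstructs the whole backbone, which is what the reported tip RMSE is measured against.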


Curiosity-Driven Co-Development of Action and Language in Robots Through Self-Exploration

Tinker, Theodore Jerome, Doya, Kenji, Tani, Jun

arXiv.org Machine Learning

A central question in both cognitive science and artificial intelligence is how humans and artificial systems can acquire competencies for language and motor command in a co-developmental manner, despite having access to only limited learning experiences. This question is exemplified in human infants, who achieve remarkable generalization with sparse input. This stands in stark contrast to large-scale models, which rely on massive training corpora to reach similar capabilities. This raises the issue of what mechanisms enable such efficient developmental learning. From the perspective of developmental psychology, infants acquire language through rich interaction with their embodied environments. Tomasello's "verb-island" hypothesis argues that children initially learn verbs in specific, isolated contexts before generalizing across broader linguistic structures (1). He also emphasized the importance of embodiment in language acquisition, suggesting that grounding linguistic symbols in sensorimotor experiences is fundamental to language learning (2).


KiVi: Kinesthetic-Visuospatial Integration for Dynamic and Safe Egocentric Legged Locomotion

Li, Peizhuo, Li, Hongyi, Ma, Yuxuan, Chang, Linnan, Yang, Xinrong, Yu, Ruiqi, Zhang, Yifeng, Cao, Yuhong, Zhu, Qiuguo, Sartoretti, Guillaume

arXiv.org Artificial Intelligence

Abstract--Vision-based locomotion has shown great promise in enabling legged robots to perceive and adapt to complex environments. However, visual information is inherently fragile, being vulnerable to occlusions, reflections, and lighting changes, which often cause instability in locomotion. Inspired by animal sensorimotor integration, we propose KiVi, a Kinesthetic-Visuospatial integration framework, where kinesthetics encodes proprioceptive sensing of body motion and visuospatial reasoning captures visual perception of surrounding terrain. This modality-balanced yet integrative design, combined with memory-enhanced attention, allows the robot to robustly interpret visual cues while maintaining fallback stability through proprioception. Extensive experiments show that our method enables quadruped robots to stably traverse diverse terrains and operate reliably in unstructured outdoor environments, remaining robust to out-of-distribution (OOD) visual noise and occlusion unseen during training, thereby highlighting its effectiveness and applicability to real-world legged locomotion. Kinesthetic sense and visuospatial perception constitute two fundamental modalities that allow legged animals to achieve effective locomotion.


QuadKAN: KAN-Enhanced Quadruped Motion Control via End-to-End Reinforcement Learning

Wang, Yinuo, Tao, Gavin

arXiv.org Artificial Intelligence

Legged robots offer mobility where wheeled platforms fail, such as stairs, rubble, soft substrates, and cluttered indoor-outdoor settings, enabling applications in inspection, search and rescue, agriculture, and planetary exploration [1]. Robust locomotion control is therefore a foundational capability for practical quadrupedal systems, underpinning safe navigation and dependable operation across diverse terrains and disturbances [2]. Deep reinforcement learning (DRL) has emerged as a compelling paradigm for such control because it optimizes closed-loop policies through interaction and can produce adaptive behaviors [3]. A substantial body of prior work has focused on training blind controllers that rely exclusively on proprioceptive inputs such as inertial measurement units (IMUs) and joint feedback [4]. While these blind policies can traverse uneven and unknown terrains through large-scale simulation and domain randomization, they inherently lack foresight: without exteroceptive input, they respond only upon contact and struggle to proactively avoid obstacles or plan foot placement on irregular ground. Vision complements proprioception by providing anticipatory geometric information, enabling early detection of distant obstacles and terrain changes [5]. As a result, cross-modal policies that integrate proprioception with depth imaging have gained prominence, facilitating safer and more efficient locomotion through earlier trajectory adjustments. Most existing cross-modal pipelines adopt multilayer perceptrons (MLPs) for the proprioceptive encoder and for the decision head that fuses proprioception with vision.
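The cross-modal pipeline described in the last sentence, separate MLP encoders per modality feeding a shared decision head, reduces to a simple compute graph. The sketch below is dependency-free and purely illustrative: the layer sizes, weight initialization, and function names are assumptions, not details from any of the cited systems.

```python
import math
import random

def mlp(x, layers):
    # Forward pass through an MLP: each layer is (W, b) with tanh on
    # hidden layers and a linear final layer.
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bi
             for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

def make_layers(sizes, rng):
    # Random small-weight initialization for consecutive layer sizes.
    return [([[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(n)],
             [0.0] * n)
            for m, n in zip(sizes, sizes[1:])]

def cross_modal_policy(proprio, depth_feat, enc_p, enc_v, head):
    # Encode each modality with its own MLP, concatenate the latent
    # features, then decode joint-space actions with the fusion head.
    z = mlp(proprio, enc_p) + mlp(depth_feat, enc_v)  # list concat = fusion
    return mlp(z, head)
```

A KAN-enhanced variant, as the paper's title suggests, would replace these fixed tanh MLP blocks with learnable univariate activation functions while keeping the same encode-concatenate-decode structure.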


Towards Biosignals-Free Autonomous Prosthetic Hand Control via Imitation Learning

Shi, Kaijie, Lu, Wanglong, Zhao, Hanli, da Fonseca, Vinicius Prado, Zou, Ting, Jiang, Xianta

arXiv.org Artificial Intelligence

Abstract--Limb loss affects millions globally, impairing physical function and reducing quality of life. Most traditional surface electromyographic (sEMG) and semi-autonomous methods require users to generate myoelectric signals for each control action, imposing physically and mentally taxing demands. This study aims to develop a fully autonomous control system that enables a prosthetic hand to automatically grasp and release objects of various shapes using only a camera attached to the wrist. By placing the hand near an object, the system automatically executes grasping actions with a proper grip force in response to the hand's movements and the environment. To release a grasped object, the user simply places it close to the table and the system automatically opens the hand. Such a system would provide individuals with limb loss with an easy-to-use prosthetic control interface and greatly reduce mental effort during use. To achieve this goal, we developed a teleoperation system to collect human demonstration data for training the prosthetic hand control model via imitation learning, which mimics prosthetic hand actions demonstrated by humans. By training the model on data from only a few objects collected from a single participant, we show that the imitation learning algorithm achieves high success rates, generalizing to more individuals and unseen objects.
This work has been submitted to the IEEE for possible publication. This work was supported in part by the Government of Canada's New Frontiers in Research Fund (NFRF, Grant No. NFRFE-2022-00407) and the Natural Sciences and Engineering Research Council of Canada's Research Tools and Instruments program (NSERC RTI, Grant No. RTI-2022-00688). This work involved human subjects or animals in its research. Approval of all ethical and experimental procedures and protocols was granted by the Memorial University Interdisciplinary Committee on Ethics in Human Research (20210316-SC).
Kaijie Shi, Wanglong Lu are with Department of Computer Science, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada, and also with College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325000, China. Hanli Zhao is with College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325000, China. Vinicius Prado da Fonseca is with Department of Computer Science, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada. Ting Zou is with Department of Mechanical and Mechatronics Engineering, Memorial University of Newfoundland, St. John's, NL A1B 3X5, Canada.


RAPID Hand: A Robust, Affordable, Perception-Integrated, Dexterous Manipulation Platform for Generalist Robot Autonomy

Wan, Zhaoliang, Bi, Zetong, Zhou, Zida, Ren, Hao, Zeng, Yiming, Li, Yihan, Qi, Lu, Yang, Xu, Yang, Ming-Hsuan, Cheng, Hui

arXiv.org Artificial Intelligence

This paper addresses the scarcity of low-cost but high-dexterity platforms for collecting real-world multi-fingered robot manipulation data towards generalist robot autonomy. To achieve this, we propose the RAPID Hand, a co-optimized hardware and software platform where the compact 20-DoF hand, robust whole-hand perception, and high-DoF teleoperation interface are jointly designed. Specifically, RAPID Hand adopts a compact and practical hand ontology and a hardware-level perception framework that stably integrates wrist-mounted vision, fingertip tactile sensing, and proprioception with sub-7 ms latency and spatial alignment. Collecting high-quality demonstrations on high-DoF hands is challenging, as existing teleoperation methods struggle with precision and stability on complex multi-fingered systems. We address this by co-optimizing hand design, perception integration, and teleoperation interface through a universal actuation scheme, custom perception electronics, and two retargeting constraints. We evaluate the platform's hardware, perception, and teleoperation interface. Training a diffusion policy on the collected data shows superior performance over prior works, validating the system's capability for reliable, high-quality data collection. The platform is constructed from low-cost, off-the-shelf components and will be made public to ensure reproducibility and ease of adoption.