Training-Free Robot Pose Estimation using Off-the-Shelf Foundational Models

Liang, Laurence

arXiv.org Artificial Intelligence

Pose estimation of a robot arm from visual inputs is a challenging task. However, with the increasing adoption of robot arms in both industrial and residential settings, reliable joint angle estimation can offer improved safety and performance guarantees and can also serve as a verifier for further training robot policies. This paper introduces the use of frontier vision-language models (VLMs) as an "off-the-shelf" tool to estimate a robot arm's joint angles from a single target image. By evaluating frontier VLMs on both synthetic and real-world image-data pairs, this paper establishes a performance baseline attained by current VLMs. In addition, it presents empirical results suggesting that neither test-time scaling nor parameter scaling alone leads to improved joint angle predictions.
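One practical detail when scoring joint-angle predictions against ground truth is that angles wrap around: a naive absolute difference would score a prediction of 350° against a ground truth of 10° as 340° off. A minimal sketch of a wrap-aware error metric (the function name and degree convention are illustrative, not taken from the paper):

```python
import numpy as np

def joint_angle_error(pred_deg, true_deg):
    """Mean absolute joint-angle error in degrees, with differences
    wrapped into [-180, 180) so that 350 vs. 10 counts as 20, not 340."""
    diff = (np.asarray(pred_deg) - np.asarray(true_deg) + 180.0) % 360.0 - 180.0
    return float(np.mean(np.abs(diff)))

# 350 vs. 10 wraps to a 20-degree error; 90 vs. 100 is a 10-degree error.
err = joint_angle_error([350.0, 90.0], [10.0, 100.0])
```

Averaging the wrapped per-joint errors gives a single scalar that can be compared across VLMs and across synthetic versus real-world test sets.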


REWW-ARM -- Remote Wire-Driven Mobile Robot: Design, Control, and Experimental Validation

Hattori, Takahiro, Kawaharazuka, Kento, Suzuki, Temma, Yoneda, Keita, Okada, Kei

arXiv.org Artificial Intelligence

Electronic devices are essential for robots but limit their usable environments. To overcome this, methods excluding electronics from the operating environment while retaining advanced electronic control and actuation have been explored. These include the remote hydraulic drive of electronics-free mobile robots, which offers high reachability, and long wire-driven robot arms with motors consolidated at the base, which offer high environmental resistance. To combine the advantages of both, this study proposes a new system, "Remote Wire Drive." As a proof of concept, we designed and developed the Remote Wire-Driven robot "REWW-ARM", which consists of the following components: 1) a novel power transmission mechanism, the "Remote Wire Transmission Mechanism" (RWTM), the key technology of the Remote Wire Drive; 2) an electronics-free distal mobile robot driven by it; and 3) a motor unit that generates power and provides electronic closed-loop control based on state estimation via the RWTM. In this study, we evaluated the mechanical and control performance of REWW-ARM through several experiments, demonstrating its capability for locomotion, posture control, and object manipulation both on land and underwater. This suggests the potential of applying the Remote Wire-Driven system to various types of robots, thereby expanding their operational range.
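Closed-loop control via the RWTM implies estimating the distal joint state from quantities measured at the motor unit. As a toy illustration only (the abstract does not give the actual RWTM model), one could correct the motor-side displacement for elastic wire stretch before converting it to a joint angle; the linear-stiffness assumption and all names here are hypothetical:

```python
def estimate_joint_angle(motor_disp_m, tension_N,
                         wire_stiffness_N_per_m, pulley_radius_m):
    """Estimate the distal joint angle (radians) from motor-side measurements.

    Wire stretch under load absorbs part of the displacement commanded at
    the motor, so it is subtracted before converting to a joint angle via
    the pulley radius. Assumes a linear-elastic wire and an ideal pulley.
    """
    stretch = tension_N / wire_stiffness_N_per_m
    return (motor_disp_m - stretch) / pulley_radius_m
```

Under these assumptions, ignoring the stretch term would systematically overestimate the distal joint angle whenever the wire is under tension, which is one reason state estimation through the transmission matters for closed-loop control.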


MIMIC-MJX: Neuromechanical Emulation of Animal Behavior

Zhang, Charles Y., Yang, Yuanjia, Sirbu, Aidan, Abe, Elliott T. T., Wärnberg, Emil, Leonardis, Eric J., Aldarondo, Diego E., Lee, Adam, Prasad, Aaditya, Foat, Jason, Bian, Kaiwen, Park, Joshua, Bhatt, Rusham, Saunders, Hutton, Nagamori, Akira, Thanawalla, Ayesha R., Huang, Kee Wui, Plum, Fabian, Beck, Hendrik K., Flavell, Steven W., Labonte, David, Richards, Blake A., Brunton, Bingni W., Azim, Eiman, Ölveczky, Bence P., Pereira, Talmo D.

arXiv.org Artificial Intelligence

The primary output of the nervous system is movement and behavior. While recent advances have democratized pose tracking during complex behavior, kinematic trajectories alone provide only indirect access to the underlying control processes. Here we present MIMIC-MJX, a framework for learning biologically-plausible neural control policies from kinematics. MIMIC-MJX models the generative process of motor control by training neural controllers that learn to actuate biomechanically-realistic body models in physics simulation to reproduce real kinematic trajectories. We demonstrate that our implementation is accurate, fast, data-efficient, and generalizable to diverse animal body models. Policies trained with MIMIC-MJX can be utilized to both analyze neural control strategies and simulate behavioral experiments, illustrating its potential as an integrative modeling framework for neuroscience.
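Imitation frameworks of this kind commonly score a controller by how closely simulated joint positions track the reference kinematics at each timestep. A minimal sketch of one standard choice, an exponentiated tracking reward (the scale constant is an illustrative assumption, not MIMIC-MJX's actual reward):

```python
import numpy as np

def tracking_reward(qpos, qpos_ref, scale=5.0):
    """Exponentiated negative squared pose error: 1.0 for perfect
    tracking, decaying smoothly toward 0 as the error grows."""
    err = np.asarray(qpos) - np.asarray(qpos_ref)
    return float(np.exp(-scale * np.dot(err, err)))
```

The exponential form keeps the reward bounded in (0, 1], which tends to stabilize policy-gradient training compared to an unbounded negative squared error.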


Underactuated Robotic Hand with Grasp State Estimation Using Tendon-Based Proprioception

Lee, Jae-Hyun, Park, Jonghoo, Cho, Kyu-Jin

arXiv.org Artificial Intelligence

Anthropomorphic underactuated hands are valued for their structural simplicity and inherent adaptability. However, the uncertainty arising from interdependent joint motions makes it challenging to capture various grasp states during hand-object interaction without increasing structural complexity through multiple embedded sensors. This motivates the need for an approach that can extract rich grasp-state information from a single sensing source while preserving the simplicity of underactuation. This study proposes an anthropomorphic underactuated hand that achieves comprehensive grasp state estimation using only tendon-based proprioception provided by series elastic actuators (SEAs). Our approach is enabled by the design of a compact SEA with high accuracy and reliability that can be seamlessly integrated into sensorless fingers. By coupling accurate proprioceptive measurements with potential-energy-based modeling, the system estimates multiple key grasp state variables, including contact timing, joint angles, relative object stiffness, and external disturbances. Finger-level experimental validations and extensive hand-level grasp functionality demonstrations confirmed the effectiveness of the proposed approach. Anthropomorphic robotic hands have been widely adopted to replicate the functionality of the human hand. Among various actuation strategies, underactuated hands are extensively employed due to their structural simplicity and adaptability to diverse object geometries [1], [2].
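The core idea, inferring grasp state from a single tendon-side signal, can be illustrated with two tiny helpers: an SEA reports tendon tension through its spring deflection, and contact can be flagged when the measured tension departs from what a free-motion model predicts. This is only a sketch; the linear spring model, the threshold, and all names are illustrative, not the paper's actual formulation:

```python
def sea_tension(spring_deflection_m, spring_stiffness_N_per_m):
    """Tendon tension inferred from SEA spring deflection (Hooke's law)."""
    return spring_stiffness_N_per_m * spring_deflection_m

def detect_contact(measured_tension_N, free_motion_tension_N, threshold_N=0.5):
    """Flag contact when the tension residual (measured minus the
    free-motion prediction) exceeds a calibrated threshold."""
    residual = measured_tension_N - free_motion_tension_N
    return residual > threshold_N
```

The same residual signal, tracked over time, is the kind of quantity from which richer grasp-state variables such as contact timing and relative object stiffness could be derived.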


Massively Parallel Imitation Learning of Mouse Forelimb Musculoskeletal Reaching Dynamics

Leonardis, Eric, Nagamori, Akira, Thanawalla, Ayesha, Yang, Yuanjia, Park, Joshua, Saunders, Hutton, Azim, Eiman, Pereira, Talmo

arXiv.org Artificial Intelligence

The brain has evolved to effectively control the body, and to understand this relationship we need to model the sensorimotor transformations underlying embodied control. As part of a coordinated effort, we are developing a general-purpose platform for behavior-driven simulation that models high-fidelity behavioral dynamics, biomechanics, and the neural circuit architectures underlying embodied control. We present a pipeline that takes kinematics data from the neuroscience lab and recapitulates those natural movements in a biomechanical model. We implement an imitation learning framework to perform a dexterous forelimb reaching task with a musculoskeletal model in a simulated physics environment. The mouse arm model currently trains at faster than 1 million training steps per second due to GPU acceleration with JAX and MuJoCo-MJX. We present results indicating that adding naturalistic constraints on energy and velocity leads to simulated musculoskeletal activity that better predicts real EMG signals. This work provides evidence suggesting that energy and control constraints are critical to modeling musculoskeletal motor control.
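The naturalistic energy and velocity constraints can be folded into an imitation objective as simple quadratic penalties on muscle activation and joint velocity added to the tracking term. A minimal sketch under assumed weights (the weight values and function name are illustrative, not the paper's implementation):

```python
import numpy as np

def imitation_cost(pose_err, muscle_act, joint_vel,
                   w_energy=1e-3, w_vel=1e-4):
    """Tracking cost plus naturalistic regularizers: an energy term
    (squared muscle activations) and a velocity penalty."""
    track = float(np.dot(pose_err, pose_err))
    energy = w_energy * float(np.dot(muscle_act, muscle_act))
    vel = w_vel * float(np.dot(joint_vel, joint_vel))
    return track + energy + vel
```

The intuition matches the reported finding: penalizing wasteful activation and excessive velocity pushes the optimizer toward muscle activity patterns closer to those biology actually produces.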


Due to limited space, we couldn't answer all of the reviewers' clarification queries, but we promise to include them in the final version

Neural Information Processing Systems

We thank the reviewers for their feedback. Reviewers R1 and R3 suggested additional experiments; we report those results and address other concerns below. As shown in Supplementary Figure 1, the monolithic baseline works up to 4 limbs (i.e., 12 DOF) but fails to scale beyond that. Hence, each limb directly experiences only the torque it exerts on itself.




RoboTAG: End-to-end Robot Configuration Estimation via Topological Alignment Graph

Liu, Yifan, Zhan, Fangneng, Li, Wanhua, Sun, Haowen, Fragkiadaki, Katerina, Pfister, Hanspeter

arXiv.org Artificial Intelligence

Estimating robot pose from a monocular RGB image is a challenge in robotics and computer vision. Existing methods typically build networks on top of 2D visual backbones and depend heavily on labeled data for training, which is often scarce in real-world scenarios, causing a sim-to-real gap. Moreover, these approaches reduce the 3D-based problem to the 2D domain, neglecting 3D priors. To address these issues, we propose the Robot Topological Alignment Graph (RoboTAG), which incorporates a 3D branch to inject 3D priors while enabling co-evolution of the 2D and 3D representations, alleviating the reliance on labels. Specifically, RoboTAG consists of a 3D branch and a 2D branch, where nodes represent the states of the camera and robot system, and edges capture the dependencies between these variables or denote alignments between them. Closed loops are then defined in the graph, over which a consistency supervision across branches can be applied. This design allows us to utilize in-the-wild images as training data without annotations. Experimental results demonstrate that our method is effective across robot types, highlighting its potential to alleviate the data bottleneck in robotics.
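The cross-branch consistency idea can be illustrated by the simplest such loop: project predicted 3D keypoints through the camera intrinsics and penalize disagreement with the 2D branch's predictions. A minimal pinhole-projection sketch in NumPy (the intrinsics matrix `K` and function name are assumptions for illustration; RoboTAG's actual graph loops and losses are richer than this):

```python
import numpy as np

def reprojection_loss(points_3d, points_2d, K):
    """Mean L2 error between 3D keypoints projected through a pinhole
    camera with intrinsics K and the corresponding 2D predictions.

    points_3d: (N, 3) points in the camera frame.
    points_2d: (N, 2) predicted pixel coordinates.
    K: (3, 3) camera intrinsics matrix.
    """
    proj = (K @ points_3d.T).T            # (N, 3) homogeneous image points
    proj = proj[:, :2] / proj[:, 2:3]     # perspective divide
    return float(np.mean(np.linalg.norm(proj - points_2d, axis=1)))
```

Because this loss only requires the two branches to agree with each other, it can in principle be evaluated on unannotated in-the-wild images, which is the mechanism the abstract credits for alleviating the labeled-data bottleneck.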