Goto

Collaborating Authors

 aabb


Vision-Guided Grasp Planning for Prosthetic Hands in Unstructured Environments

arXiv.org Artificial Intelligence

Recent advancements in prosthetic technology have increasingly focused on enhancing dexterity and autonomy through intelligent control systems. Vision-based approaches offer promising results for enabling prosthetic hands to interact more naturally with diverse objects in dynamic environments. Building on this foundation, the paper presents a vision-guided grasping algorithm for a prosthetic hand, integrating perception, planning, and control for dexterous manipulation. A camera mounted on the set up captures the scene, and a Bounding Volume Hierarchy (BVH)-based vision algorithm is employed to segment an object for grasping and define its bounding box. Grasp contact points are then computed by generating candidate trajectories using Rapidly-exploring Random Tree Star algorithm, and selecting fingertip end poses based on the minimum Euclidean distance between these trajectories and the objects point cloud. Each finger grasp pose is determined independently, enabling adaptive, object-specific configurations. Damped Least Square (DLS) based Inverse kinematics solver is used to compute the corresponding joint angles, which are subsequently transmitted to the finger actuators for execution. This modular pipeline enables per-finger grasp planning and supports real-time adaptability in unstructured environments. The proposed method is validated in simulation, and experimental integration on a Linker Hand O7 platform.


RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

arXiv.org Artificial Intelligence

We present RoboGen, a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation. RoboGen leverages the latest advancements in foundation and generative models. Instead of directly using or adapting these models to produce policies or low-level actions, we advocate for a generative scheme, which uses these models to automatically generate diversified tasks, scenes, and training supervisions, thereby scaling up robotic skill learning with minimal human supervision. Our approach equips a robotic agent with a self-guided propose-generate-learn cycle: the agent first proposes interesting tasks and skills to develop, and then generates corresponding simulation environments by populating pertinent objects and assets with proper spatial configurations. Afterwards, the agent decomposes the proposed high-level task into sub-tasks, selects the optimal learning approach (reinforcement learning, motion planning, or trajectory optimization), generates required training supervision, and then learns policies to acquire the proposed skill. Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer them to the field of robotics. Our fully generative pipeline can be queried repeatedly, producing an endless stream of skill demonstrations associated with diverse tasks and environments.


Modular Procedural Generation for Voxel Maps

arXiv.org Artificial Intelligence

Task environments developed in Minecraft are becoming increasingly popular for artificial intelligence (AI) research. However, most of these are currently constructed manually, thus failing to take advantage of procedural content generation (PCG), a capability unique to virtual task environments. In this paper, we present mcg, an open-source library to facilitate implementing PCG algorithms for voxel-based environments such as Minecraft. The library is designed with human-machine teaming research in mind, and thus takes a 'top-down' approach to generation, simultaneously generating low and high level machine-readable representations that are suitable for empirical research. These can be consumed by downstream AI applications that consider human spatial cognition. The benefits of this approach include rapid, scalable, and efficient development of virtual environments, the ability to control the statistics of the environment at a semantic level, and the ability to generate novel environments in response to player actions in real time.