HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning

Jing, Zhi, Yang, Siyuan, Ao, Jicong, Xiao, Ting, Jiang, Yu-Gang, Bai, Chenjia

arXiv.org Artificial Intelligence

Existing robotics datasets and simulation benchmarks for manipulation predominantly cater to robot-arm platforms. However, for humanoid robots equipped with dual arms and dexterous hands, simulation tasks and high-quality demonstrations are notably lacking. Bimanual dexterous manipulation is inherently more complex, as it requires coordinated arm movements and hand operations, making autonomous data collection challenging. This paper presents HumanoidGen, an automated task creation and demonstration collection framework that leverages atomic dexterous operations and LLM reasoning to generate relational constraints. Specifically, we provide spatial annotations for both assets and dexterous hands based on the atomic operations, and use an LLM planner to generate a chain of actionable spatial constraints for arm movements based on object affordances and scenes. To further improve planning ability, we employ a variant of Monte Carlo tree search to enhance LLM reasoning for long-horizon tasks and insufficient annotation. In experiments, we create a novel benchmark with augmented scenarios to evaluate the quality of the collected data. The results show that the performance of the 2D and 3D diffusion policies can scale with the generated dataset. Project page: https://openhumanoidgen.github.io.






Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Neural Information Processing Systems

The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we take the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected on a ball around the initial condition will converge to a neighborhood where the average of the squared gradients is O(1/m) + O(ϵ), with m being the width of the neural network and ϵ the approximation quality of the best critic neural network over the projected set.
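
The "updates projected on a ball around the initial condition" mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the Euclidean norm, the radius, and the step size are all assumptions chosen for illustration, and the gradient is a placeholder argument rather than an actual actor or critic gradient.

```python
import numpy as np


def project_onto_ball(theta, theta0, radius):
    """Euclidean projection of theta onto {x : ||x - theta0||_2 <= radius}.

    The L2 norm and the radius are illustrative assumptions.
    """
    offset = theta - theta0
    norm = np.linalg.norm(offset)
    if norm <= radius:
        # Already inside the ball: the projection leaves theta unchanged.
        return theta.copy()
    # Outside the ball: rescale the offset so it lands on the boundary.
    return theta0 + offset * (radius / norm)


def projected_step(theta, grad, theta0, radius, lr):
    """One gradient step followed by projection back onto the ball.

    `grad` stands in for whatever update direction is used; this sketch
    does not compute actor-critic gradients.
    """
    return project_onto_ball(theta - lr * grad, theta0, radius)


if __name__ == "__main__":
    theta0 = np.zeros(3)  # hypothetical initial parameters
    theta = np.array([3.0, 4.0, 0.0])
    print(project_onto_ball(theta, theta0, radius=1.0))
```

Whatever step is taken, the projection keeps the iterate within distance `radius` of the initial parameters; points already inside the ball are returned unchanged.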


report the final policy performance (mean ± std) over the seeds. Due to space constraints, we omit the learning curves

Neural Information Processing Systems

We thank all the reviewers for their constructive feedback on improving the paper. Q. Are exploration and credit assignment (due to delayed rewards) the same? We agree that it's important to clarify this distinction and we'll include this in the revision. Q. Unintended output in provided Q. IRCR if there are indeed dense rewards? We have added a distributional variant of SAC (EXP .


Distribution-Based Feature Attribution for Explaining the Predictions of Any Classifier

Li, Xinpeng, Ting, Kai Ming

arXiv.org Artificial Intelligence

The proliferation of complex, black-box AI models has intensified the need for techniques that can explain their decisions. Feature attribution methods have become a popular solution for providing post-hoc explanations, yet the field has historically lacked a formal problem definition. This paper addresses this gap by introducing a formal definition for the problem of feature attribution, which stipulates that explanations be supported by an underlying probability distribution represented by the given dataset. Our analysis reveals that many existing model-agnostic methods fail to meet this criterion, while even those that do often possess other limitations. To overcome these challenges, we propose Distributional Feature Attribution eXplanations (DFAX), a novel, model-agnostic method for feature attribution. DFAX is the first feature attribution method to explain classifier predictions directly based on the data distribution. We show through extensive experiments that DFAX is more effective and efficient than state-of-the-art baselines.


GenDexHand: Generative Simulation for Dexterous Hands

Chen, Feng, Xu, Zhuxiu, Chu, Tianzhe, Zhou, Xunzhe, Sun, Li, Wu, Zewen, Gao, Shenghua, Li, Zhongyu, Yang, Yanchao, Ma, Yi

arXiv.org Artificial Intelligence

Data scarcity remains a fundamental bottleneck for embodied intelligence. Existing approaches use large language models (LLMs) to automate gripper-based simulation generation, but they transfer poorly to dexterous manipulation, which demands more specialized environment design. Meanwhile, dexterous manipulation tasks are inherently more difficult due to their higher degrees of freedom. Generating feasible and trainable dexterous hand tasks at scale remains an open challenge. To this end, we present GenDexHand, a generative simulation pipeline that autonomously produces diverse robotic tasks and environments for dexterous manipulation. GenDexHand introduces a closed-loop refinement process that adjusts object placements and scales based on vision-language model (VLM) feedback, substantially improving the average quality of generated environments. Each task is further decomposed into sub-tasks to enable sequential reinforcement learning, reducing training time and increasing success rates. Our work provides a viable path toward scalable training of diverse dexterous hand behaviors in embodied intelligence by offering a simulation-based solution to synthetic data generation. Our website: https://winniechen2002.github.io/GenDexHand/.