Look Ma, No Hands! Agent-Environment Factorization of Egocentric Videos
Neural Information Processing Systems
The analysis and use of egocentric videos for robotics tasks is made challenging by occlusion and the visual mismatch between the human hand and a robot end-effector. Past work views the human hand as a nuisance and removes it from the scene. However, the hand also provides a valuable signal for learning. In this work, we propose to extract a factored representation of the scene that separates the agent (human hand) and the environment.
Dec-24-2025, 22:08:16 GMT