Bimanual 3D Hand Motion and Articulation Forecasting in Everyday Images
Prakash, Aditya, Forsyth, David, Gupta, Saurabh
arXiv.org Artificial Intelligence
We tackle the problem of forecasting bimanual 3D hand motion and articulation from a single image in everyday settings. To address the lack of 3D hand annotations in diverse settings, we design an annotation pipeline consisting of a diffusion model that lifts 2D hand keypoint sequences to 4D hand motion. For the forecasting model, we adopt a diffusion loss to account for the multimodality of the hand motion distribution. Extensive experiments across 6 datasets show the benefits of training on diverse data with imputed labels (14% improvement) and the effectiveness of our lifting (42% better) and forecasting (16.4% gain) models over the best baselines, especially in zero-shot generalization to everyday images.
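The abstract does not specify the form of the diffusion loss. As a rough illustration only, the sketch below shows a generic noise-prediction (DDPM-style) training loss in plain Python. The cosine noise schedule, the function names, the `denoiser(xt, t, cond)` signature, and the flattened toy motion vector are all assumptions made for this example and are not taken from the paper.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal level for an assumed cosine noise schedule."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def add_noise(x0, t, num_steps, rng):
    """Forward process: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    a = cosine_alpha_bar(t, num_steps)
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def diffusion_loss(denoiser, x0, cond, num_steps, rng):
    """Mean squared error between the sampled noise and the predicted noise."""
    t = rng.randint(1, num_steps)  # random diffusion step in [1, num_steps]
    xt, eps = add_noise(x0, t, num_steps, rng)
    eps_hat = denoiser(xt, t, cond)  # hypothetical model call
    return sum((e - p) ** 2 for e, p in zip(eps, eps_hat)) / len(eps)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a flattened future hand-motion vector (hypothetical).
    x0 = [0.1, -0.2, 0.3, 0.0, 0.5, -0.4]
    # Placeholder conditioning, e.g. image features (hypothetical).
    cond = None
    # Trivial denoiser that always predicts zero noise.
    zero_denoiser = lambda xt, t, cond: [0.0] * len(xt)
    print(diffusion_loss(zero_denoiser, x0, cond, num_steps=100, rng=rng))
```

Because the loss is an expectation over random noise and random steps, a model trained this way learns a distribution over future motions rather than a single averaged trajectory, which is the usual motivation for a diffusion objective when the target is multimodal.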
Oct-8-2025
- Country:
- Asia > Japan
- Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
- North America > United States
- Illinois > Champaign County > Urbana (0.40)
- Genre:
- Research Report (0.40)