A generic diffusion-based approach for 3D human pose prediction in the wild

Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi

Mar-15-2023, arXiv.org Artificial Intelligence

Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can predict future poses given noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizons, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model, serving as a pre-processing step to repair its inputs and as a post-processing step to refine its outputs. The code is available online: https://github.com/vita-epfl/DePOSit
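The single-sequence framing described in the abstract can be illustrated with a small sketch. This is not the DePOSit implementation: the function name `build_masked_sequence`, the flattened per-frame pose layout, and the choice of standard Gaussian noise for missing entries are all assumptions made for illustration. It only shows how observed and future frames might be concatenated, with a mask marking which elements are known (usable as conditioning) and which are missing (to be denoised).

```python
import random


def build_masked_sequence(observed, horizon, missing_obs=(), seed=0):
    """Hypothetical helper: merge observation and prediction frames.

    observed:    list of frames, each a list of floats (flattened 3D joints).
    horizon:     number of future frames to predict.
    missing_obs: (frame, coordinate) index pairs that are occluded or
                 unreliable within the observation.
    Returns (sequence, mask); mask[t][d] == 1 means the value is known and
    0 means it is missing and was filled with Gaussian noise.
    """
    rng = random.Random(seed)
    dim = len(observed[0])
    missing = set(missing_obs)

    sequence, mask = [], []
    # Observation part: keep known values, replace missing ones with noise.
    for t, frame in enumerate(observed):
        row, mask_row = [], []
        for d, value in enumerate(frame):
            known = (t, d) not in missing
            row.append(value if known else rng.gauss(0.0, 1.0))
            mask_row.append(1 if known else 0)
        sequence.append(row)
        mask.append(mask_row)

    # Prediction part: every element is missing, so it starts as pure noise.
    for _ in range(horizon):
        sequence.append([rng.gauss(0.0, 1.0) for _ in range(dim)])
        mask.append([0] * dim)

    return sequence, mask


if __name__ == "__main__":
    obs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # two frames, one joint each
    seq, mask = build_masked_sequence(obs, horizon=3, missing_obs={(1, 2)})
    for row, mask_row in zip(seq, mask):
        print(mask_row, [round(v, 3) for v in row])
```

In this sketch, a conditional denoiser would receive the sequence together with the mask and only update the entries where the mask is 0. Under the same assumptions, repairing inputs for another model corresponds to `horizon=0` with a non-empty `missing_obs`.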

artificial intelligence, machine learning, prediction

Country:
- Europe (0.46)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks (1.00)
  - Representation & Reasoning (0.93)
  - Robots > Humanoid Robots (0.71)
  - Vision (1.00)
