Human-Centric Foundation Models: Perception, Generation and Agentic Modeling

Shixiang Tang, Yizhou Wang, Lu Chen, Yuan Wang, Sida Peng, Dan Xu, Wanli Ouyang

arXiv.org Artificial Intelligence 

The community has called for a unified framework [Ci et al., 2023; Wang et al., 2023; Chen et al., 2024; Huang et al., 2024a] to unlock systematic understanding and a wide range of human-centric applications for everybody. Inspired by rapid advancements of general foundation models, e.g., large language models (LLMs), large vision models (LVMs), and text-to-image generative models, and the paradigm shift from end-to-end learning of task-specific models to generalist models, a recent trend is to develop Human-centric Foundation Models (HcFM) that satisfy three criteria, namely generalization, broad applicability, and high fidelity. Generalization ensures robustness to unseen conditions, enabling the model to perform consistently across varied environments.

In this survey, we present a comprehensive overview of HcFMs by proposing a taxonomy that categorizes current approaches into four groups: (1) Human-centric Perception Foundation Models that capture fine-grained features for multi-modal 2D and 3D understanding; (2) Human-centric AIGC Foundation Models that generate high-fidelity, diverse human-related content; (3) Unified Perception and Generation Models that integrate these capabilities to enhance both human understanding and synthesis; and (4) Human-centric Agentic Foundation Models that extend beyond perception and generation to learn human-like intelligence and interactive behaviors for humanoid embodied tasks.
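To make the four-group taxonomy concrete, a minimal Python sketch is given below that encodes the groups as an enumeration and maps a few example human-centric tasks to them. The enum members, task names, and the mapping are illustrative assumptions for this sketch, not structures defined in the survey.

from enum import Enum, auto

class HcFMGroup(Enum):
    """Four-group taxonomy of Human-centric Foundation Models (HcFMs)."""
    PERCEPTION = auto()                      # fine-grained multi-modal 2D/3D understanding
    AIGC = auto()                            # high-fidelity, diverse human-related content generation
    UNIFIED_PERCEPTION_GENERATION = auto()   # joint human understanding and synthesis
    AGENTIC = auto()                         # human-like intelligence and interactive behaviors

# Hypothetical mapping of example tasks to taxonomy groups, purely for illustration.
EXAMPLE_TASKS = {
    "2D/3D pose estimation": HcFMGroup.PERCEPTION,
    "text-to-motion generation": HcFMGroup.AIGC,
    "avatar reconstruction and reanimation": HcFMGroup.UNIFIED_PERCEPTION_GENERATION,
    "humanoid embodied interaction": HcFMGroup.AGENTIC,
}

if __name__ == "__main__":
    for task, group in EXAMPLE_TASKS.items():
        print(f"{task:40s} -> {group.name}")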
