Challenges and Trends in Egocentric Vision: A Survey
Li, Xiang, Qiu, Heqian, Wang, Lanxiao, Zhang, Hanwen, Qi, Chenghao, Han, Linfeng, Xiong, Huiyu, Li, Hongliang
arXiv.org Artificial Intelligence
With the rapid development of artificial intelligence technologies and wearable devices, egocentric vision understanding has emerged as a new and challenging research direction, gradually attracting widespread attention from both academia and industry. Egocentric vision captures visual and multimodal data through cameras or sensors worn on the human body, offering a unique perspective that simulates human visual experiences. This paper provides a comprehensive survey of research on egocentric vision understanding, systematically analyzing the components of egocentric scenes and categorizing the tasks into four main areas: subject understanding, object understanding, environment understanding, and hybrid understanding. We explore in detail the sub-tasks within each category. We also summarize the main challenges and current trends in the field. Furthermore, this paper presents an overview of high-quality egocentric vision datasets, offering valuable resources for future research. Drawing on the latest advancements, we anticipate broad applications of egocentric vision technologies in fields such as augmented reality, virtual reality, and embodied intelligence, and we propose future research directions.
Mar-19-2025
- Industry:
  - Education > Educational Setting
    - Online (0.45)
  - Health & Medicine
    - Consumer Health (0.67)
    - Therapeutic Area (0.67)
  - Information Technology (1.00)
  - Leisure & Entertainment (1.00)
  - Media > Film (0.67)
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Cognitive Science (1.00)
      - Machine Learning
        - Neural Networks > Deep Learning (1.00)
        - Statistical Learning (1.00)
      - Natural Language > Large Language Model (0.93)
      - Representation & Reasoning > Agents (0.92)
      - Robots (1.00)
      - Vision
        - Face Recognition (1.00)
        - Image Understanding (1.00)
    - Communications > Networks (1.00)
    - Data Science > Data Mining (1.00)
    - Hardware (1.00)
    - Human Computer Interaction > Interfaces
      - Virtual Reality (1.00)
    - Sensing and Signal Processing > Image Processing (1.00)