What to Do Next? Memorizing skills from Egocentric Instructional Video
–arXiv.org Artificial Intelligence
Learning to perform activities through demonstration requires extracting meaningful information about the environment from observations. In this research, we investigate the challenge of planning high-level goal-oriented actions in a simulation setting from an egocentric perspective. We present a novel task, interactive action planning, and propose an approach that combines topological affordance memory with a transformer architecture. Memorizing the environment's structure by extracting affordances facilitates selecting appropriate actions based on the context. Moreover, the memory model allows us to detect action deviations while accomplishing specific objectives. To assess the method's versatility, we evaluate it in a realistic interactive simulation environment. Our experimental results demonstrate that the proposed approach learns meaningful representations, resulting in improved performance and robustness when action deviations occur.
Jul-8-2025
- Genre:
- Research Report > New Finding (0.66)
- Industry:
- Health & Medicine (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Robots (1.00)
- Natural Language (1.00)
- Vision (0.96)
- Representation & Reasoning > Planning & Scheduling (0.94)
- Cognitive Science (0.93)
- Machine Learning
- Statistical Learning (0.68)
- Neural Networks > Deep Learning (0.34)
- Learning Graphical Models > Undirected Networks
- Markov Models (0.46)