IndEgo: A Dataset of Industrial Scenarios and Collaborative Work for Egocentric Assistants
–Neural Information Processing Systems
We introduce IndEgo, a multimodal egocentric and exocentric dataset covering common industrial tasks, including assembly/disassembly, logistics and organisation, inspection and repair, and woodworking, among others. The dataset contains 3,460 egocentric recordings (approximately 197 hours) and 1,092 exocentric recordings (approximately 97 hours). A key focus of the dataset is collaborative work, in which two workers jointly perform cognitively and physically intensive tasks. The egocentric recordings include rich multimodal data, with added context from eye gaze, narration, sound, motion, and other signals. We provide detailed annotations (actions, summaries, mistake annotations, narrations), metadata, processed outputs (eye gaze, hand pose, semi-dense point cloud), and benchmarks on procedural and non-procedural task understanding, Mistake Detection, and reasoning-based Question Answering.
Jun-19-2026, 16:35:53 GMT
- Country:
- Europe (0.67)
- Genre:
- Workflow (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Industry:
- Information Technology > Security & Privacy (0.67)
- Health & Medicine (0.67)
- Technology:
- Information Technology
- Human Computer Interaction (1.00)
- Communications > Collaboration (0.84)
- Artificial Intelligence
- Vision (1.00)
- Robots (1.00)
- Natural Language > Large Language Model (1.00)
- Representation & Reasoning > Agents (0.67)
- Machine Learning > Neural Networks
- Deep Learning (1.00)