EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding
Neural Information Processing Systems
Operating rooms (ORs) demand precise coordination among surgeons, nurses, and equipment in a fast-paced, occlusion-heavy environment, necessitating advanced perception models to enhance safety and efficiency. Existing datasets provide either partial egocentric views or sparse exocentric multi-view context, but none explores the comprehensive combination of both. We introduce EgoExOR, the first OR dataset and accompanying benchmark to fuse first-person and third-person perspectives. Spanning 94 minutes (84,553 frames at 15 FPS) of two emulated spine procedures, Ultrasound-Guided Needle Insertion and Minimally Invasive Spine Surgery, EgoExOR integrates egocentric data (RGB, gaze, hand tracking, audio) from wearable glasses, exocentric RGB and depth from RGB-D cameras, and ultrasound imagery. Its detailed scene graph annotations, covering 36 entities and 22 relations (568,235 triplets), enable robust modeling of clinical interactions, supporting tasks such as action recognition and human-centric perception. We evaluate the surgical scene graph generation performance of two adapted state-of-the-art models and offer a new baseline that explicitly leverages EgoExOR's multimodal and multi-perspective signals. Together, the dataset and benchmark lay a foundation for OR perception research, offering a rich, multimodal resource for next-generation clinical perception systems.
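The headline statistics above can be cross-checked with simple arithmetic. The sketch below uses only the three figures quoted in the abstract (frame count, frame rate, triplet count). The per-frame average is an illustrative derived quantity, not a number reported by the authors, and it assumes the triplets are spread over all 84,553 frames, which the abstract does not state.

```python
# Figures quoted in the abstract.
FRAMES = 84_553      # total frames
FPS = 15             # frames per second
TRIPLETS = 568_235   # scene graph triplets

# Recording length implied by the frame count and frame rate.
duration_s = FRAMES / FPS
duration_min = duration_s / 60

# Assumption: triplets are distributed over every frame. If only a subset
# of frames is annotated, the true per-frame density would be higher.
triplets_per_frame = TRIPLETS / FRAMES

print(f"{duration_min:.1f} min")
print(f"{triplets_per_frame:.2f} triplets per frame")
```

The computed duration rounds to the 94 minutes stated in the abstract, so the frame count and frame rate are mutually consistent.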
Jun-17-2026, 15:43:26 GMT
- Country:
  - Europe (0.29)
  - North America > United States (0.28)
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Health & Medicine
    - Surgery (1.00)
    - Diagnostic Medicine > Imaging (0.47)
    - Therapeutic Area
      - Musculoskeletal (0.66)
      - Orthopedics/Orthopedic Surgery (0.48)
      - Neurology (0.46)
- Technology:
  - Information Technology > Artificial Intelligence
    - Vision (1.00)
    - Robots (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)