EyeGraph: Modularity-aware Spatio Temporal Graph Clustering for Continuous Event-based Eye Tracking

Neural Information Processing Systems 

Continuous tracking of eye movement dynamics plays a significant role in developing a broad spectrum of human-centered applications, such as cognitive skills modeling, biometric user authentication, and foveated rendering. Recently, neuromorphic cameras have garnered significant interest in the eye-tracking research community, owing to their sub-microsecond latency in capturing intensity changes resulting from eye movements. Nevertheless, existing approaches for event-based eye tracking suffer from several limitations: dependence on RGB frames, label sparsity, and training on datasets collected in controlled lab environments that do not adequately reflect real-world scenarios. To address these limitations, we propose a dynamic graph-based approach that uses the event stream for high-fidelity tracking of pupillary movement. We first present EyeGraph, a large-scale, multi-modal near-eye tracking dataset collected from 40 participants using a wearable event camera attached to a head-mounted device. The dataset was curated to mimic in-the-wild settings, with variations in user movement and ambient lighting conditions. Subsequently, to address the issue of label sparsity, we propose an unsupervised topology-aware spatio-temporal graph clustering approach as a benchmark. We show that our unsupervised approach achieves performance comparable to more onerous supervised approaches while consistently outperforming conventional clustering-based unsupervised approaches.