ToG-Bench: Task-Oriented Spatio-Temporal Grounding in Egocentric Videos
