A Self Validation Network for Object-Level Human Attention Estimation
Zehua Zhang, Chen Yu, David Crandall
Neural Information Processing Systems
Some recent work [22, 66, 68] has discussed estimating probability maps of ego-attention or predicting gaze points in egocentric videos. However, people think not in terms of points in their field of view, but in terms of the objects that they are attending to. Of course, the object of interest could be obtained by first estimating the gaze with a gaze estimator, generating object candidates with an off-the-shelf object detector, and then picking the object that the estimated gaze falls in. Because this bottom-up approach estimates where and what separately, it can fail if the eye gaze prediction is even slightly inaccurate, for example when it falls between two objects or in the intersection of multiple object bounding boxes (Figure 1).
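The ambiguity of the bottom-up baseline described above can be illustrated with a minimal sketch. This is not the authors' code: the box format (x_min, y_min, x_max, y_max), the helper name boxes_containing, and the coordinates are all assumptions chosen for illustration. It only shows that a point-in-box lookup returns a unique object when the gaze lands cleanly inside one box, and is ambiguous or empty otherwise.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def boxes_containing(gaze: Tuple[float, float], boxes: List[Box]) -> List[int]:
    """Return the indices of every box that contains the gaze point."""
    gx, gy = gaze
    return [
        i
        for i, (x0, y0, x1, y1) in enumerate(boxes)
        if x0 <= gx <= x1 and y0 <= gy <= y1
    ]


# Two hypothetical detections that partly overlap (x in [8, 10]).
boxes = [(0, 0, 10, 10), (8, 0, 18, 10)]

hits_clear = boxes_containing((3, 5), boxes)    # inside the first box only
hits_overlap = boxes_containing((9, 5), boxes)  # inside the intersection
hits_gap = boxes_containing((20, 5), boxes)     # outside every box
```

In this toy setup only the first gaze point yields a single candidate; a shift of a few pixels into the overlap or past the box edge leaves the baseline with two candidates or none, which is the failure mode the passage describes.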
Feb-11-2026, 23:57:21 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning (1.00)
- Vision (1.00)