A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
