A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions
