OASST-ETC Dataset: Alignment Signals from Eye-tracking Analysis of LLM Responses

Lopez-Cardona, Angela, Idesis, Sebastian, Barreda-Ángeles, Miguel, Abadal, Sergi, Arapakis, Ioannis

Mar-13-2025–arXiv.org Artificial Intelligence

While Large Language Models (LLMs) have significantly advanced natural language processing, aligning them with human preferences remains an open challenge. Although current alignment methods rely primarily on explicit feedback, eye-tracking (ET) data offers insights into real-time cognitive processing during reading. In this paper, we present OASST-ETC, a novel eye-tracking corpus capturing reading patterns from 24 participants, while evaluating LLM-generated responses from the OASST1 dataset. Our analysis reveals distinct reading patterns between preferred and non-preferred responses, which we compare with synthetic eye-tracking data. Furthermore, we examine the correlation between human reading measures and attention patterns from various transformer-based models, discovering stronger correlations in preferred responses. This work introduces a unique resource for studying human cognitive processing in LLM evaluation and suggests promising directions for incorporating eye-tracking data into alignment methods. The dataset and analysis code are publicly available.

computational linguistic, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

Mar-13-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.93)
- Europe (1.00)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (1.00)

Industry:
- Health & Medicine (0.67)
- Information Technology (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language > Large Language Model (1.00)