ACL-rlg: A Dataset for Reading List Generation

Aubert-Béduchaud, Julien, Boudin, Florian, Daille, Béatrice, Dufour, Richard

Dec-30-2024–arXiv.org Artificial Intelligence

Familiarizing oneself with a new scientific field and its existing literature can be daunting due to the large number of available articles. Curated lists of academic references, or reading lists, compiled by experts, offer a structured way to gain a comprehensive overview of a domain or a specific scientific challenge. In this work, we introduce ACL-rlg, the largest open expert-annotated reading list dataset. We also formally define reading list generation as a retrieval task and provide multiple baselines for evaluating it. Our qualitative study shows that traditional scholarly search engines and indexing methods perform poorly on this task, and that GPT-4o, despite showing better results, exhibits signs of potential data contamination.
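
To illustrate what framing reading list generation as a retrieval task can look like in practice, here is a minimal sketch that scores a ranked list of retrieved papers against an expert-compiled reading list. The function name, the paper identifiers, and the choice of precision and recall at k are assumptions made for illustration; the abstract does not state which metrics the authors use.

```python
def precision_recall_at_k(retrieved, expert_list, k):
    """Score the top-k retrieved paper IDs against an expert reading list.

    Hypothetical helper: the metric choice is an assumption, not taken
    from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(expert_list)
    # Drop duplicate IDs while preserving rank order, then cut at k.
    top_k = list(dict.fromkeys(retrieved))[:k]
    hits = sum(1 for paper_id in top_k if paper_id in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up identifiers: an expert list of four papers and a system
# that returns five ranked candidates for the same query.
expert = ["P1", "P2", "P3", "P4"]
retrieved = ["P2", "P9", "P1", "P7", "P3"]
p, r = precision_recall_at_k(retrieved, expert, k=5)
print(f"precision@5={p:.2f} recall@5={r:.2f}")
```

In this toy case three of the five retrieved papers appear in the expert list, so precision at 5 is 0.60 and recall at 5 is 0.75.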

query, reading list, reading list generation, (14 more...)

Country:
- North America > United States (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - France > Pays de la Loire
    - Loire-Atlantique > Nantes (0.05)
  - Denmark > Capital Region
    - Copenhagen (0.04)
- Asia
  - Singapore (0.04)
  - Japan > Honshū
    - Kantō > Tokyo Metropolis Prefecture
      - Tokyo (0.14)
    - Kansai > Osaka Prefecture
      - Osaka (0.04)

Genre:
- Research Report (1.00)
- Instructional Material > Course Syllabus & Notes (0.94)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.92)
  - Machine Learning > Neural Networks
    - Deep Learning (0.92)
