Lissard: Long and Simple Sequential Reasoning Datasets

Mirelle Bueno, Roberto Lotufo, Rodrigo Nogueira

arXiv.org Artificial Intelligence 

The efficacy of language models, particularly on reasoning tasks, degrades significantly when input texts are longer than those seen during training [19, 2, 15]. This phenomenon, referred to in the literature as "Length Generalization" or "Length Extrapolation" [25, 30], also affects models based on the Transformer architecture more broadly [20, 16, 8, 32]. Notably, even Large Language Models (LLMs), known for their strong performance across a wide range of tasks and domains, are not immune to this problem [2, 5]. Recent research has attempted to address this challenge through modifications to positional embeddings [25, 6, 7, 19, 13] or through prompting strategies such as scratchpad [23] and chain-of-thought reasoning [28]. Nevertheless, there remains a lack of datasets specifically designed for the systematic evaluation of this problem.