Lissard: Long and Simple Sequential Reasoning Datasets
Mirelle Bueno, Roberto Lotufo, Rodrigo Nogueira
– arXiv.org Artificial Intelligence
The efficacy of language models, particularly on reasoning tasks, degrades significantly when input texts are longer than those seen during training [19, 2, 15]. This phenomenon, referred to as "Length Generalization" or "Length Extrapolation" in the literature [25, 30], is also common in models based on the Transformer architecture [20, 16, 8, 32]. Notably, even Large Language Models (LLMs), known for their strong performance across a wide range of tasks and domains, are not immune to this problem [2, 5]. Recent research has tried to address this challenge through modifications to positional embeddings [25, 6, 7, 19, 13] or through prompting strategies such as scratchpad [23] and chain-of-thought reasoning [28]. Nevertheless, there remains a lack of datasets specifically designed for the systematic evaluation of this problem.
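As a minimal sketch of the kind of length-stratified evaluation the abstract calls for, the snippet below generates a simple sequential task (copying a digit sequence) at several controlled lengths, so a model trained on short instances can be probed on longer ones. The task, length buckets, and function names are illustrative assumptions, not the actual Lissard datasets.

```python
import random

# Hypothetical sketch: build length-stratified probes for a simple
# sequential task (copying a digit sequence). Buckets longer than the
# "training" lengths test length generalization. The task choice and
# lengths are assumptions for illustration, not the Lissard datasets.

def make_copy_example(length: int, rng: random.Random) -> dict:
    seq = [str(rng.randint(0, 9)) for _ in range(length)]
    prompt = "Repeat the following sequence exactly: " + " ".join(seq)
    return {"length": length, "prompt": prompt, "target": " ".join(seq)}

def build_length_buckets(lengths=(10, 50, 100, 200),
                         n_per_bucket=100, seed=0) -> dict:
    rng = random.Random(seed)
    return {n: [make_copy_example(n, rng) for _ in range(n_per_bucket)]
            for n in lengths}

if __name__ == "__main__":
    buckets = build_length_buckets()
    for n, examples in buckets.items():
        print(n, examples[0]["prompt"][:60], "...")
```

Reporting accuracy per length bucket, rather than a single aggregate score, is what makes the degradation curve with increasing length visible.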
Feb-12-2024