894403f9604374a7a003063e480f65b9-Paper-Conference.pdf

Jun-19-2026, 11:31:29 GMT–Neural Information Processing Systems

Transformers have theoretical limitations in modeling certain sequence-to-sequence tasks, yet it remains largely unclear if these limitations play a role in large-scale pretrained LLMs, or whether LLMs might effectively overcome these constraints in practice due to the scale of both the models themselves and their pretraining data. We explore how these architectural constraints manifest after pretraining, by studying a family of retrieval and copying tasks inspired by Liu et al. [2024a]. We use a recently proposed framework for studying length generalization [Huang et al., 2025] to provide guarantees for each of our settings.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Jun-19-2026, 11:31:29 GMT

Conferences PDF

Add feedback

Country:
- Europe > Austria (0.28)
- North America > United States (0.27)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.93)

Industry:
- Health & Medicine (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning (0.93)
  - Machine Learning > Neural Networks
    - Deep Learning (0.69)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found