In-Context Learning of Linear Dynamical Systems with Transformers: Approximation Bounds and Depth-separation
Neural Information Processing Systems
This paper investigates approximation-theoretic aspects of the in-context learning capability of transformers in representing a family of noisy linear dynamical systems. Our first theoretical result establishes an upper bound on the approximation error of multi-layer transformers with respect to an L2 testing loss defined uniformly across tasks. This result demonstrates that transformers with logarithmic depth can achieve error bounds comparable to those of the least-squares estimator. In contrast, our second result establishes a non-diminishing lower bound on the approximation error for a class of single-layer linear transformers, which suggests a depth-separation phenomenon for transformers in the in-context learning of dynamical systems.
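As a rough illustration of the least-squares baseline that the abstract compares against, the sketch below simulates a scalar noisy linear dynamical system and estimates its transition coefficient from a single observed trajectory. The scalar model x_{t+1} = a * x_t + w_t, the Gaussian noise, and the function names `simulate_scalar_lds`, `least_squares_estimate`, and `predict_next` are all assumptions made for illustration; the abstract does not specify the paper's exact system class or estimator.

```python
import random


def simulate_scalar_lds(a, noise_std, length, x0=1.0, seed=0):
    """Simulate x_{t+1} = a * x_t + w_t with w_t ~ N(0, noise_std^2).

    Assumed toy model: a scalar system with Gaussian noise.
    Returns the trajectory [x_0, ..., x_length].
    """
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(length):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise_std))
    return xs


def least_squares_estimate(xs):
    """Least-squares estimate of a from consecutive pairs (x_t, x_{t+1}).

    Minimizes sum_t (x_{t+1} - a * x_t)^2, which has the closed form
    a_hat = sum_t x_t * x_{t+1} / sum_t x_t^2.
    """
    num = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return num / den


def predict_next(xs):
    """One-step prediction of the next state from the observed context."""
    return least_squares_estimate(xs) * xs[-1]


if __name__ == "__main__":
    trajectory = simulate_scalar_lds(a=0.9, noise_std=0.1, length=2000)
    print("estimated a:", least_squares_estimate(trajectory))
    print("predicted next state:", predict_next(trajectory))
```

Here the observed trajectory plays the role of the context: the estimator is refit from the context alone for each new system, which is the kind of per-task behavior an in-context learner would need to approximate.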
Jun-23-2026, 03:23:41 GMT
- Country:
- North America > United States > Minnesota (0.28)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.66)
- Technology:
- Information Technology
- Scientific Computing (0.93)
- Data Science (0.67)
- Artificial Intelligence
- Representation & Reasoning (1.00)
- Natural Language > Large Language Model (0.93)
- Vision (0.67)
- Machine Learning
- Statistical Learning (0.93)
- Neural Networks > Deep Learning (0.93)