Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
