Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
