Separations in the Representational Capabilities of Transformers and Recurrent Architectures
