Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
