Towards Understanding How Transformers Perform Multi-step Reasoning with Matching Operation
