Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
