RelP: Faithful and Efficient Circuit Discovery in Language Models via Relevance Patching
