Towards Unifying Interpretability and Control: Evaluation via Intervention
