Semi-Synthetic Transformers for Evaluating Mechanistic Interpretability Techniques

Neural Information Processing Systems 

Training (IIT), which we call Strict IIT (SIIT). SIIT models maintain Tracr's original circuit while being more realistic.
