Representations as Language: An Information-Theoretic Framework for Interpretability
