Concept Layers: Enhancing Interpretability and Intervenability via LLM Conceptualization
