Mechanistic Interpretability of GPT-2: Lexical and Contextual Layers in Sentiment Analysis
