Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
