Interpreting Pretrained Language Models via Concept Bottlenecks

Open in new window