Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Huang, Jing, Wu, Zhengxuan, Mahowald, Kyle, Potts, Christopher
arXiv.org Artificial Intelligence
Language tasks involving character-level manipulations (e.g., spelling corrections, arithmetic operations, word games) are challenging for models operating on subword units. To address this, we develop a causal intervention framework to learn robust and interpretable character representations inside subword-based language models. Our method treats each character as a typed variable in a causal model and learns such causal structures by adapting the interchange intervention training method of Geiger et al. (2021). We additionally introduce a suite of character-level tasks that systematically vary in their dependence on meaning and sequence-level context. While character-level models still perform best on purely form-based tasks like string reversal, our method outperforms character-level models on more complex tasks that blend form, meaning, and context, such as spelling correction in context and word search games. Compared with standard subword-based models, our approach also significantly improves robustness on unseen token sequences and leads to human-interpretable internal representations of characters.
Dec-19-2023
- Country:
  - Europe
    - Austria (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Germany > Berlin (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Texas > Travis County
        - Austin (0.04)
      - Washington > King County
        - Seattle (0.04)
  - Oceania > Australia
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Transportation
    - Ground > Road (0.61)
    - Infrastructure & Services (0.61)
- Technology: