Echoes of BERT: Do Modern Language Models Rediscover the Classical NLP Pipeline?
Li, Michael; Subramani, Nishant
arXiv.org Artificial Intelligence
Large transformer-based language models dominate modern NLP, yet our understanding of how they encode linguistic information relies primarily on studies of early models like BERT and GPT-2. Building on classic BERTology work, we analyze 25 models, ranging from classical architectures (BERT, DeBERTa, GPT-2) to modern large language models (Pythia, OLMo-2, Gemma-2, Qwen2.5, Llama-3.1), probing layer-by-layer representations across eight linguistic tasks in English. Consistent with earlier findings, we find that hierarchical organization persists in modern models: early layers capture syntax, middle layers handle semantics and entity-level information, and later layers encode discourse phenomena. We then conduct an in-depth multilingual analysis of two linguistic properties, lexical identity and inflectional morphology, that help disentangle form from meaning. We find that lexical information concentrates linearly in early layers but becomes increasingly nonlinear deeper in the network, while inflectional information remains linearly accessible throughout all layers. Additional analyses of attention mechanisms, steering vectors, and pretraining checkpoints reveal where this information resides within layers, how it can be functionally manipulated, and how representations evolve during pretraining. Taken together, our findings suggest that, even with substantial advances in LLM technologies, transformer models learn to organize linguistic information in similar ways, regardless of model architecture, size, or training regime, indicating that these properties are important for next-token prediction. Our code is available at https://github.com/ml5885/model_internal_sleuthing
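To make the notion of "linearly accessible" information concrete, the following is a minimal sketch of a linear probe, assuming purely synthetic data. It is not the authors' code and reproduces none of their results: the two toy "layers", the function names (`make_toy_layers`, `train_linear_probe`, `probe_accuracy`), and all hyperparameters are hypothetical. One toy layer encodes a binary label as a mean shift, which a linear classifier can read out; the other encodes the same label as the sign of a product of two features (XOR-like), which a linear classifier cannot.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Linear score w.x + b for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_probe(X, y, epochs=200, lr=0.5):
    """Fit a binary logistic-regression probe with full-batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            err = sigmoid(score(w, b, x)) - label
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose predicted class matches the label."""
    hits = sum((score(w, b, x) > 0) == (label == 1) for x, label in zip(X, y))
    return hits / len(X)


def make_toy_layers(n, rng):
    """Return labels plus two synthetic 'layers' (assumed stand-ins for hidden states)."""
    labels = [rng.randint(0, 1) for _ in range(n)]
    early, deep = [], []
    for label in labels:
        sign = 1.0 if label == 1 else -1.0
        # Toy early layer: the label shifts the mean of dimension 0.
        early.append([sign + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)])
        # Toy deep layer: the label is the sign of dim0 * dim1 (XOR-like).
        a = rng.uniform(-1, 1)
        sign_a = 1.0 if a >= 0 else -1.0
        deep.append([a, sign * sign_a * rng.uniform(0, 1), rng.gauss(0, 1)])
    return labels, {"toy_early_layer": early, "toy_deep_layer": deep}


rng = random.Random(0)
train_y, train_layers = make_toy_layers(200, rng)
test_y, test_layers = make_toy_layers(200, rng)

accuracies = {}
for name in train_layers:
    w, b = train_linear_probe(train_layers[name], train_y)
    accuracies[name] = probe_accuracy(w, b, test_layers[name], test_y)

for name, acc in accuracies.items():
    print(f"{name}: linear-probe accuracy = {acc:.2f}")
```

In an actual study, the synthetic features would be replaced by hidden states extracted from each layer of a model, and the linear probe would typically be compared against a nonlinear one (for example, a small MLP) to judge how much of the information is only nonlinearly recoverable.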
Oct-17-2025
- Country:
  - Asia
  - Europe
    - Bulgaria > Sofia City Province
      - Sofia (0.04)
    - France > Provence-Alpes-Côte d'Azur
      - Bouches-du-Rhône > Marseille (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Slovenia (0.04)
    - Sweden > Östergötland County
      - Linköping (0.04)
  - North America > United States
    - California > Santa Clara County
      - Palo Alto (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - New Mexico > Bernalillo County
      - Albuquerque (0.04)
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
  - South America
    - Colombia > Meta Department
      - Villavicencio (0.04)
    - Paraguay > Asunción
      - Asunción (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: