ToMMeR -- Efficient Entity Mention Detection from Large Language Models
Victor Morand, Nadi Tomeh, Josiane Mothe, Benjamin Piwowarski
–arXiv.org Artificial Intelligence
Identifying which text spans refer to entities -- mention detection -- is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) that probes mention detection capabilities in early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot, with over 90% precision when evaluated with an LLM as judge, showing that it rarely produces spurious predictions despite its high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (Dice >75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near-SOTA NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be efficiently recovered with minimal parameters.
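The cross-model agreement reported above (Dice >75%) can be illustrated with a set-level Dice coefficient over predicted mention spans. The sketch below assumes spans are represented as (start, end) token offsets and that agreement means exact boundary match; these are illustrative assumptions, not the paper's exact evaluation protocol.

```python
def dice_coefficient(spans_a, spans_b):
    """Dice coefficient between two sets of predicted mention spans:
    2 * |A ∩ B| / (|A| + |B|), where spans match on exact boundaries."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0  # two empty predictions agree trivially
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical predictions from two different backbone models,
# as (start, end) token offsets over the same sentence.
model_1 = [(0, 2), (5, 7), (10, 12)]
model_2 = [(0, 2), (5, 8), (10, 12)]  # disagrees on one right boundary

print(dice_coefficient(model_1, model_2))  # → 0.666...
```

Exact-boundary matching makes this a strict agreement measure: a single off-by-one boundary counts as full disagreement for that span, so high Dice scores indicate that models converge on the same boundaries, not merely overlapping regions.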
Oct-23-2025