ToMMeR -- Efficient Entity Mention Detection from Large Language Models
Victor Morand, Nadi Tomeh, Josiane Mothe, Benjamin Piwowarski
–arXiv.org Artificial Intelligence
Identifying which text spans refer to entities -- mention detection -- is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) that probes mention detection capabilities in early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot, with over 90% precision when evaluated with an LLM as judge, showing that it rarely produces spurious predictions despite its high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (Dice >75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near-SOTA NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be efficiently recovered with minimal parameters.
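The cross-model agreement reported above (Dice >75%) can be illustrated with a set-level Dice coefficient over predicted mention spans. The sketch below assumes spans are represented as (start, end) token offsets and that agreement means exact boundary match; these are illustrative assumptions, not the paper's exact evaluation protocol.

```python
def dice_coefficient(spans_a, spans_b):
    """Dice coefficient between two sets of predicted mention spans:
    2 * |A ∩ B| / (|A| + |B|), where spans match on exact boundaries."""
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0  # two empty predictions agree trivially
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical predictions from two different backbone models,
# as (start, end) token offsets over the same sentence.
model_1 = [(0, 2), (5, 7), (10, 12)]
model_2 = [(0, 2), (5, 8), (10, 12)]  # disagrees on one right boundary

print(dice_coefficient(model_1, model_2))  # → 0.666...
```

Exact-boundary matching makes this a strict agreement measure: a single off-by-one boundary counts as full disagreement for that span, so high Dice scores indicate that models converge on the same boundaries, not merely overlapping regions.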
Oct-23-2025