AITopics | entity mention

Collaborating Authors

entity mention

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

1 import bisect 2 import re

Neural Information Processing SystemsApr-24-2026, 11:28:30 GMT

In order to convert the dataset to NER format we suggest tokenizing Tweet text and utilizing the character offsets to identify mention tokens. E.g. just setting up my twttrwith offsets 19and 24, and DBpedia category as Organization, can be converted to the NERBIO format as follows: tokens, starts, ends = tokenize_with_offsets("just setting up my twttr")and then assigning Olabels to all tokens outside the phrase start and end offsets and B-ORG and I-ORG label to all tokens within the phrase offsets. This approach works as long as the tokenizer returned offsets correspond to the offset of the phrase in the original text, i.e. tokenization is non-destructive. See example code in listing 1. A system span must match a gold span exactly to be counted as correct.

machine learning, natural language, tweet, (21 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.99)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.54)

Add feedback

Harnessing Deep LLM Participation for Robust Entity Linking

Hou, Jiajun, Zhang, Chenyu, Meng, Rui

arXiv.org Artificial IntelligenceNov-19-2025

Entity Linking (EL), the task of mapping textual entity mentions to their corresponding entries in knowledge bases, constitutes a fundamental component of natural language understanding. Recent advancements in Large Language Models (LLMs) have demonstrated remarkable potential for enhancing EL performance. Prior research has leveraged LLMs to improve entity disambiguation and input representation, yielding significant gains in accuracy and robustness. However, these approaches typically apply LLMs to isolated stages of the EL task, failing to fully integrate their capabilities throughout the entire process. In this work, we introduce DeepEL, a comprehensive framework that incorporates LLMs into every stage of the entity linking task. Furthermore, we identify that disambiguating entities in isolation is insufficient for optimal performance. To address this limitation, we propose a novel self-validation mechanism that utilizes global contextual information, enabling LLMs to rectify their own predictions and better recognize cohesive relationships among entities within the same sentence. Extensive empirical evaluation across ten benchmark datasets demonstrates that DeepEL substantially outperforms existing state-of-the-art methods, achieving an average improvement of 2.6\% in overall F1 score and a remarkable 4% gain on out-of-domain datasets. These results underscore the efficacy of deep LLM integration in advancing the state-of-the-art in entity linking.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.14181

Country:

North America > United States (1.00)
Asia (0.68)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.48)

Industry:

Information Technology (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Knowledge-augmented Pre-trained Language Models for Biomedical Relation Extraction

Sänger, Mario, Leser, Ulf

arXiv.org Artificial IntelligenceNov-4-2025

Automatic relationship extraction (RE) from biomedical literature is critical for managing the vast amount of scientific knowledge produced each year. In recent years, utilizing pre-trained language models (PLMs) has become the prevalent approach in RE. Several studies report improved performance when incorporating additional context information while fine-tuning PLMs for RE. However, variations in the PLMs applied, the databases used for augmentation, hyper-parameter optimization, and evaluation methods complicate direct comparisons between studies and raise questions about the generalizability of these findings. Our study addresses this research gap by evaluating PLMs enhanced with contextual information on five datasets spanning four relation scenarios within a consistent evaluation framework. We evaluate three baseline PLMs and first conduct extensive hyperparameter optimization. After selecting the top-performing model, we enhance it with additional data, including textual entity descriptions, relational information from knowledge graphs, and molecular structure encodings. Our findings illustrate the importance of i) the choice of the underlying language model and ii) a comprehensive hyperparameter optimization for achieving strong extraction performance. Although inclusion of context information yield only minor overall improvements, an ablation study reveals substantial benefits for smaller PLMs when such external data was included during fine-tuning.

information, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1186/s12859-025-06262-6

2505.00814

Country: Europe (0.46)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

ToMMeR -- Efficient Entity Mention Detection from Large Language Models

Morand, Victor, Tomeh, Nadi, Mothe, Josiane, Piwowarski, Benjamin

arXiv.org Artificial IntelligenceOct-23-2025

Identifying which text spans refer to entities -- mention detection -- is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) probing mention detection capabilities from early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93\% recall zero-shot, with over 90\% precision using an LLM as a judge showing that ToMMeR rarely produces spurious predictions despite high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (DICE >75\%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves near SOTA NER performance (80-87\% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in early transformer layers and can be efficiently recovered with minimal parameters.

computational linguistic, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2510.1941

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Contextual Augmentation for Entity Linking using Large Language Models

Vollmers, Daniel, Zahera, Hamada M., Moussallem, Diego, Ngomo, Axel-Cyrille Ngonga

arXiv.org Artificial IntelligenceOct-23-2025

Entity Linking involves detecting and linking entity mentions in natural language texts to a knowledge graph. Traditional methods use a two-step process with separate models for entity recognition and disambiguation, which can be computationally intensive and less effective. We propose a fine-tuned model that jointly integrates entity recognition and disambiguation in a unified framework. Furthermore, our approach leverages large language models to enrich the context of entity mentions, yielding better performance in entity disambiguation. We evaluated our approach on benchmark datasets and compared with several baselines. The evaluation results show that our approach achieves state-of-the-art performance on out-of-domain datasets.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.18888

Country:

North America > United States (0.28)
Europe > Germany (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Thunder-DeID: Accurate and Efficient De-identification Framework for Korean Court Judgments

Hahm, Sungeun, Kim, Heejin, Lee, Gyuseong, Park, Hyunji, Lee, Jaejin

arXiv.org Artificial IntelligenceOct-17-2025

To ensure a balance between open access to justice and personal data protection, the South Korean judiciary mandates the de-identification of court judgments before they can be publicly disclosed. However, the current de-identification process is inadequate for handling court judgments at scale while adhering to strict legal requirements. Additionally, the legal definitions and categorizations of personal identifiers are vague and not well-suited for technical solutions. To tackle these challenges, we propose a de-identification framework called Thunder-DeID, which aligns with relevant laws and practices. Specifically, we (i) construct and release the first Korean legal dataset containing annotated judgments along with corresponding lists of entity mentions, (ii) introduce a systematic categorization of Personally Identifiable Information (PII), and (iii) develop an end-to-end deep neural network (DNN)-based de-identification pipeline. Our experimental results demonstrate that our model achieves state-of-the-art performance in the de-identification of court judgments.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.15266

Country:

Europe (1.00)
Asia > South Korea (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (0.87)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(10 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

MERLIN: A Testbed for Multilingual Multimodal Entity Recognition and Linking

Ramamoorthy, Sathyanarayanan, Shah, Vishwa, Khanuja, Simran, Sheikh, Zaid, Jie, Shan, Chia, Ann, Chua, Shearman, Neubig, Graham

arXiv.org Artificial IntelligenceOct-17-2025

This paper introduces MERLIN, a novel testbed system for the task of Multilingual Multimodal Entity Linking. The created dataset includes BBC news article titles, paired with corresponding images, in five languages: Hindi, Japanese, Indonesian, Vietnamese, and Tamil, featuring over 7,000 named entity mentions linked to 2,500 unique Wikidata entities. We also include several benchmarks using multilingual and multimodal entity linking methods exploring different language models like LLaMa-2 and Aya-23. Our findings indicate that incorporating visual data improves the accuracy of entity linking, especially for entities where the textual context is ambiguous or insufficient, and particularly for models that do not have strong multilingual abilities. For the work, the dataset, methods are available here at https://github.com/rsathya4802/merlin

computational linguistic, information retrieval, large language model, (21 more...)

arXiv.org Artificial Intelligence

2510.14307

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology (0.68)
Government (0.68)
Leisure & Entertainment > Sports (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.50)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Add feedback

On the Representations of Entities in Auto-regressive Large Language Models

Morand, Victor, Mothe, Josiane, Piwowarski, Benjamin

arXiv.org Artificial IntelligenceOct-13-2025

Named entities are fundamental building blocks of knowledge in text, grounding factual information and structuring relationships within language. Despite their importance, it remains unclear how Large Language Models (LLMs) internally represent entities. Prior research has primarily examined explicit relationships, but little is known about entity representations themselves. We introduce entity mention reconstruction as a novel framework for studying how LLMs encode and manipulate entities. We investigate whether entity mentions can be generated from internal representations, how multi-token entities are encoded beyond last-token embeddings, and whether these representations capture relational knowledge. Our proposed method, leveraging _task vectors_, allows to consistently generate multi-token mentions from various entity representations derived from the LLMs hidden states. We thus introduce the _Entity Lens_, extending the _logit-lens_ to predict multi-token mentions. Our results bring new evidence that LLMs develop entity-specific mechanisms to represent and manipulate any multi-token entities, including those unseen during training. Our code is avalable at https://github.com/VictorMorand/EntityRepresentations .

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2510.09421

Country:

Europe (0.94)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

FoodSEM: Large Language Model Specialized in Food Named-Entity Linking

Gjorgjevikj, Ana, Martinc, Matej, Cenikj, Gjorgjina, Džeroski, Sašo, Seljak, Barbara Koroušić, Eftimov, Tome

arXiv.org Artificial IntelligenceSep-29-2025

This paper introduces FoodSEM, a state-of-the-art fine-tuned open-source large language model (LLM) for named-entity linking (NEL) to food-related ontologies. To the best of our knowledge, food NEL is a task that cannot be accurately solved by state-of-the-art general-purpose (large) language models or custom domain-specific models/systems. Through an instruction-response (IR) scenario, FoodSEM links food-related entities mentioned in a text to several ontologies, including FoodOn, SNOMED-CT, and the Hansard taxonomy. The FoodSEM model achieves state-of-the-art performance compared to related models/systems, with F1 scores even reaching 98% on some ontologies and datasets. The presented comparative analyses against zero-shot, one-shot, and few-shot LLM prompting baselines further highlight FoodSEM's superior performance over its non-fine-tuned version. By making FoodSEM and its related resources publicly available, the main contributions of this article include (1) publishing a food-annotated corpora into an IR format suitable for LLM fine-tuning/evaluation, (2) publishing a robust model to advance the semantic understanding of text in the food domain, and (3) providing a strong baseline on food NEL for future benchmarking.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.22125

Country: Europe > Slovenia (0.16)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: