AITopics | entity recognition model

TriNER: A Series of Named Entity Recognition Models For Hindi, Bengali & Marathi

Dhamaskar, Mohammed Amaan, Ransing, Rasika

arXiv.org Artificial Intelligence, Feb-6-2025

India's rich cultural and linguistic diversity poses various challenges in the domain of Natural Language Processing (NLP), particularly in Named Entity Recognition (NER). NER is an NLP task that aims to identify and classify tokens into different entity groups like Person, Location, Organization, Number, etc. This makes NER very useful for downstream tasks like context-aware anonymization. This paper details our work to build a multilingual NER model for the three most spoken languages in India: Hindi, Bengali & Marathi. We train a custom transformer model and fine-tune a few pretrained models, achieving an F1 score of 92.11 for a total of 6 entity groups. Through this paper, we aim to introduce a single model to perform NER and significantly reduce the inconsistencies in entity groups and tag names across the three languages.
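
The abstract describes NER as classifying tokens into entity groups. As a rough illustration of what a downstream consumer does with such output, the sketch below groups token-level tags into entity spans. It assumes a BIO tagging scheme, which the abstract does not state; the function name, tag names, and example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_group) pairs.

    Assumes tags look like "B-PER", "I-PER", or "O" (an assumed scheme).
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts: flush any entity still being built.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity currently being built.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close the current entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None

    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


if __name__ == "__main__":
    tokens = ["Asha", "lives", "in", "New", "Delhi"]
    tags = ["B-PER", "O", "O", "B-LOC", "I-LOC"]
    print(bio_to_spans(tokens, tags))
```

A consistent set of tag names across languages, which the paper aims for, is what lets one decoding routine like this serve all three languages.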

artificial intelligence, natural language, text processing

2502.04245

Country:

Asia > India (0.45)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Deep Learning Based Named Entity Recognition Models for Recipes

Goel, Mansi, Agarwal, Ayush, Agrawal, Shubham, Kapuriya, Janak, Konam, Akhil Vamshi, Gupta, Rishabh, Rastogi, Shrey, Niharika, Bagler, Ganesh

arXiv.org Artificial Intelligence, Jun-6-2024

Recipes are cultural capsules transmitted across generations via unstructured text. Automated protocols for recognizing named entities, the building blocks of recipe text, are of immense value for various applications ranging from information extraction to novel recipe generation. Named entity recognition is a technique for extracting information from unstructured or semi-structured data with known labels. Starting with manually-annotated data of 6,611 ingredient phrases, we created an augmented dataset of 26,445 phrases cumulatively. Simultaneously, we systematically cleaned and analyzed ingredient phrases from RecipeDB, the gold-standard recipe data repository, and annotated them using Stanford NER. Based on the analysis, we sampled a subset of 88,526 phrases using a clustering-based approach while preserving the diversity to create the machine-annotated dataset. A thorough investigation of NER approaches on these three datasets, involving statistical methods, fine-tuning of deep learning-based language models, and few-shot prompting on large language models (LLMs), provides deep insights. We conclude that few-shot prompting on LLMs has abysmal performance, whereas the fine-tuned spaCy-transformer emerges as the best model with macro-F1 scores of 95.9%, 96.04%, and 95.71% for the manually-annotated, augmented, and machine-annotated datasets, respectively.
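
The results above are reported as macro-F1 scores. For readers unfamiliar with the metric, here is a minimal sketch that computes it as the unweighted mean of per-label F1. It works at the token level with made-up labels; the abstract does not say whether the paper scores tokens or whole entities, so treat the granularity, the function name, and the label names as assumptions.

```python
from typing import List


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        # Define precision/recall as 0 when their denominator is 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


if __name__ == "__main__":
    # Hypothetical ingredient-phrase labels, not the paper's tag set.
    gold = ["NAME", "QTY", "UNIT", "NAME"]
    pred = ["NAME", "QTY", "NAME", "NAME"]
    print(round(macro_f1(gold, pred), 3))
```

Because every label counts equally, a rare label that the model misses entirely pulls the macro average down as much as a frequent one would.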

dataset, entity recognition, ingredient phrase

2402.17447

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Distilling Named Entity Recognition Models for Endangered Species from Large Language Models

Atuhurra, Jesse, Dujohn, Seiveright Cargill, Kamigaito, Hidetaka, Shindo, Hiroyuki, Watanabe, Taro

arXiv.org Artificial Intelligence, Mar-13-2024

Natural language processing (NLP) practitioners are leveraging large language models (LLMs) to create structured datasets from semi-structured and unstructured data sources such as patents, papers, and theses, without having domain-specific knowledge. At the same time, ecological experts are searching for a variety of means to preserve biodiversity. To contribute to these efforts, we focused on endangered species and, through in-context learning, distilled knowledge from GPT-4. In effect, we created datasets for both named entity recognition (NER) and relation extraction (RE) via a two-stage process: 1) we generated synthetic data from GPT-4 for four classes of endangered species, and 2) humans verified the factual accuracy of the synthetic data, resulting in gold data. The resulting dataset contains a total of 3.6K sentences, evenly divided between 1.8K NER and 1.8K RE sentences. Because GPT-4 is resource intensive, the constructed dataset was then used to fine-tune both general BERT and domain-specific BERT variants, completing the knowledge distillation process from GPT-4 to BERT. Experiments show that our knowledge transfer approach is effective at creating an NER model suitable for detecting endangered species from texts.

evaluation, gpt-4, information

2403.1543

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (0.51)

Industry: Law > Environmental Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty

Zhang, Zhen, Zhao, Yuhua, Gao, Hang, Hu, Mengting

arXiv.org Artificial Intelligence, Feb-27-2024

Named Entity Recognition (NER) serves as a fundamental task in natural language understanding, bearing direct implications for web content analysis, search engines, and information retrieval systems. Fine-tuned NER models exhibit satisfactory performance on standard NER benchmarks. However, due to limited fine-tuning data and lack of knowledge, they perform poorly at recognizing unseen entities. As a result, the usability and reliability of NER models in web-related applications are compromised. In contrast, Large Language Models (LLMs) like GPT-4 possess extensive external knowledge, but research indicates that they lack specialty for NER tasks. Furthermore, non-public and large-scale weights make tuning LLMs difficult. To address these challenges, we propose a framework that combines small fine-tuned models with LLMs (LinkNER) and an uncertainty-based linking strategy called RDC that enables fine-tuned models to complement black-box LLMs, achieving better performance. We experiment with both standard NER test sets and noisy social media datasets. LinkNER enhances NER task performance, notably surpassing SOTA models in robustness tests. We also quantitatively analyze the influence of key components like uncertainty estimation methods, LLMs, and in-context learning on diverse NER tasks, offering specific web-related recommendations.
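
The general idea of uncertainty-based linking can be sketched as follows: keep the local model's label when it is confident, and defer to an LLM otherwise. This is not the paper's RDC strategy, whose details the abstract does not give. The entropy measure, the threshold value, and the `ask_llm` callback are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict


def entropy(probs: Dict[str, float]) -> float:
    """Shannon entropy (natural log) of a label probability distribution."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)


def link_prediction(
    span: str,
    probs: Dict[str, float],
    ask_llm: Callable[[str], str],
    threshold: float = 0.5,  # assumed cut-off, would need tuning in practice
) -> str:
    """Return the local model's label unless its uncertainty exceeds threshold."""
    local_label = max(probs, key=probs.get)
    if entropy(probs) > threshold:
        # Uncertain span: hand it to the (hypothetical) LLM classifier.
        return ask_llm(span)
    return local_label


if __name__ == "__main__":
    # Stand-in for a real LLM call; always answers "ORG" in this demo.
    def fake_llm(span: str) -> str:
        return "ORG"

    confident = {"PER": 0.98, "LOC": 0.01, "ORG": 0.01}
    uncertain = {"PER": 0.40, "LOC": 0.35, "ORG": 0.25}
    print(link_prediction("Asha", confident, fake_llm))
    print(link_prediction("Jaguar", uncertain, fake_llm))
```

Routing only the uncertain spans keeps LLM calls to a minimum, which matters when the LLM is a black box accessed at a per-request cost.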

dataset, linkner, llm

2402.10573

Country:

Asia > Singapore > Central Region > Singapore (0.05)
Asia > China > Tianjin Province > Tianjin (0.05)
North America > Canada > Manitoba (0.05)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Training a Named Entity Recognition Model Without Data

#artificialintelligence, Feb-12-2023, 01:05:18 GMT

Named Entity Recognition (NER) is the task of recognizing entity names, such as person names, locations, and organizations, within a text. This task serves as a fundamental module for various NLP applications including chatbots, search engines, and translation systems. We can find NER datasets for generic entities easily, but obtaining data for specific domains can be challenging. Labeling NER data is more difficult than simple text classification, making it challenging to create large-scale domain-specific NER datasets. In this post, I will demonstrate how to train an NER model without any labeled data.

dataset, entity name, ner dataset

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)

Researchers claim bias in AI named entity recognition models

#artificialintelligence, Aug-12-2020, 16:20:21 GMT

Twitter researchers claim to have found evidence of demographic bias in named entity recognition, the first step toward generating automated knowledge bases, or the repositories leveraged by services like search engines. They say their analysis reveals AI performs better at identifying names from specific groups and the biases manifest in syntax, semantics, and how word uses vary across linguistic contexts. Knowledge bases are essentially databases containing information about entities -- people, places, and things. In 2012, Google launched a knowledge base -- the Knowledge Graph -- to enhance Google search results with hundreds of billions of facts gathered from sources including Wikipedia, Wikidata, and CIA World Factbook. Microsoft provides a knowledge base with over 150,000 articles created by support professionals who've resolved issues for its customers. But while the usefulness of knowledge bases is not in dispute, the researchers assert the embeddings used to represent entities in them exhibit bias against certain groups of people.

entity recognition model, information retrieval, natural language

Country: North America > United States > Massachusetts (0.05)

Genre: Research Report > New Finding (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.98)
