AITopics | label word

Collaborating Authors

label word

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Explicit Knowledge-Guided In-Context Learning for Early Detection of Alzheimer's Disease

Su, Puzhen, Miao, Yongzhu, Guo, Chunxi, Tang, Jintao, Li, Shasha, Wang, Ting

arXiv.org Artificial Intelligence, Nov-11-2025

Detecting Alzheimer's Disease (AD) from narrative transcripts remains a challenging task for large language models (LLMs), particularly under out-of-distribution (OOD) and data-scarce conditions. While in-context learning (ICL) provides a parameter-efficient alternative to fine-tuning, existing ICL approaches often suffer from task recognition failure, suboptimal demonstration selection, and misalignment between label words and task objectives, issues that are amplified in clinical domains like AD detection. We propose Explicit Knowledge In-Context Learners (EK-ICL), a novel framework that integrates structured explicit knowledge to enhance reasoning stability and task alignment in ICL. EK-ICL incorporates three knowledge components: confidence scores derived from small language models (SLMs) to ground predictions in task-relevant patterns, parsing feature scores to capture structural differences and improve demo selection, and label word replacement to resolve semantic misalignment with LLM priors. In addition, EK-ICL employs a parsing-based retrieval strategy and ensemble prediction to mitigate the effects of semantic homogeneity in AD transcripts. Extensive experiments across three AD datasets demonstrate that EK-ICL significantly outperforms state-of-the-art fine-tuning and ICL baselines. Further analysis reveals that ICL performance in AD detection is highly sensitive to the alignment of label semantics and task-specific context, underscoring the importance of explicit knowledge in clinical reasoning under low-resource conditions.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.06215

Country: Asia > China (0.46)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Integration of Old and New Knowledge for Generalized Intent Discovery: A Consistency-driven Prototype-Prompting Framework

Wei, Xiao, Wang, Xiaobao, Zhuang, Ning, Wang, Chenyang, Wang, Longbiao, Dang, Jianwu

arXiv.org Artificial Intelligence, Jun-11-2025

Intent detection aims to identify user intents from natural language inputs, where supervised methods rely heavily on labeled in-domain (IND) data and struggle with out-of-domain (OOD) intents, limiting their practical applicability. Generalized Intent Discovery (GID) addresses this by leveraging unlabeled OOD data to discover new intents without additional annotation. However, existing methods focus solely on clustering unsupervised data while neglecting domain adaptation. Therefore, we propose a consistency-driven prototype-prompting framework for GID from the perspective of integrating old and new knowledge, which includes a prototype-prompting framework for transferring old knowledge from external sources, and a hierarchical consistency constraint for learning new knowledge from target domains. We conducted extensive experiments and the results show that our method significantly outperforms all baseline methods, achieving state-of-the-art results, which strongly demonstrates the effectiveness and generalization of our methods. Our source code is publicly available at https://github.com/smileix/cpp.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2506.0849

Country: Asia > China (0.29)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Inference and Verbalization Functions During In-Context Learning

Tao, Junyi, Chen, Xiaoyin, Liu, Nelson F.

arXiv.org Artificial Intelligence, Oct-11-2024

Large language models (LMs) are capable of in-context learning from a few demonstrations (example-label pairs) to solve new tasks during inference. Despite the intuitive importance of high-quality demonstrations, previous work has observed that, in some settings, ICL performance is minimally affected by irrelevant labels (Min et al., 2022). We hypothesize that LMs perform ICL with irrelevant labels via two sequential processes: an inference function that solves the task, followed by a verbalization function that maps the inferred answer to the label space. Importantly, we hypothesize that the inference function is invariant to remappings of the label space (e.g., "true"/"false" to "cat"/"dog"), enabling LMs to share the same inference function across settings with different label words. We empirically validate this hypothesis with controlled layer-wise interchange intervention experiments. Our findings confirm the hypotheses on multiple datasets and tasks (natural language inference, sentiment analysis, and topic classification) and further suggest that the two functions can be localized in specific layers across various open-sourced models, including GEMMA-7B, MISTRAL-7B-V0.3, GEMMA-2-27B, and LLAMA-3.1-70B.
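The two-stage hypothesis in this abstract can be pictured with a toy sketch. This is only an illustration of the idea, not the paper's interchange-intervention method: `inference`, `make_verbalizer`, and `predict` are hypothetical names, and the keyword rule is a stand-in for whatever a real model computes internally.

```python
from typing import Callable, Dict


def inference(text: str) -> int:
    """Toy stand-in for the inference function: returns an abstract class index.

    It never sees the label words, so it is unchanged by label remapping.
    """
    return 1 if "great" in text.lower() else 0


def make_verbalizer(label_words: Dict[int, str]) -> Callable[[int], str]:
    """Build a verbalization function mapping a class index to a label word."""
    return lambda index: label_words[index]


def predict(text: str, verbalize: Callable[[int], str]) -> str:
    """Compose the two stages: infer the answer first, then verbalize it."""
    return verbalize(inference(text))


# Same inference step, two different label spaces (as in the abstract's example).
original = predict("A great film", make_verbalizer({0: "false", 1: "true"}))
remapped = predict("A great film", make_verbalizer({0: "cat", 1: "dog"}))
```

Only the verbalizer changes between the two calls, which is the invariance the authors hypothesize for the inference function.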

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.09349

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Sports (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Manual Verbalizer Enrichment for Few-Shot Text Classification

Nguyen, Quang Anh, Tomeh, Nadi, Lebbah, Mustapha, Charnois, Thierry, Azzag, Hanene, Muñoz, Santiago Cordoba

arXiv.org Artificial Intelligence, Oct-8-2024

With the continuous development of pre-trained language models, prompt-based training has become a well-adopted paradigm that drastically improves the exploitation of models for many natural language processing tasks. Prompting also shows great performance compared to traditional fine-tuning when adapted to zero-shot or few-shot scenarios where the amount of annotated data is limited. In this framework, the role of verbalizers is essential, as they map masked-word distributions to output predictions. In this work, we propose MaVE, an approach for verbalizer construction by enrichment of class labels using neighborhood relations in the word embedding space for the text classification task. In addition, we elaborate a benchmarking procedure to evaluate typical baselines of verbalizers for document classification in few-shot learning contexts. Our model achieves state-of-the-art results while using significantly fewer resources. We show that our approach is particularly effective in cases with extremely limited supervision data.

artificial intelligence, natural language, text classification, (18 more...)

arXiv.org Artificial Intelligence

2410.06173

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (0.68)
Media (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)

Prompting Encoder Models for Zero-Shot Classification: A Cross-Domain Study in Italian

Auriemma, Serena, Miliani, Martina, Madeddu, Mauro, Bondielli, Alessandro, Passaro, Lucia, Lenci, Alessandro

arXiv.org Artificial Intelligence, Jul-30-2024

Pre-trained LMs have had a significant impact on Natural Language Processing (NLP), with the "pre-train and fine-tune" paradigm rapidly becoming the predominant approach for applying effective models to a wide variety of downstream tasks [1-3, inter alia]. However, one of the main concerns when working with LMs is the paucity of annotated data, especially for specific domains or low-resource languages, required to fine-tune the additional classification layer on top of these models for downstream tasks, such as classification. Recently, prompt-based tuning has started to establish itself as a promising way to perform similar tasks, significantly reducing the need for annotated data. This approach has proven to be very effective with Large Language Models (LLMs) [4]. However, it is often the case that LLMs are not available for low-resource languages, and that their performance drastically decreases when they are challenged on specific domains. Moreover, in the Digital Transformation era, businesses frequently need to integrate artificial intelligence systems into their application ecosystems. This requires them to utilize specialized, publicly available models while also employing effective methods to leverage these models in scenarios where annotated language resources are unavailable, thereby operating in a zero-shot mode. Hence, we decided to evaluate two smaller domain-specific encoder models: BureauBERTo [5], an LM further pre-trained on Italian bureaucratic texts (i.e., administrative acts, banking and insurance documents), and Italian Legal BERT [6] (henceforth referred to as Ita-Legal-BERT), an LM adapted to the Italian legal domain, on various classification tasks on domain-specific data, exploiting a prompt-based technique in a zero-shot scenario. Additionally, we compared the performance of both models with that of a generic Italian model, UmBERTo.

calibration, knowledgeable verbalizer, verbalizer, (17 more...)

arXiv.org Artificial Intelligence

2407.20654

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Law > Statutes (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

MICL: Improving In-Context Learning through Multiple-Label Words in Demonstration

Zixiao, Zhu, Zijian, Feng, Hanzhang, Zhou, Junlang, Qian, Kezhi, Mao

arXiv.org Artificial Intelligence, Jun-16-2024

In-context learning (ICL) enables large language models (LLMs) to perform new tasks by using sample-label pairs as demonstrations. However, variations in demonstrations can lead to significantly different performances. Current research mainly focuses on selecting demonstration samples, preassuming the class name to be the label word when creating sample-label pairs. However, the choice of label words is crucial for ICL performance. In addition, we observe that using a single class name in demonstration may not yield optimal results. In this paper, we propose to use multiple label words in one sample-label pair to enhance ICL performance. Further, we select and order sample-label pairs based on LLM's output distribution, aiming to optimize the demonstration examples from both the samples' and labels' perspectives. Evaluation results on seven classification datasets show that the use of multiple label words, strategically organized by their selection, order and quantity, improves ICL performance through diverse label information.
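A minimal sketch of what "multiple label words in one sample-label pair" could look like when assembling a prompt. The prompt template, the helper names `format_demo` and `build_prompt`, and the example label words are all assumptions for illustration; the abstract does not specify them, and the selection and ordering step based on the LLM's output distribution is not modeled here.

```python
from typing import List, Tuple


def format_demo(sample: str, label_words: List[str]) -> str:
    """Render one sample-label pair whose label slot holds several label words."""
    return f"Input: {sample}\nLabel: {', '.join(label_words)}"


def build_prompt(demos: List[Tuple[str, List[str]]], query: str) -> str:
    """Concatenate the demonstrations, then append the unlabeled query."""
    blocks = [format_demo(sample, words) for sample, words in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


# Hypothetical sentiment demonstrations, each with three label words.
demos = [
    ("The plot was gripping from start to finish.", ["positive", "good", "great"]),
    ("I wanted my two hours back.", ["negative", "bad", "terrible"]),
]
prompt = build_prompt(demos, "A charming, well-acted film.")
```

The resulting string would be sent to an LLM as-is; a single-label baseline corresponds to passing one-element lists.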

class name, demonstration, label word, (16 more...)

arXiv.org Artificial Intelligence

2406.10908

Country:

Asia > Singapore (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-Prompting Decoder Helps Better Language Understanding

Cheng, Zifeng, Chen, Zhaoling, Jiang, Zhiwei, Yin, Yafeng, Ge, Shiping, Liu, Yuliang, Gu, Qing

arXiv.org Artificial Intelligence, Jun-10-2024

Recent Pre-trained Language Models (PLMs) usually only provide users with the inference APIs, namely the emerging Model-as-a-Service (MaaS) setting. To adapt MaaS PLMs to downstream tasks without accessing their parameters and gradients, some existing methods focus on the output-side adaptation of PLMs, viewing the PLM as an encoder and then optimizing a task-specific decoder for decoding the output hidden states and class scores of the PLM. Despite the effectiveness of these methods, they only use a single prompt to query PLMs for decoding, leading to a heavy reliance on the quality of the adopted prompt. In this paper, we propose a simple yet effective Multi-Prompting Decoder (MPD) framework for MaaS adaptation. The core idea is to query PLMs with multiple different prompts for each sample, thereby obtaining multiple output hidden states and class scores for subsequent decoding. Such multi-prompting decoding paradigm can simultaneously mitigate reliance on the quality of a single prompt, alleviate the issue of data scarcity under the few-shot setting, and provide richer knowledge extracted from PLMs. Specifically, we propose two decoding strategies: multi-prompting decoding with optimal transport for hidden states and calibrated decoding for class scores. Extensive experiments demonstrate that our method achieves new state-of-the-art results on multiple natural language understanding datasets under the few-shot setting.

class score, plm, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2406.06279

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.67)
Leisure & Entertainment (0.46)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

GNNavi: Navigating the Information Flow in Large Language Models by Graph Neural Network

Yuan, Shuzhou, Nie, Ercong, Färber, Michael, Schmid, Helmut, Schütze, Hinrich

arXiv.org Artificial Intelligence, Jun-7-2024

Large Language Models (LLMs) exhibit strong In-Context Learning (ICL) capabilities when prompts with demonstrations are used. However, fine-tuning still remains crucial to further enhance their adaptability. Prompt-based fine-tuning proves to be an effective fine-tuning method in low-data scenarios, but high demands on computing resources limit its practicality. We address this issue by introducing a prompt-based parameter-efficient fine-tuning (PEFT) approach. GNNavi leverages insights into ICL's information flow dynamics, which indicates that label words act in prompts as anchors for information propagation. GNNavi employs a Graph Neural Network (GNN) layer to precisely guide the aggregation and distribution of information flow during the processing of prompts by hardwiring the desired information flow into the GNN. Our experiments on text classification tasks with GPT-2 and Llama2 show GNNavi surpasses standard prompt-based fine-tuning methods in few-shot settings by updating just 0.2% to 0.5% of parameters. We compare GNNavi with prevalent PEFT approaches, such as prefix tuning, LoRA and Adapter in terms of performance and efficiency. Our analysis reveals that GNNavi enhances information flow and ensures a clear aggregation process.

computational linguistic, information flow, training example, (12 more...)

arXiv.org Artificial Intelligence

2402.11709

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(12 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Individual Text Corpora Predict Openness, Interests, Knowledge and Level of Education

Hofmann, Markus J., Jansen, Markus T., Wigbels, Christoph, Briesemeister, Benny, Jacobs, Arthur M.

arXiv.org Artificial Intelligence, Mar-29-2024

Here we examine whether the personality dimension of openness to experience can be predicted from an individual's Google search history. By web scraping, individual text corpora (ICs) were generated from 214 participants with a mean number of 5 million word tokens. We trained word2vec models and used the similarities of each IC to label words, which were derived from a lexical approach of personality. These IC-label-word similarities were utilized as predictive features in neural models. For training and validation, we relied on 179 participants and held out a test sample of 35 participants. A grid search with varying number of predictive features, hidden units and boost factor was performed. As model selection criterion, we used R2 in the validation samples penalized by the absolute R2 difference between training and validation. The selected neural model explained 35% of the openness variance in the test sample, while an ensemble model with the same architecture often provided slightly more stable predictions for intellectual interests, knowledge in humanities and level of education. Finally, a learning curve analysis suggested that around 500 training participants are required for generalizable predictions. We discuss ICs as a complement or replacement of survey-based psychodiagnostics.
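The model selection criterion in this abstract can be sketched as follows, assuming "penalized by" means subtracting the absolute train/validation R2 gap from the validation R2. The abstract does not give the exact formula, so this form, the function name `penalized_r2`, and the candidate values are all hypothetical.

```python
def penalized_r2(r2_train: float, r2_val: float) -> float:
    """Validation R2 minus the absolute train/validation R2 gap (assumed form)."""
    return r2_val - abs(r2_train - r2_val)


# Hypothetical grid-search candidates: (name, training R2, validation R2).
candidates = [
    ("A", 0.90, 0.40),  # fits training data well but generalizes poorly
    ("B", 0.45, 0.38),  # slightly lower validation R2, much smaller gap
]

# Pick the candidate with the highest penalized score.
best = max(candidates, key=lambda c: penalized_r2(c[1], c[2]))
```

Under this assumed form, a model with a large gap between training and validation R2 scores worse than one with a slightly lower validation R2 and a small gap, which discourages selecting overfitted configurations.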

correlation, openness, participant, (17 more...)

arXiv.org Artificial Intelligence

2404.00165

Country:

North America > United States > Hawaii (0.04)
Europe > Germany > Thuringia > Erfurt (0.04)
Europe > Germany > Saxony > Leipzig (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.66)
