AITopics | familiarity score

Collaborating Authors

familiarity score

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Zero-Resource Hallucination Prevention for Large Language Models

Luo, Junyu, Xiao, Cao, Ma, Fenglong

arXiv.org Artificial IntelligenceOct-7-2023

The prevalent use of large language models (LLMs) in various domains has drawn attention to the issue of "hallucination," which refers to instances where LLMs generate factually inaccurate or ungrounded information. Existing techniques for hallucination detection in language assistants rely on intricate fuzzy, specific free-language-based chain of thought (CoT) techniques or parameter-based methods that suffer from interpretability issues. Additionally, the methods that identify hallucinations post-generation could not prevent their occurrence and suffer from inconsistent performance due to the influence of the instruction format and model style. In this paper, we introduce a novel pre-detection self-evaluation technique, referred to as SELF-FAMILIARITY, which focuses on evaluating the model's familiarity with the concepts present in the input instruction and withholding the generation of response in case of unfamiliar concepts. This approach emulates the human ability to refrain from responding to unfamiliar topics, thus reducing hallucinations. We validate SELF-FAMILIARITY across four different large language models, demonstrating consistently superior performance compared to existing techniques. Our findings propose a significant shift towards preemptive strategies for hallucination mitigation in LLM assistants, promising improvements in reliability, applicability, and interpretability.

arxiv preprint arxiv, familiarity score, language model, (15 more...)

arXiv.org Artificial Intelligence

2309.02654

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Government Relations & Public Policy (0.46)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Building for Tomorrow: Assessing the Temporal Persistence of Text Classifiers

Alkhalifa, Rabab, Kochkina, Elena, Zubiaga, Arkaitz

arXiv.org Artificial IntelligenceNov-19-2022

A supervised text classification model relies on labelled datasets to train the model (Sebastiani, 2002). From an experimental perspective, the design and evaluation of classification models typically rely on data pertaining to fixed periods of time. Recent research demonstrates that such models, while showing competitive performance in their experimental environment, underperform when they need to classify new data that is distant in time from that observed during training (Alkhalifa and Zubiaga, 2022). This deterioration of performance has been demonstrated for different classification tasks, including topic classification (Rocha, Mourão, Pereira, Gonçalves, and Meira, 2008), sentiment classification (Lukes and Søgaard, 2018), hate speech detection (Florio, Basile, Polignano, Basile, and Patti, 2020), stance detection (Alkhalifa, Kochkina, and Zubiaga, 2021) and political ideology detection (Röttger and Pierrehumbert, 2021). This performance drop can happen for multiple reasons, including among others the evolution in language use (Smith, 2004) or the evolution of public opinion (Bonilla and Mo, 2019) and its extent may vary (Alkhalifa et al., 2021). This poses an important challenge and limitation on such models when one plans to continue using the model over a long period of time to classify new, incoming data, as can be the case with a stream of user-generated contents (Cheng, Chen, Lee, and Li, 2021).

machine learning, natural language, text classification, (22 more...)

arXiv.org Artificial Intelligence

2205.05435

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry: Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)

Add feedback