Unlearning in- vs. out-of-distribution data in LLMs under gradient-based method

Baluta, Teodora, Lamblin, Pascal, Tarlow, Daniel, Pedregosa, Fabian, Dziugaite, Gintare Karolina

arXiv.org Artificial Intelligence

Machine unlearning aims to remove the influence of selected training examples from a learned model. Despite increasing attention to this problem, it remains an open research question how to evaluate unlearning in large language models (LLMs), and which properties of the data to be unlearned critically affect the quality and efficiency of unlearning. This work formalizes a metric to evaluate unlearning quality in generative models and uses it to assess the trade-offs between unlearning quality and performance. We demonstrate that unlearning out-of-distribution examples requires more unlearning steps but presents a better trade-off overall. For in-distribution examples, however, we observe a rapid decay in performance as unlearning progresses. We further evaluate how an example's memorization and difficulty affect unlearning under a classical gradient-ascent-based approach.
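
The classical gradient-ascent approach referred to here can be pictured with a short sketch: negate the language-modeling loss on the forget set so that each optimizer step pushes the model away from those examples. This is a minimal illustration assuming a Hugging Face-style causal LM whose forward pass returns a loss when labels are supplied; the function and loader names are hypothetical, and the paper's exact procedure and stopping criteria may differ.

```python
# Minimal sketch of gradient-ascent unlearning (illustrative, not the
# paper's exact procedure). Assumes a Hugging Face-style causal LM.
import torch

def unlearn_gradient_ascent(model, forget_loader, optimizer, num_steps):
    """Take gradient *ascent* steps on the forget set's LM loss."""
    model.train()
    step = 0
    for batch in forget_loader:
        if step >= num_steps:
            break
        outputs = model(input_ids=batch["input_ids"],
                        labels=batch["input_ids"])
        # Negate the usual cross-entropy loss so each optimizer step
        # *increases* it, pushing the model away from these examples.
        loss = -outputs.loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
    return model
```

In practice one would interleave evaluations on a retain set, since (as the abstract notes) utility can decay quickly, especially when the forget set is in-distribution.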


Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mechanism

Zhao, Anhao, Ye, Fanghua, Fu, Jinlan, Shen, Xiaoyu

arXiv.org Artificial Intelligence

Large language models (LLMs) exhibit remarkable in-context learning (ICL) capabilities. However, the underlying working mechanism of ICL remains poorly understood. Recent research presents two conflicting views on ICL: one attributes it to LLMs' inherent ability to recognize tasks, deeming label correctness and the number of demonstrations not crucial; the other emphasizes the impact of similar examples in the demonstrations, stressing the need for correct labels and more shots. In this work, we provide a Two-Dimensional Coordinate System that unifies both views into a systematic framework. The framework explains the behavior of ICL through two orthogonal variables: whether LLMs can recognize the task and whether similar examples are presented in the demonstrations. We propose the peak inverse rank metric to detect the task-recognition ability of LLMs and study LLMs' reactions to different definitions of similarity. Based on these, we conduct extensive experiments to elucidate how ICL functions in each quadrant on multiple representative classification tasks. Finally, we extend our analyses to generation tasks, showing that our coordinate system can also effectively interpret ICL for generation tasks.
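
One plausible reading of a peak-inverse-rank-style metric is sketched below: project each layer's hidden state at the final position into vocabulary space (a logit-lens-style projection) and take the best reciprocal rank of the label token across layers. The exact definition here is an assumption for illustration, not necessarily the paper's formulation; a Hugging Face-style model with accessible hidden states and output embeddings is assumed.

```python
# Hypothetical sketch of a "peak inverse rank" style probe: the higher
# the peak, the earlier/more strongly some layer ranks the label token
# near the top of the vocabulary. Illustrative only.
import torch

@torch.no_grad()
def peak_inverse_rank(model, input_ids, label_token_id):
    out = model(input_ids=input_ids, output_hidden_states=True)
    unembed = model.get_output_embeddings().weight  # (vocab, hidden)
    best = 0.0
    for hidden in out.hidden_states:  # one tensor per layer
        # Logit-lens projection of the last position into vocab space
        # (applying the model's final layer norm first is a common
        # refinement, omitted here for brevity).
        logits = hidden[0, -1] @ unembed.T
        rank = (logits > logits[label_token_id]).sum().item() + 1
        best = max(best, 1.0 / rank)  # reciprocal rank; 1.0 = top-1
    return best
```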


Data Generation using Large Language Models for Text Classification: An Empirical Case Study

Li, Yinheng, Bonatti, Rogerio, Abdali, Sara, Wagle, Justin, Koishida, Kazuhito

arXiv.org Artificial Intelligence

Using Large Language Models (LLMs) to generate synthetic data for model training has become increasingly popular in recent years. While LLMs are capable of producing realistic training data, the effectiveness of data generation is influenced by various factors, including the choice of prompt, task complexity, and the quality, quantity, and diversity of the generated data. In this work, we focus exclusively on using synthetic data for text classification tasks. Specifically, we use natural language understanding (NLU) models trained on synthetic data to assess the quality of synthetic data from different generation approaches. This work provides an empirical analysis of the impact of these factors and offers recommendations for better data generation practices.
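
As a concrete picture of prompt-based generation, the sketch below produces labeled examples from a template prompt. `llm_generate` is a hypothetical stand-in for any LLM completion call, and the sentiment-review template is one illustrative choice among the prompt variants such a study might compare; neither is taken from the paper.

```python
# Sketch of prompt-based synthetic data generation for classification.
# `llm_generate` is a hypothetical completion function: str -> str.
def make_synthetic_dataset(llm_generate, labels, n_per_label):
    prompt_template = (
        "Write one short, realistic product review expressing a "
        "{label} sentiment. Review:"
    )
    data = []
    for label in labels:
        for _ in range(n_per_label):
            text = llm_generate(prompt_template.format(label=label))
            data.append({"text": text.strip(), "label": label})
    return data
```

Pairs generated this way would then train an NLU classifier, whose held-out test accuracy serves as the proxy for synthetic-data quality that the abstract describes.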


Retrieval Augmented Instruction Tuning for Open NER with Large Language Models

Xie, Tingyu, Zhang, Jian, Zhang, Yan, Liang, Yuanyuan, Li, Qi, Wang, Hongwei

arXiv.org Artificial Intelligence

The strong capabilities of large language models (LLMs) have been applied to information extraction (IE) through either retrieval-augmented prompting or instruction tuning (IT). However, how best to incorporate such information into LLMs for IE remains an open question. In this paper, we explore Retrieval Augmented Instruction Tuning (RA-IT) for IE, focusing on the task of open named entity recognition (NER). Specifically, for each training sample, we retrieve semantically similar examples from the training dataset as context and prepend them to the input of the original instruction. To evaluate our RA-IT approach more thoroughly, we construct a Chinese IT dataset for open NER and evaluate RA-IT in both English and Chinese scenarios. Experimental results verify the effectiveness of RA-IT across various data sizes in both languages. We also conduct thorough studies of the impact of various retrieval strategies within the proposed RA-IT framework. Code and data are available at: https://github.com/Emma1066/Retrieval-Augmented-IT-OpenNER
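
The retrieval-augmented construction described above can be sketched as follows: embed every training sample, retrieve the top-k most similar ones, and prepend them to the instruction input. The embedding model, field names, and prompt layout below are assumptions for illustration, not the paper's exact configuration.

```python
# Sketch of the retrieval step in RA-IT: nearest-neighbor examples from
# the training set are prepended as context. Embedding model and prompt
# format are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

def build_ra_it_inputs(samples, k=3):
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embs = encoder.encode([s["text"] for s in samples])
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = embs @ embs.T                 # cosine similarity matrix
    np.fill_diagonal(sims, -1.0)         # never retrieve the sample itself
    augmented = []
    for i, s in enumerate(samples):
        top = np.argsort(sims[i])[::-1][:k]
        context = "\n".join(
            f"Text: {samples[j]['text']}\nEntities: {samples[j]['entities']}"
            for j in top
        )
        augmented.append(f"{context}\n\nText: {s['text']}\nEntities:")
    return augmented
```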


Crafting In-context Examples according to LMs' Parametric Knowledge

Lee, Yoonsang, Atreya, Pranav, Ye, Xi, Choi, Eunsol

arXiv.org Artificial Intelligence

In-context learning has been applied to knowledge-rich tasks such as question answering. In such scenarios, in-context examples trigger a behaviour in the language model: it should surface information stored in its parametric knowledge. We study the construction of in-context example sets, with a focus on the model's parametric knowledge of those examples. We identify 'known' examples, which the model can answer correctly from its parametric knowledge, and 'unknown' ones. Our experiments show that prompting with 'unknown' examples decreases performance, potentially because it encourages the model to hallucinate rather than draw on its parametric knowledge. Constructing an in-context example set that presents both known and unknown information performs best across diverse settings. We perform analysis on three multi-answer question answering datasets, which allows us to further study answer-set ordering strategies based on the LM's knowledge about each answer. Together, our study sheds light on how best to construct in-context example sets for knowledge-rich tasks.
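
The known/unknown split can be approximated by probing the model closed-book, as in the sketch below. `answer_closed_book` is a hypothetical helper that asks the model each question with no context, and exact-match comparison is a simplification of how 'known' might actually be judged.

```python
# Sketch: split candidate in-context examples by whether the model can
# answer them from parametric knowledge, then mix both kinds of
# demonstrations in the prompt. Helper names are hypothetical.
def split_by_knowledge(examples, answer_closed_book):
    known, unknown = [], []
    for ex in examples:
        pred = answer_closed_book(ex["question"])
        (known if pred.strip() == ex["answer"] else unknown).append(ex)
    return known, unknown

def build_mixed_prompt(known, unknown, query, n_known=2, n_unknown=2):
    demos = known[:n_known] + unknown[:n_unknown]
    shots = "\n\n".join(f"Q: {d['question']}\nA: {d['answer']}"
                        for d in demos)
    return f"{shots}\n\nQ: {query}\nA:"
```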