AITopics | retain set

Collaborating Authors

retain set

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models

Niu, Peizhi, Ma, Evelyn, Zhou, Huiting, Zhou, Duo, Zhang, Huan, Etesami, S. Rasoul, Milenkovic, Olgica

arXiv.org Artificial IntelligenceOct-23-2025

Unlearning in large language models is becoming increasingly important due to regulatory compliance, copyright protection, and privacy concerns. However, a key challenge in LLM unlearning is unintended forgetting, where the removal of specific data inadvertently impairs the utility of the model and its retention of valuable, desired information. While prior work has primarily focused on architectural innovations, the influence of data-level factors on unlearning performance remains underexplored. As a result, existing methods often suffer from degraded retention when forgetting high-impact data. To address this problem, we propose GUARD, a novel framework for Guided Unlearning And Retention via Data attribution. At its core, GUARD introduces a lightweight proxy data attribution metric tailored for LLM unlearning, which quantifies the alignment between the Forget and Retain sets while remaining computationally efficient. Building on this, we design a novel unlearning objective that assigns adaptive, nonuniform unlearning weights to samples, inversely proportional to their proxy attribution scores. Through such a reallocation of unlearning power, GUARD mitigates unintended retention loss. We also provide rigorous theoretical guarantees that GUARD significantly improves retention while maintaining forgetting metrics comparable to prior methods. Extensive experiments on the TOFU and MUSE benchmarks across multiple LLM architectures demonstrate that GUARD reduces utility sacrifice on the TOFU Retain Set by up to 194.92 percent in terms of Truth Ratio when forgetting 10 percent of the training data, and improves knowledge retention on the MUSE NEWS Retain Set by 16.20 percent, with comparable or very moderate increases in privacy loss compared to state-of-the-art methods.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2506.10946

Country: North America > United States > Illinois (0.14)

Genre: Research Report > Promising Solution (0.48)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models

Liu, Zheyuan, Dou, Guangyao, Yuan, Xiangchi, Zhang, Chunhui, Tan, Zhaoxuan, Jiang, Meng

arXiv.org Artificial IntelligenceFeb-21-2025

Generative models such as Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) trained on massive datasets can lead them to memorize and inadvertently reveal sensitive information, raising ethical and privacy concerns. While some prior works have explored this issue in the context of LLMs, it presents a unique challenge for MLLMs due to the entangled nature of knowledge across modalities, making comprehensive unlearning more difficult. To address this challenge, we propose Modality Aware Neuron Unlearning (MANU), a novel unlearning framework for MLLMs designed to selectively clip neurons based on their relative importance to the targeted forget data, curated for different modalities. Specifically, MANU consists of two stages: important neuron selection and selective pruning. The first stage identifies and collects the most influential neurons across modalities relative to the targeted forget knowledge, while the second stage is dedicated to pruning those selected neurons. MANU effectively isolates and removes the neurons that contribute most to the forget data within each modality, while preserving the integrity of retained knowledge. Our experiments conducted across various MLLM architectures illustrate that MANU can achieve a more balanced and comprehensive unlearning in each modality without largely affecting the overall model utility.

arxiv preprint arxiv, knowledge, model utility, (15 more...)

arXiv.org Artificial Intelligence

2502.1591

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench

Liu, Zheyuan, Dou, Guangyao, Jia, Mengzhao, Tan, Zhaoxuan, Zeng, Qingkai, Yuan, Yongle, Jiang, Meng

arXiv.org Artificial IntelligenceOct-29-2024

Generative models such as Large Language Models (LLM) and Multimodal Large Language models (MLLMs) trained on massive web corpora can memorize and disclose individuals' confidential and private data, raising legal and ethical concerns. While many previous works have addressed this issue in LLM via machine unlearning, it remains largely unexplored for MLLMs. To tackle this challenge, we introduce Multimodal Large Language Model Unlearning Benchmark (MLLMU-Bench), a novel benchmark aimed at advancing the understanding of multimodal machine unlearning. MLLMU-Bench consists of 500 fictitious profiles and 153 profiles for public celebrities, each profile feature over 14 customized question-answer pairs, evaluated from both multimodal (image+text) and unimodal (text) perspectives. The benchmark is divided into four sets to assess unlearning algorithms in terms of efficacy, generalizability, and model utility. Finally, we provide baseline results using existing generative model unlearning algorithms. Surprisingly, our experiments show that unimodal unlearning algorithms excel in generation and cloze tasks, while multimodal unlearning approaches perform better in classification tasks with multimodal inputs.

accuracy, classification, retain set, (15 more...)

arXiv.org Artificial Intelligence

2410.22108

Country:

Asia > Vietnam > Hanoi > Hanoi (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback