model utility
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > California (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (2 more...)
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.05)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (2 more...)
- Information Technology > Security & Privacy (1.00)
- Law (0.68)
Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
The techniques of machine unlearning, also known as concept erasing, have been developed to address these risks. However, these techniques remain vulnerable to adversarial prompt attacks, which can prompt DMs post-unlearning to regenerate undesired images containing concepts (such as nudity) meant to be erased. This work aims to enhance the robustness of concept erasing by integrating the principle of adversarial training (AT) into machine unlearning, resulting in the robust unlearning framework referred to as AdvUnlearn. However, achieving this effectively and efficiently is highly nontrivial. First, we find that a straightforward implementation of AT compromises DMs' image generation quality post-unlearning. To address this, we develop a utility-retaining regularization on an additional retain set, optimizing the trade-off between concept erasure robustness and model utility in AdvUnlearn.
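The interplay the abstract describes, an inner adversarial step that tries to re-activate the erased concept and an outer update balanced by a utility-retaining penalty on a retain set, can be pictured on a toy linear "model". Everything below (the linear map, the one-step attack, the loss weights) is an illustrative stand-in, not AdvUnlearn's actual diffusion-model objective:

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    return v / np.linalg.norm(v)

def concept_score(W, e):
    # response of the toy model W to embedding e along the concept direction
    return float(concept @ (W @ e))

concept = unit(rng.normal(size=4))       # embedding of the concept to erase
retain_set = rng.normal(size=(8, 4))
retain_set -= np.outer(retain_set @ concept, concept)  # toy: retained behavior off-concept
W = rng.normal(size=(4, 4))
W += (1.0 - concept_score(W, concept)) * np.outer(concept, concept)  # start with score 1
W_orig = W.copy()

lam, lr, eps = 1.0, 0.1, 0.3
cov = retain_set.T @ retain_set / len(retain_set)
for _ in range(300):
    # inner (adversarial) step: nudge the prompt embedding toward whatever
    # still activates the concept (a crude one-step attack)
    e_adv = concept + eps * unit(W.T @ concept)
    # outer step: descend on score(e_adv)^2 plus the utility-retaining
    # penalty ||(W - W_orig) x||^2 averaged over the retain set
    g_erase = 2 * concept_score(W, e_adv) * np.outer(concept, e_adv)
    g_retain = 2 * lam * (W - W_orig) @ cov
    W -= lr * (g_erase + g_retain)
```

After training, the toy model's response to even the adversarially chosen embedding is driven near zero, while outputs on the retain set stay close to the original model, which is the trade-off the regularization is meant to optimize.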
SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities
Nguyen, Dung Thuy, Nguyen, Quang, Robinette, Preston K., Jiang, Eli, Johnson, Taylor T., Leach, Kevin
Recent advances in 3D-aware generative models have enabled high-fidelity image synthesis of human identities. However, this progress raises urgent questions around user consent and the ability to remove specific individuals from a model's output space. We address this by introducing SUGAR, a framework for scalable generative unlearning that enables the removal of many identities (simultaneously or sequentially) without retraining the entire model. Rather than projecting unwanted identities to unrealistic outputs or relying on static template faces, SUGAR learns a personalized surrogate latent for each identity, diverting reconstructions to visually coherent alternatives while preserving the model's quality and diversity. We further introduce a continual utility preservation objective that guards against degradation as more identities are forgotten. SUGAR achieves state-of-the-art performance in removing up to 200 identities, while delivering up to a 700% improvement in retention utility compared to existing baselines.
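As a toy illustration of the surrogate-latent idea (the 32-dim latents, cosine-similarity threshold, and matching rule below are invented for the sketch, not taken from SUGAR):

```python
import numpy as np

rng = np.random.default_rng(1)

def unit(v):
    return v / np.linalg.norm(v)

# One surrogate latent per forgotten identity (200 identities, matching the
# paper's largest setting); each surrogate is a distinct latent so that
# reconstructions are diverted somewhere coherent rather than degenerate.
forgotten = {f"id_{i}": unit(rng.normal(size=32)) for i in range(200)}
surrogates = {name: unit(rng.normal(size=32)) for name in forgotten}

def divert(latent, threshold=0.9):
    """Route a reconstruction: if the input latent matches a forgotten
    identity, return that identity's personalized surrogate latent;
    otherwise pass the latent through unchanged."""
    z = unit(latent)
    for name, ident in forgotten.items():
        if float(z @ ident) > threshold:
            return surrogates[name]
    return latent
```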
- North America > United States (0.67)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
- Information Technology > Security & Privacy (1.00)
- Law (0.93)
- Government > Regional Government > North America Government > United States Government (0.46)
MAR-FL: A Communication Efficient Peer-to-Peer Federated Learning System
Mulitze, Felix, Woisetschläger, Herbert, Jacobsen, Hans Arno
The convergence of next-generation wireless systems and distributed Machine Learning (ML) demands Federated Learning (FL) methods that remain efficient and robust with wirelessly connected peers and under network churn. Peer-to-peer (P2P) FL removes the bottleneck of a central coordinator, but existing approaches suffer from excessive communication complexity, limiting their scalability in practice. We introduce MAR-FL, a novel P2P FL system that leverages iterative group-based aggregation to substantially reduce communication overhead while retaining resilience to churn. MAR-FL achieves communication costs that scale as O(N log N), contrasting with the O(N^2) complexity of existing baselines, and thereby maintains effectiveness especially as the number of peers in an aggregation round grows. The system is robust to unreliable FL clients and can integrate private computing.
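MAR-FL's exact group-based protocol is not reproduced here, but the claimed O(N log N) scaling has a classic decentralized analogue, recursive-doubling all-reduce, which this sketch uses purely for intuition (the power-of-two peer count and XOR partner rule are simplifications of the sketch, not MAR-FL details):

```python
def recursive_doubling_allreduce(values):
    """Aggregate `values` across n peers with no central server.
    In round k, peer i combines its partial sum with that of peer i XOR 2^k,
    so each round costs n messages and log2(n) rounds suffice:
    n * log2(n) messages total, versus n * (n - 1) for naive all-to-all."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "power-of-two peers in this sketch"
    vals, sent, k = list(values), 0, 1
    while k < n:
        # every peer exchanges with its partner for this round
        vals = [vals[i] + vals[i ^ k] for i in range(n)]
        sent += n
        k <<= 1
    return vals, sent  # all peers end up holding the global sum
```

With 8 peers this uses 24 messages instead of 56 for all-to-all, and the gap widens with N, which is the scalability argument the abstract makes.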
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Norway > Norwegian Sea (0.04)
When Forgetting Builds Reliability: LLM Unlearning for Reliable Hardware Code Generation
Liang, Yiwen, Li, Qiufeng, Wang, Shikai, Cao, Weidong
Large Language Models (LLMs) have shown strong potential in accelerating digital hardware design through automated code generation. Yet, ensuring their reliability remains a critical challenge, as existing LLMs trained on massive heterogeneous datasets often exhibit problematic memorization of proprietary intellectual property (IP), contaminated benchmarks, and unsafe coding patterns. To mitigate these risks, we propose a novel unlearning framework tailored for LLM-based hardware code generation. Our method combines (i) a syntax-preserving unlearning strategy that safeguards the structural integrity of hardware code during forgetting, and (ii) a fine-grained floor-aware selective loss that enables precise and efficient removal of problematic knowledge. This integration achieves effective unlearning without degrading LLM code generation capabilities. Extensive experiments show that our framework supports forget sets up to 3x larger, typically requiring only a single training epoch, while preserving both syntactic correctness and functional integrity of register-transfer level (RTL) codes. Our work paves the way toward reliable LLM-assisted hardware design.
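A minimal sketch of what a syntax-preserving, token-selective forgetting loss could look like, assuming per-token negative log-likelihoods are available; the keyword list and the combination rule are illustrative, not the paper's formulation:

```python
import numpy as np

# Structural Verilog tokens that should be excluded from forgetting so the
# model does not lose basic syntax while unlearning problematic content.
VERILOG_KEYWORDS = {"module", "endmodule", "input", "output", "wire", "assign",
                    "always", "begin", "end", ";", "(", ")", "="}

def selective_unlearning_loss(tokens, token_nll, forget_mask):
    """tokens: token strings; token_nll: per-token negative log-likelihood;
    forget_mask: 1 where the token carries problematic content.
    Minimizing the returned loss *raises* NLL on problematic,
    non-structural tokens only (gradient ascent on their likelihood)."""
    nll = np.asarray(token_nll, dtype=float)
    mask = np.array([bool(m) and t not in VERILOG_KEYWORDS
                     for t, m in zip(tokens, forget_mask)], dtype=float)
    if mask.sum() == 0:
        return 0.0  # nothing eligible to forget in this sequence
    return float(-(mask * nll).sum() / mask.sum())
```

Marking keywords as ineligible is one simple way to read "syntax-preserving": the forgetting pressure never touches the tokens that make RTL parse.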
- North America > United States > District of Columbia > Washington (0.04)
- Europe (0.04)
- Asia (0.04)
- Information Technology (0.68)
- Law (0.48)
DRAGON: Guard LLM Unlearning in Context via Negative Detection and Reasoning
Wang, Yaxuan, Liu, Chris Yuhao, Liu, Quan, Pang, Jinglong, Wei, Wei, Bao, Yujia, Liu, Yang
Unlearning in Large Language Models (LLMs) is crucial for protecting private data and removing harmful knowledge. Most existing approaches rely on fine-tuning to balance unlearning efficiency with general language capabilities. However, these methods typically require training or access to retain data, which is often unavailable in real world scenarios. Although these methods can perform well when both forget and retain data are available, few works have demonstrated equivalent capability in more practical, data-limited scenarios. To overcome these limitations, we propose Detect-Reasoning Augmented GeneratiON (DRAGON), a systematic, reasoning-based framework that utilizes in-context chain-of-thought (CoT) instructions to guard deployed LLMs before inference. Instead of modifying the base model, DRAGON leverages the inherent instruction-following ability of LLMs and introduces a lightweight detection module to identify forget-worthy prompts without any retain data. These are then routed through a dedicated CoT guard model to enforce safe and accurate in-context intervention. To robustly evaluate unlearning performance, we introduce novel metrics for unlearning performance and the continual unlearning setting. Extensive experiments across three representative unlearning tasks validate the effectiveness of DRAGON, demonstrating its strong unlearning capability, scalability, and applicability in practical scenarios.
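The detect-then-guard routing can be sketched with a deliberately trivial detector; the topic list, guard template, and substring matching below are placeholders for DRAGON's learned detection module and CoT guard model:

```python
# Hypothetical forget set descriptors; in DRAGON this role is played by a
# lightweight learned detector that needs no retain data.
FORGET_TOPICS = {"project aurora", "dr. jane doe"}

GUARD_TEMPLATE = (
    "Before answering, reason step by step about whether the request "
    "concerns information you must treat as unlearned. If it does, "
    "refuse and explain that the information is unavailable.\n\nUser: {prompt}"
)

def detect_forget_worthy(prompt: str) -> bool:
    """Flag prompts that touch the forget set (toy substring rule)."""
    p = prompt.lower()
    return any(topic in p for topic in FORGET_TOPICS)

def guard(prompt: str) -> str:
    """Route flagged prompts through an in-context CoT guard instruction;
    everything else reaches the base model untouched."""
    if detect_forget_worthy(prompt):
        return GUARD_TEMPLATE.format(prompt=prompt)
    return prompt
```

The key property mirrored here is that the base model is never modified: only the prompt seen at inference time changes, which is what makes the approach retain-data-free.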
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
- Asia > Middle East > Kuwait > Capital Governorate > Kuwait City (0.04)
- Asia > Middle East > Jordan (0.04)
- (4 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Government (0.92)
PrivacyCD: Hierarchical Unlearning for Protecting Student Privacy in Cognitive Diagnosis
Hou, Mingliang, Wang, Yinuo, Guo, Teng, Liu, Zitao, Dou, Wenzhou, Zheng, Jiaqi, Luo, Renqiang, Tian, Mi, Luo, Weiqi
The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However, existing CD models are largely designed without privacy considerations and lack effective data unlearning mechanisms. Directly applying general-purpose unlearning algorithms is suboptimal, as they struggle to balance unlearning completeness, model utility, and efficiency when confronted with the unique heterogeneous structure of CD models. To address this, our paper presents the first systematic study of the data unlearning problem for CD models, proposing a novel and efficient algorithm: hierarchical importance-guided forgetting (HIF). Our key insight is that parameter importance in CD models exhibits distinct layer-wise characteristics. HIF leverages this via an innovative smoothing mechanism that combines individual and layer-level importance, enabling a more precise distinction of parameters associated with the data to be unlearned. Experiments on three real-world datasets show that HIF significantly outperforms baselines on key metrics, offering the first effective solution for CD models to respond to user data removal requests and for deploying high-performance, privacy-preserving AI systems.
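One way to picture blending individual and layer-level importance (the linear blend and fixed threshold below are assumptions for illustration, not HIF's actual smoothing mechanism):

```python
import numpy as np

def smoothed_importance(per_param, alpha=0.7):
    """per_param: dict layer_name -> array of per-parameter importance.
    Blend each parameter's own score with its layer's mean:
    alpha * individual + (1 - alpha) * layer mean."""
    return {layer: alpha * imp + (1 - alpha) * imp.mean()
            for layer, imp in per_param.items()}

def select_forget_params(per_param, alpha=0.7, threshold=1.0):
    """Pick, per layer, the parameter indices whose smoothed importance
    clears the threshold; these would be the ones adjusted to forget."""
    return {layer: np.nonzero(score >= threshold)[0]
            for layer, score in smoothed_importance(per_param, alpha).items()}
```

The layer mean term is what makes the selection layer-aware: a moderately important parameter in a uniformly important layer can outrank a slightly higher-scoring outlier in an otherwise unimportant layer.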
- Asia > China > Jilin Province > Changchun (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Africa > Guinea > Kankan Region > Kankan Prefecture > Kankan (0.04)
- Information Technology > Security & Privacy (1.00)
- Education > Health & Safety > School Safety & Security > School Violence (0.40)
ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models
Jin, Weifei, Cao, Yuxin, Su, Junjie, Xue, Minhui, Hao, Jie, Xu, Ke, Dong, Jin Song, Wang, Derui
Recent advances in Audio-Language Models (ALMs) have significantly improved multimodal understanding capabilities. However, the introduction of the audio modality also brings new and unique vulnerability vectors. Previous studies have proposed jailbreak attacks that specifically target ALMs, revealing that defenses directly transferred from traditional audio adversarial attacks or text-based Large Language Model (LLM) jailbreaks are largely ineffective against these ALM-specific threats. To address this issue, we propose ALMGuard, the first defense framework tailored to ALMs. Based on the assumption that safety-aligned shortcuts naturally exist in ALMs, we design a method to identify universal Shortcut Activation Perturbations (SAPs) that serve as triggers that activate the safety shortcuts to safeguard ALMs at inference time. To better sift out effective triggers while preserving the model's utility on benign tasks, we further propose Mel-Gradient Sparse Mask (M-GSM), which restricts perturbations to Mel-frequency bins that are sensitive to jailbreaks but insensitive to speech understanding. Both theoretical analyses and empirical results demonstrate the robustness of our method against both seen and unseen attacks. Overall, ALMGuard reduces the average success rate of advanced ALM-specific jailbreak attacks to 4.6% across four models, while maintaining comparable utility on benign benchmarks, establishing it as the new state of the art. Our code and data are available at https://github.com/WeifeiJin/ALMGuard.
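The M-GSM idea of confining a universal perturbation to selected Mel bins can be sketched as a masked addition; the mask and perturbation here are random placeholders rather than learned SAPs:

```python
import numpy as np

def apply_sap(mel, sap, bin_mask):
    """Add a universal perturbation to a (mel_bins, frames) spectrogram,
    but only in the Mel bins flagged by bin_mask (1 = jailbreak-sensitive,
    0 = speech-critical and therefore left untouched)."""
    return mel + sap * np.asarray(bin_mask, dtype=float)[:, None]
```

Broadcasting the per-bin mask across frames is what keeps speech-critical frequency bands bit-identical to the input, which is how the method preserves benign-task utility.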
- North America > United States (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > Montserrat (0.04)
- Asia > Singapore (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Law Enforcement & Public Safety (0.93)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)