AITopics | Liu, Jiancheng

Collaborating Authors

Liu, Jiancheng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Forget Vectors at Play: Universal Input Perturbations Driving Machine Unlearning in Image Classification

Sun, Changchang, Wang, Ren, Zhang, Yihua, Jia, Jinghan, Liu, Jiancheng, Liu, Gaowen, Liu, Sijia, Yan, Yan

arXiv.org Artificial IntelligenceJan-8-2025

Machine unlearning (MU), which seeks to erase the influence of specific unwanted data from already-trained models, is becoming increasingly vital in model editing, particularly to comply with evolving data regulations like the ``right to be forgotten''. Conventional approaches are predominantly model-based, typically requiring retraining or fine-tuning the model's weights to meet unlearning requirements. In this work, we approach the MU problem from a novel input perturbation-based perspective, where the model weights remain intact throughout the unlearning process. We demonstrate the existence of a proactive input-based unlearning strategy, referred to forget vector, which can be generated as an input-agnostic data perturbation and remains as effective as model-based approximate unlearning approaches. We also explore forget vector arithmetic, whereby multiple class-specific forget vectors are combined through simple operations (e.g., linear combinations) to generate new forget vectors for unseen unlearning tasks, such as forgetting arbitrary subsets across classes. Extensive experiments validate the effectiveness and adaptability of the forget vector, showcasing its competitive performance relative to state-of-the-art model-based methods. Codes are available at https://github.com/Changchangsun/Forget-Vector.

forget vector, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.1678

Country: North America > United States > Illinois (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.35)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Add feedback

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

Fan, Chongyu, Liu, Jiancheng, Lin, Licong, Jia, Jinghan, Zhang, Ruiqi, Mei, Song, Liu, Sijia

arXiv.org Artificial IntelligenceOct-28-2024

In this work, we address the problem of large language model (LLM) unlearning, aiming to remove unwanted data influences and associated model capabilities (e.g., copyrighted data or harmful content generation) while preserving essential model utilities, without the need for retraining from scratch. Despite the growing need for LLM unlearning, a principled optimization framework remains lacking. To this end, we revisit the state-of-the-art approach, negative preference optimization (NPO), and identify the issue of reference model bias, which could undermine NPO's effectiveness, particularly when unlearning forget data of varying difficulty. Given that, we propose a simple yet effective unlearning optimization framework, called SimNPO, showing that 'simplicity' in removing the reliance on a reference model (through the lens of simple preference optimization) benefits unlearning. We also provide deeper insights into SimNPO's advantages, supported by analysis using mixtures of Markov chains. Furthermore, we present extensive experiments validating SimNPO's superiority over existing unlearning baselines in benchmarks like TOFU and MUSE, and robustness against relearning attacks. Codes are available at https://github.com/OPTML-Group/Unlearn-Simple.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.07163

Country: North America > United States (0.67)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models

Jia, Jinghan, Liu, Jiancheng, Zhang, Yihua, Ram, Parikshit, Baracaldo, Nathalie, Liu, Sijia

arXiv.org Artificial IntelligenceOct-22-2024

The need for effective unlearning mechanisms in large language models (LLMs) is increasingly urgent, driven by the necessity to adhere to data regulations and foster ethical generative AI practices. Despite growing interest of LLM unlearning, much of the existing research has focused on varied unlearning method designs to boost effectiveness and efficiency. However, the inherent relationship between model weights and LLM unlearning has not been extensively examined. In this paper, we systematically explore how model weights interact with unlearning processes in LLMs and we design the weight attribution-guided LLM unlearning method, WAGLE, which unveils the interconnections between 'influence' of weights and 'influence' of data to forget and retain in LLM generation. By strategically guiding the LLM unlearning across different types of unlearning methods and tasks, WAGLE can erase the undesired content, while maintaining the performance of the original tasks. We refer to the weight attribution-guided LLM unlearning method as WAGLE, which unveils the interconnections between 'influence' of weights and 'influence' of data to forget and retain in LLM generation. Our extensive experiments show that WAGLE boosts unlearning performance across a range of LLM unlearning methods such as gradient difference and (negative) preference optimization, applications such as fictitious unlearning, malicious use prevention, and copyrighted information removal, and models including Zephyr-7b-beta and Llama2-7b. To the best of our knowledge, our work offers the first principled method for attributing and pinpointing the influential weights in enhancing LLM unlearning. It stands in contrast to previous methods that lack weight attribution and simpler weight attribution techniques.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.17509

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Fan, Chongyu, Liu, Jiancheng, Hero, Alfred, Liu, Sijia

arXiv.org Artificial IntelligenceJul-8-2024

The trustworthy machine learning (ML) community is increasingly recognizing the crucial need for models capable of selectively 'unlearning' data points after training. This leads to the problem of machine unlearning (MU), aiming to eliminate the influence of chosen data points on model performance, while still maintaining the model's utility post-unlearning. Despite various MU methods for data influence erasure, evaluations have largely focused on random data forgetting, ignoring the vital inquiry into which subset should be chosen to truly gauge the authenticity of unlearning performance. To tackle this issue, we introduce a new evaluative angle for MU from an adversarial viewpoint. We propose identifying the data subset that presents the most significant challenge for influence erasure, i.e., pinpointing the worst-case forget set. Utilizing a bi-level optimization principle, we amplify unlearning challenges at the upper optimization level to emulate worst-case scenarios, while simultaneously engaging in standard training and unlearning at the lower level, achieving a balance between data influence erasure and model utility. Our proposal offers a worst-case evaluation of MU's resilience and effectiveness. Through extensive experiments across different datasets (including CIFAR-10, 100, CelebA, Tiny ImageNet, and ImageNet) and models (including both image classifiers and generative models), we expose critical pros and cons in existing (approximate) unlearning strategies. Our results illuminate the complex challenges of MU in practice, guiding the future development of more accurate and robust unlearning algorithms. The code is available at https://github.com/OPTML-Group/Unlearn-WorstCase.

machine learning, natural language, worst-case forget, (17 more...)

arXiv.org Artificial Intelligence

2403.07362

Country:

North America > United States > Michigan (0.14)
Europe (0.14)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Towards Universal Mesh Movement Networks

Zhang, Mingrui, Wang, Chunyang, Kramer, Stephan, Wallwork, Joseph G., Li, Siyi, Liu, Jiancheng, Chen, Xiang, Piggott, Matthew D.

arXiv.org Artificial IntelligenceJul-1-2024

Solving complex Partial Differential Equations (PDEs) accurately and efficiently is an essential and challenging problem in all scientific and engineering disciplines. Mesh movement methods provide the capability to improve the accuracy of the numerical solution without increasing the overall mesh degree of freedom count. Conventional sophisticated mesh movement methods are extremely expensive and struggle to handle scenarios with complex boundary geometries. However, existing learning-based methods require re-training from scratch given a different PDE type or boundary geometry, which limits their applicability, and also often suffer from robustness issues in the form of inverted elements. In this paper, we introduce the Universal Mesh Movement Network (UM2N), which -- once trained -- can be applied in a non-intrusive, zero-shot manner to move meshes with different size distributions and structures, for solvers applicable to different PDE types and boundary geometries. UM2N consists of a Graph Transformer (GT) encoder for extracting features and a Graph Attention Network (GAT) based decoder for moving the mesh. We evaluate our method on advection and Navier-Stokes based examples, as well as a real-world tsunami simulation case. Our method outperforms existing learning-based mesh movement methods in terms of the benchmarks described above. In comparison to the conventional sophisticated Monge-Amp\`ere PDE-solver based method, our approach not only significantly accelerates mesh movement, but also proves effective in scenarios where the conventional method fails. Our project page is at https://erizmr.github.io/UM2N/.

artificial intelligence, machine learning, mesh, (16 more...)

arXiv.org Artificial Intelligence

2407.00382

Country:

Asia > Japan (0.15)
North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.40)

Industry: Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning

Jia, Jinghan, Zhang, Yihua, Zhang, Yimeng, Liu, Jiancheng, Runwal, Bharat, Diffenderfer, James, Kailkhura, Bhavya, Liu, Sijia

arXiv.org Artificial IntelligenceJun-24-2024

Large Language Models (LLMs) have highlighted the necessity of effective unlearning mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims at removing undesired data influences and associated model capabilities without compromising utility beyond the scope of unlearning. While interest in studying LLM unlearning is growing, the impact of the optimizer choice for LLM unlearning remains unexplored. In this work, we shed light on the significance of optimizer selection in LLM unlearning for the first time, establishing a clear connection between second-order optimization and influence unlearning (a classical approach using influence functions to update the model for data influence removal). This insight propels us to develop a second-order optimization-based LLM unlearning framework, termed Second-Order UnLearning (SOUL), which extends the static, one-shot model update using influence unlearning to a dynamic, iterative unlearning process. Our extensive experiments show that SOUL consistently outperforms conventional first-order methods across various unlearning tasks, models, and metrics, indicating that second-order optimization offers an effective and broadly applicable solution for LLM unlearning. Codes are available at https://github.com/OPTML-Group/SOUL.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2404.18239

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law > Intellectual Property & Technology Law (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Label Smoothing Improves Machine Unlearning

Di, Zonglin, Zhu, Zhaowei, Jia, Jinghan, Liu, Jiancheng, Takhirov, Zafar, Jiang, Bo, Yao, Yuanshun, Liu, Sijia, Liu, Yang

arXiv.org Artificial IntelligenceJun-11-2024

The objective of machine unlearning (MU) is to eliminate previously learned data from a model. However, it is challenging to strike a balance between computation cost and performance when using existing MU techniques. Taking inspiration from the influence of label smoothing on model confidence and differential privacy, we propose a simple gradient-based MU approach that uses an inverse process of label smoothing. This work introduces UGradSL, a simple, plug-and-play MU approach that uses smoothed labels. We provide theoretical analyses demonstrating why properly introducing label smoothing improves MU performance. We conducted extensive experiments on six datasets of various sizes and different modalities, demonstrating the effectiveness and robustness of our proposed method. The consistent improvement in MU performance is only at a marginal cost of additional computations. For instance, UGradSL improves over the gradient ascent MU baseline by 66% unlearning accuracy without sacrificing unlearning efficiency.

arxiv preprint arxiv, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2406.07698

Country: North America > United States > California (0.14)

Genre: Research Report (0.81)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

DeepZero: Scaling up Zeroth-Order Optimization for Deep Model Training

Chen, Aochuan, Zhang, Yimeng, Jia, Jinghan, Diffenderfer, James, Liu, Jiancheng, Parasyris, Konstantinos, Zhang, Yihua, Zhang, Zheng, Kailkhura, Bhavya, Liu, Sijia

arXiv.org Artificial IntelligenceFeb-3-2024

Zeroth-order (ZO) optimization has become a popular technique for solving machine learning (ML) problems when first-order (FO) information is difficult or impossible to obtain. However, the scalability of ZO optimization remains an open problem: Its use has primarily been limited to relatively small-scale ML problems, such as sample-wise adversarial attack generation. To our best knowledge, no prior work has demonstrated the effectiveness of ZO optimization in training deep neural networks (DNNs) without a significant decrease in performance. To overcome this roadblock, we develop DeepZero, a principled ZO deep learning (DL) framework that can scale ZO optimization to DNN training from scratch through three primary innovations. First, we demonstrate the advantages of coordinatewise gradient estimation (CGE) over randomized vector-wise gradient estimation in training accuracy and computational efficiency. Second, we propose a sparsityinduced ZO training protocol that extends the model pruning methodology using only finite differences to explore and exploit the sparse DL prior in CGE. Third, we develop the methods of feature reuse and forward parallelization to advance the practical implementations of ZO training. Our extensive experiments show that DeepZero achieves state-of-the-art (SOTA) accuracy on ResNet-20 trained on CIFAR-10, approaching FO training performance for the first time. Furthermore, we show the practical utility of DeepZero in applications of certified adversarial defense and DL-based partial differential equation error correction, achieving 10-20% improvement over SOTA. We believe our results will inspire future research on scalable ZO optimization and contribute to advancing DL with black box. Codes are available at https://github.com/OPTML-Group/DeepZero.

artificial intelligence, machine learning, optimization, (16 more...)

arXiv.org Artificial Intelligence

2310.02025

Country: North America > United States > Virginia (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Energy (0.68)
Education (0.68)
Government (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Add feedback

Model Sparsity Can Simplify Machine Unlearning

Jia, Jinghan, Liu, Jiancheng, Ram, Parikshit, Yao, Yuguang, Liu, Gaowen, Liu, Yang, Sharma, Pranay, Liu, Sijia

arXiv.org Artificial IntelligenceJan-27-2024

In response to recent data regulation requirements, machine unlearning (MU) has emerged as a critical process to remove the influence of specific examples from a given model. Although exact unlearning can be achieved through complete model retraining using the remaining dataset, the associated computational costs have driven the development of efficient, approximate unlearning techniques. Moving beyond data-centric MU approaches, our study introduces a novel model-based perspective: model sparsification via weight pruning, which is capable of reducing the gap between exact unlearning and approximate unlearning. We show in both theory and practice that model sparsity can boost the multi-criteria unlearning performance of an approximate unlearner, closing the approximation gap, while continuing to be efficient. This leads to a new MU paradigm, termed prune first, then unlearn, which infuses a sparse model prior into the unlearning process. Building on this insight, we also develop a sparsity-aware unlearning method that utilizes sparsity regularization to enhance the training process of approximate unlearning. Extensive experiments show that our proposals consistently benefit MU in various unlearning scenarios. A notable highlight is the 77% unlearning efficacy gain of fine-tuning (one of the simplest unlearning methods) when using sparsity-aware unlearning. Furthermore, we demonstrate the practical impact of our proposed MU methods in addressing other machine learning challenges, such as defending against backdoor attacks and enhancing transfer learning. Codes are available at https://github.com/OPTML-Group/Unlearn-Sparse.

artificial intelligence, machine learning, sparsity, (15 more...)

arXiv.org Artificial Intelligence

2304.04934

Country: North America > United States > California (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning

Zhang, Yihua, Zhang, Yimeng, Chen, Aochuan, Jia, Jinghan, Liu, Jiancheng, Liu, Gaowen, Hong, Mingyi, Chang, Shiyu, Liu, Sijia

arXiv.org Artificial IntelligenceNov-18-2023

Massive data is often considered essential for deep learning applications, but it also incurs significant computational and infrastructural costs. Therefore, dataset pruning (DP) has emerged as an effective way to improve data efficiency by identifying and removing redundant training samples without sacrificing performance. In this work, we aim to address the problem of DP for transfer learning, i.e., how to prune a source dataset for improved pretraining efficiency and lossless finetuning accuracy on downstream target tasks. To our best knowledge, the problem of DP for transfer learning remains open, as previous studies have primarily addressed DP and transfer learning as separate problems. By contrast, we establish a unified viewpoint to integrate DP with transfer learning and find that existing DP methods are not suitable for the transfer learning paradigm. We then propose two new DP methods, label mapping and feature mapping, for supervised and self-supervised pretraining settings respectively, by revisiting the DP problem through the lens of source-target domain mapping. Furthermore, we demonstrate the effectiveness of our approach on numerous transfer learning tasks. We show that source data classes can be pruned by up to 40% ~ 80% without sacrificing downstream performance, resulting in a significant 2 ~ 5 times speed-up during the pretraining stage. Besides, our proposal exhibits broad applicability and can improve other computationally intensive transfer learning techniques, such as adversarial pretraining. Codes are available at https://github.com/OPTML-Group/DP4TL.

artificial intelligence, arxiv preprint arxiv, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2310.08782

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Education (0.67)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback