Collaborating Authors

 Hu, Shengyuan


Position: LLM Unlearning Benchmarks are Weak Measures of Progress

arXiv.org Artificial Intelligence

Unlearning methods have the potential to improve the privacy and safety of large language models (LLMs) by removing sensitive or harmful information post hoc. The LLM unlearning research community has increasingly turned toward empirical benchmarks to assess the effectiveness of such methods. In this paper, we find that existing benchmarks provide an overly optimistic and potentially misleading view of the effectiveness of candidate unlearning methods. By introducing simple, benign modifications to a number of popular benchmarks, we expose instances where supposedly unlearned information remains accessible, or where the unlearning process has degraded the model's performance on retained information to a much greater extent than indicated by the original benchmark. We identify that existing benchmarks are particularly vulnerable to modifications that introduce even loose dependencies between the forget and retain information. Further, we show that ambiguity in the unlearning targets of existing benchmarks can easily lead to the design of methods that overfit to the given test queries. Based on our findings, we urge the community to be cautious when interpreting benchmark results as reliable measures of progress, and we provide several recommendations to guide future LLM unlearning research.
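As an illustration of the kind of benign query modification described above, the sketch below probes whether a forbidden answer still surfaces under simple rephrasings of a benchmark's test queries. It is a minimal, hypothetical probe, not the paper's evaluation protocol; `generate_fn` stands in for whatever model interface is available.

```python
# Hypothetical probe: does "unlearned" information resurface when a benchmark
# query is rephrased benignly? `generate_fn` wraps any text-generation API.
from typing import Callable, Iterable

def rephrasings(question: str) -> Iterable[str]:
    # Benign, meaning-preserving variants of the original test query.
    yield question
    yield f"In your own words, {question[0].lower()}{question[1:]}"
    yield f'A student asks: "{question}" How would you answer?'

def leak_rate(generate_fn: Callable[[str], str],
              benchmark: list[tuple[str, str]]) -> float:
    """Fraction of forget-set items whose forbidden answer appears in the
    response to at least one benignly rephrased query."""
    leaked = 0
    for question, forbidden_answer in benchmark:
        if any(forbidden_answer.lower() in generate_fn(q).lower()
               for q in rephrasings(question)):
            leaked += 1
    return leaked / max(len(benchmark), 1)
```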


Jogging the Memory of Unlearned Model Through Targeted Relearning Attack

arXiv.org Artificial Intelligence

Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can 'jog' the memory of unlearned models to reverse the effects of unlearning. We formalize this unlearning-relearning pipeline, explore the attack across three popular unlearning benchmarks, and discuss future directions and guidelines that result from our study.
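The relearning pipeline itself is simple: briefly finetune the unlearned model on a small, loosely related corpus, then re-issue the original forget-set queries. Below is a minimal sketch under those assumptions; the checkpoint path and corpus are placeholders, not artifacts from the paper.

```python
# Minimal relearning-attack sketch: a few finetuning steps on loosely related
# text, followed by re-querying the original forget-set prompts.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/unlearned-model"          # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

relearn_corpus = ["...a few public, loosely related passages..."]  # placeholder
optimizer = AdamW(model.parameters(), lr=2e-5)

for epoch in range(3):                          # a handful of gradient steps
    for text in relearn_corpus:
        batch = tokenizer(text, return_tensors="pt", truncation=True)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# After relearning, re-run the forget-set queries and check whether the
# supposedly unlearned information is produced again.
```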


Guardrail Baselines for Unlearning in LLMs

arXiv.org Artificial Intelligence

Recent years have seen two trends emerge simultaneously: large language models (LLMs) trained on increasing amounts of user data (generally scraped indiscriminately from the web), in parallel with increasing legal protections on digital data use, including data revocation ("right to be forgotten") laws. In order to support data revocation for models that have already been trained on potentially sensitive data, a number of works have proposed approaches for data "unlearning" (Bourtoule et al., 2021; Gupta et al., 2021; Ginart et al., 2019), which aims to remove the influence of specific subsets of training data without entirely retraining a model. Unlearning in LLMs is particularly challenging because individuals' information may not be confined to specific data points (Brown et al., 2022; Tramèr et al., 2022). Nevertheless, recent work has shown that model finetuning is a promising approach to forget, for example, information corresponding to the book series Harry Potter (Eldan and Russinovich, 2023); information about specific individuals in a synthetic dataset (Maini et al., 2024); or knowledge that could be exploited by malicious agents (Li et al., 2024). While finetuning is a promising approach, a number of recent works have shown that simple modifications to the input prompt or output postprocessing filters (which we collectively call "guardrails") can also be effective for generating a desirable output distribution from a model (Pawelczyk et al., 2023; Brown et al., 2020; Chowdhery et al., 2023; Wei et al., 2021; Kim et al., 2024). Prompt prefixes and postprocessing filters do not update the model weights, so the resulting model itself would not satisfy definitions of unlearning that require the distribution of model weights to match that of a model retrained from scratch (Bourtoule et al., 2021). However, in practical settings where users can only access the model through an API, modifying the output distribution alone can suffice. In fact, most existing unlearning benchmarks (Eldan and Russinovich, 2023; Maini et al., 2024; unl, 2023; Li et al., 2024) only examine the model outputs when evaluating unlearning, which is consistent with a threat model in which users have only API access (see Section 3). In this paper, we investigate how existing benchmarks fare under guardrail-based approaches, and show that in three popular unlearning benchmarks, guardrails not only give strong performance comparable to finetuning baselines, but can also surface weaknesses or inconsistencies in the benchmarks or metrics themselves.
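A minimal sketch of the two guardrail baselines discussed above, a prompt prefix and an output postprocessing filter, applied to a model reachable only through an API. The `generate_fn` wrapper, topic, and blocked-term list are hypothetical placeholders, not the paper's implementation.

```python
# Guardrail baselines: steer and filter the output distribution without
# updating any model weights.
from typing import Callable

REFUSAL = "I'm sorry, I can't help with that."

def guardrailed_generate(generate_fn: Callable[[str], str],
                         prompt: str,
                         forget_topic: str,
                         blocked_terms: list[str]) -> str:
    # 1) Prompt-prefix guardrail: instruct the model not to reveal the topic.
    prefix = (f"You must not reveal any information about {forget_topic}. "
              f"If asked, refuse.\n\n")
    response = generate_fn(prefix + prompt)

    # 2) Postprocessing guardrail: filter responses that still leak.
    if any(term.lower() in response.lower() for term in blocked_terms):
        return REFUSAL
    return response
```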


No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices

arXiv.org Artificial Intelligence

Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in the output of a model to verify its source, is useful for mitigating the misuse of such AI-generated content. However, we show that common design choices in LLM watermarking schemes make the resulting systems surprisingly susceptible to attack -- leading to fundamental trade-offs in robustness, utility, and usability. To navigate these trade-offs, we rigorously study a set of simple yet effective attacks on common watermarking systems, and propose guidelines and defenses for LLM watermarking in practice.
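For context, the sketch below shows a generic hash-seeded "green list" detector of the kind used by many LLM watermarking schemes; it is an illustrative stand-in, not one of the specific systems or attacks analyzed in the paper.

```python
# Generic green-list watermark detection: each token's "green list" is seeded
# by the previous token; watermarked text lands in the green list more often
# than the base rate gamma, yielding a large z-score.
import math
import random

def green_fraction(token_ids: list[int], vocab_size: int,
                   gamma: float = 0.5, key: int = 42) -> float:
    hits = 0
    for prev, cur in zip(token_ids, token_ids[1:]):
        rng = random.Random(key * 1_000_003 + prev)   # seed on previous token
        green = set(rng.sample(range(vocab_size), int(gamma * vocab_size)))
        hits += cur in green
    return hits / max(len(token_ids) - 1, 1)

def z_score(token_ids: list[int], vocab_size: int, gamma: float = 0.5) -> float:
    n = max(len(token_ids) - 1, 1)
    p = green_fraction(token_ids, vocab_size, gamma)
    return (p - gamma) * math.sqrt(n) / math.sqrt(gamma * (1 - gamma))
```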


Privacy Amplification for the Gaussian Mechanism via Bounded Support

arXiv.org Artificial Intelligence

Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset. These guarantees can be desirable compared to vanilla DP in real-world settings, as they tightly upper-bound the privacy leakage for a $\textit{specific}$ individual in an $\textit{actual}$ dataset rather than considering worst-case datasets. While these frameworks are beginning to gain popularity, to date there is a lack of private mechanisms that can fully leverage the advantages of data-dependent accounting. To bridge this gap, we propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting. Experiments on model training with DP-SGD show that bounded-support Gaussian mechanisms can reduce the pDP bound $\epsilon$ by as much as 30% without negatively affecting model utility.
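One natural way to instantiate bounded support is to restrict the Gaussian noise to a ball around the true value, for example via rejection sampling. The sketch below is illustrative only; it does not reproduce the paper's exact mechanisms or their pDP accounting.

```python
# Illustrative bounded-support Gaussian mechanism: Gaussian noise rejected
# until it lies within an L2 ball of the given radius.
import numpy as np

def bounded_gaussian_mechanism(value: np.ndarray, sigma: float,
                               radius: float, rng=None) -> np.ndarray:
    rng = np.random.default_rng() if rng is None else rng
    while True:
        noise = rng.normal(scale=sigma, size=value.shape)
        if np.linalg.norm(noise) <= radius:     # truncate the noise support
            return value + noise

# Example: privatize a clipped per-example gradient, as in DP-SGD.
grad = np.clip(np.array([0.3, -1.2, 0.8]), -1.0, 1.0)
noisy_grad = bounded_gaussian_mechanism(grad, sigma=1.0, radius=3.0)
```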


Private Multi-Task Learning: Formulation and Applications to Federated Learning

arXiv.org Artificial Intelligence

Many problems in machine learning rely on multi-task learning (MTL), in which the goal is to solve multiple related machine learning tasks simultaneously. MTL is particularly relevant for privacy-sensitive applications in areas such as healthcare, finance, and IoT computing, where sensitive data from multiple, varied sources are shared for the purpose of learning. In this work, we formalize notions of client-level privacy for MTL via joint differential privacy (JDP), a relaxation of differential privacy for mechanism design and distributed optimization. We then propose an algorithm for mean-regularized MTL, an objective commonly used for applications in personalized federated learning, subject to JDP. We analyze our objective and solver, providing certifiable guarantees on both privacy and utility. Empirically, we find that our method provides improved privacy/utility trade-offs relative to global baselines across common federated learning benchmarks.
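A toy sketch of mean-regularized multi-task learning with a privatized mean, i.e., each client k minimizes f_k(w_k) + (lambda/2)||w_k - w_bar||^2 while only a noisy average of (clipped) client models is shared. The quadratic losses and noise calibration below are illustrative, not a certified JDP accountant.

```python
# Toy mean-regularized MTL with a noisy shared mean ("billboard"-style sketch).
import numpy as np

rng = np.random.default_rng(0)
K, d, lam, lr, clip, sigma = 10, 5, 0.5, 0.1, 1.0, 0.8
targets = rng.normal(size=(K, d))            # toy per-client optima
W = np.zeros((K, d))                         # per-client (personalized) models

for _ in range(50):
    # Server releases a privatized mean of norm-clipped client models.
    norms = np.maximum(np.linalg.norm(W, axis=1, keepdims=True), 1e-12)
    clipped = W * np.minimum(1.0, clip / norms)
    w_bar = clipped.mean(axis=0) + rng.normal(scale=sigma * clip / K, size=d)
    # Each client steps on f_k(w) + (lam/2)||w - w_bar||^2,
    # with toy quadratic losses f_k(w) = 0.5 * ||w - target_k||^2.
    grads = (W - targets) + lam * (W - w_bar)
    W -= lr * grads
```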


Federated Learning as a Network Effects Game

arXiv.org Artificial Intelligence

Federated Learning (FL) aims to foster collaboration among a population of clients to improve the accuracy of machine learning without directly sharing local data. Although there is a rich literature on designing federated learning algorithms, most prior works implicitly assume that all clients are willing to participate in an FL scheme. In practice, clients may not benefit from joining FL, especially in light of potential costs related to issues such as privacy and computation. In this work, we study clients' incentives in federated learning to help the service provider design better solutions and ensure clients make better decisions. We are the first to model clients' behavior in FL as a network effects game, where each client's benefit depends on the other clients who also join the network. Using this setup, we analyze the dynamics of clients' participation and characterize the equilibrium, in which no client has an incentive to alter its decision. Specifically, we show that dynamics in the population naturally converge to an equilibrium without needing explicit interventions. Finally, we provide a cost-efficient payment scheme that incentivizes clients to reach a desired equilibrium when the initial network is empty.
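A toy best-response simulation of a network effects game with this flavor, where each client joins if its benefit from the current number of participants exceeds its cost; the utility and cost models below are illustrative, not the paper's formulation.

```python
# Toy network effects game: sequential best-response dynamics until no client
# wants to change its participation decision (an equilibrium).
import numpy as np

rng = np.random.default_rng(1)
n = 100
costs = rng.uniform(0.05, 0.5, size=n)        # heterogeneous participation costs

def benefit(k):
    return k / n                              # benefit grows with participation

# Starting from an empty network, no client ever joins (benefit(0) = 0 is below
# every cost), which mirrors the motivation for a payment scheme; here we seed
# a few initial adopters instead.
participating = rng.random(n) < 0.3

changed = True
while changed:
    changed = False
    for i in range(n):
        others = participating.sum() - participating[i]
        wants_in = benefit(others) > costs[i]
        if wants_in != participating[i]:
            participating[i] = wants_in
            changed = True

print(f"Equilibrium participation: {int(participating.sum())} / {n}")
```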


Federated Multi-Task Learning for Competing Constraints

arXiv.org Machine Learning

In addition to accuracy, fairness and robustness are two critical concerns for federated learning systems. In this work, we first identify that robustness to adversarial training-time attacks and fairness, measured as the uniformity of performance across devices, are competing constraints in statistically heterogeneous networks. To address these constraints, we propose employing a simple, general multi-task learning objective, and analyze the ability of the objective to achieve a favorable tradeoff between fairness and robustness. We develop a scalable solver for the objective and show that multi-task learning can enable more accurate, robust, and fair models relative to state-of-the-art baselines across a suite of federated datasets.
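One common instantiation of such a simple multi-task objective (a sketch; the paper's exact formulation may differ) regularizes each device's personalized model toward a shared global model, with the strength of the tie controlled by lambda:

```latex
% Hedged sketch of a mean-regularized multi-task objective: device k keeps a
% personalized model v_k tied to the global model w^*.
\min_{v_k} \; h_k(v_k; w^{*}) \;=\; f_k(v_k) + \frac{\lambda}{2}\,\lVert v_k - w^{*} \rVert^{2},
\qquad
w^{*} \in \arg\min_{w} \sum_{k=1}^{K} p_k\, f_k(w)
```

Setting lambda = 0 recovers purely local models, while a large lambda recovers the single global model, so intermediate values can navigate tradeoffs between per-device performance uniformity and robustness to corrupted updates.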