On the Inherent Privacy of Zeroth Order Projected Gradient Descent

Gupta, Devansh, Razaviyayn, Meisam, Sharan, Vatsal

arXiv.org Machine Learning 

Fine-tuning pretrained large language models (LLMs) has demonstrated state-of-the-art performance across a range of downstream applications. However, two main challenges hinder the wide adoption of these models: the substantial memory requirements of the gradient-based optimizers used for fine-tuning, and the critical need to protect the privacy of domain-specific fine-tuning data. As fine-tuning LLMs grows increasingly memory-intensive, a range of strategies has emerged to address this issue. In particular, zeroth-order (ZO) optimization methods have recently gained traction due to their memory efficiency, as they do not require explicit gradient computations. Instead, zeroth-order gradient estimates can be computed using forward passes only, significantly reducing memory use compared to explicit gradient computation. In a pioneering approach, Malladi et al. (2023) introduced a memory-efficient technique for fine-tuning LLMs using two-point Simultaneous Perturbation Stochastic Approximation (SPSA) estimators (Spall, 1992), enabling large-model fine-tuning on memory-limited devices. Since then, zeroth-order methods have gained popularity for large machine learning models due to their memory efficiency and favorable upper bounds on the optimality gap under certain conditions on the Hessian of the objective function (Zhang et al., 2024b,a; Guo et al., 2024).

Another major concern in training LLMs is privacy. As large parameterized models are increasingly used in applications involving sensitive data, these models must protect sensitive information, especially given privacy regulations like the E.U.
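To make the forward-only idea concrete, here is a minimal sketch of a two-point SPSA-style gradient estimate driving a plain zeroth-order descent loop. It is an illustration under stated assumptions, not the method of this paper or of Malladi et al. (2023): the Gaussian perturbation, the smoothing parameter `mu`, the step size `lr`, and the toy objective `quadratic` are all hypothetical choices made for the example, and the loop includes neither a projection step nor any privacy mechanism.

```python
import random


def spsa_gradient(loss, theta, mu=1e-3, rng=None):
    """Two-point SPSA-style estimate of the gradient of `loss` at `theta`.

    Needs only two evaluations of `loss` (forward passes), no backpropagation.
    `mu` is an assumed smoothing parameter chosen for illustration.
    """
    rng = rng or random.Random()
    # Assumption: Gaussian perturbation direction z ~ N(0, I).
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    theta_plus = [t + mu * zi for t, zi in zip(theta, z)]
    theta_minus = [t - mu * zi for t, zi in zip(theta, z)]
    # Central finite difference approximates the directional derivative along z.
    scale = (loss(theta_plus) - loss(theta_minus)) / (2.0 * mu)
    # The estimate is that scalar times the perturbation direction.
    return [scale * zi for zi in z]


def quadratic(theta):
    # Hypothetical toy objective with its minimizer at the origin.
    return 0.5 * sum(t * t for t in theta)


def zo_sgd(loss, theta, lr=0.05, steps=500, seed=0):
    """Plain zeroth-order descent: step along the negative SPSA estimate."""
    rng = random.Random(seed)
    for _ in range(steps):
        g = spsa_gradient(loss, theta, rng=rng)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


if __name__ == "__main__":
    start = [1.0, -2.0, 3.0]
    final = zo_sgd(quadratic, start)
    print("initial loss:", quadratic(start))
    print("final loss:  ", quadratic(final))
```

Each step stores only the parameters and one random direction, which is where the memory saving relative to backpropagation comes from. On this toy quadratic the iterates should move toward the origin; for a real model, `loss` would be a forward pass over a minibatch.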
