Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

Neural Information Processing Systems 

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a k {\text{th}} -moment bound on the Lipschitz constants of sample functions, rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error G_2 \cdot \frac 1 {\sqrt n} G_k \cdot (\frac{\sqrt d}{n\epsilon}) {1 - \frac 1 k} under (\epsilon, \delta) -approximate differential privacy, up to a mild \textup{polylog}(\frac{1}{\delta}) factor, where G_2 2 and G_k k are the 2 {\text{nd}} and k {\text{th}} moment bounds on sample Lipschitz constants, nearly-matching a lower bound of [LR23].We then give a suite of private algorithms for DP-SCO with heavy-tailed gradients improving our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models.