Optimized Tradeoffs for Private Prediction with Majority Ensembling

Jiang, Shuli; Zhang, Qiuyi; Joshi, Gauri

arXiv.org Artificial Intelligence 

We study a classical problem in private prediction: computing an (mϵ, δ)-differentially private majority of K (ϵ, ∆)-differentially private algorithms, for 1 ≤ m ≤ K and 1 > δ ≥ ∆ ≥ 0. Standard methods such as subsampling or randomized response are widely used, but do they provide optimal privacy-utility tradeoffs? To answer this, we introduce the Data-dependent Randomized Response Majority (DaRRM) algorithm. It is parameterized by a data-dependent noise function γ and enables efficient utility optimization over the class of all private algorithms, which encompasses the standard methods above. We show that the utility of an (mϵ, δ)-private majority algorithm can be maximized tractably, for any m ≤ K, by solving an optimization problem obtained from a novel structural result that reduces the infinitely many privacy constraints to a polynomial-size set. In some settings, we show that DaRRM provably enjoys a privacy gain of a factor of 2 over common baselines at fixed utility. Lastly, we demonstrate the strong empirical effectiveness of our first-of-its-kind privacy-constrained utility optimization for ensembling labels from private teachers for private prediction in image classification. Notably, our DaRRM framework with an optimized γ exhibits substantial utility gains when compared against several baselines.
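To make the mechanism described above concrete, below is a minimal sketch of a DaRRM-style randomized majority over K binary votes: the vote sum is the data-dependent statistic, the true majority is released with probability γ(L), and a fair coin is flipped otherwise. The function name darrm_majority, the use of NumPy, and the constant γ in the example are illustrative assumptions based on the abstract; the optimized, data-dependent γ that the paper computes via its privacy-constrained utility optimization is not reproduced here.

```python
# A minimal sketch of a DaRRM-style mechanism for binary prediction,
# assuming K teachers that each output a 0/1 label via an (eps, Delta)-DP
# algorithm. Names and structure are illustrative, not the paper's pseudocode.
import numpy as np

def darrm_majority(votes, gamma, rng=None):
    """Randomized majority of K binary votes.

    votes : array-like of 0/1 outputs from K private mechanisms.
    gamma : callable mapping the vote sum L in {0, ..., K} to a probability
            in [0, 1]; it controls how often the true majority is released.
    """
    rng = np.random.default_rng() if rng is None else rng
    votes = np.asarray(votes)
    K = votes.size
    L = int(votes.sum())                  # data-dependent statistic
    majority = int(L >= (K + 1) / 2)      # plain majority of the votes
    if rng.random() < gamma(L):
        return majority                   # release the true majority
    return int(rng.random() < 0.5)        # otherwise output a fair coin flip

# Example: a constant gamma recovers classical randomized response on the
# majority bit; DaRRM instead optimizes a data-dependent gamma subject to the
# (m*eps, delta) privacy constraint.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_votes = rng.integers(0, 2, size=11)
    print(darrm_majority(teacher_votes, gamma=lambda L: 0.7, rng=rng))
```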