Samplable Anonymous Aggregation for Private Federated Data Analysis
Talwar, Kunal; Wang, Shan; McMillan, Audra; Jina, Vojta; Feldman, Vitaly; Basile, Bailey; Cahill, Aine; Chan, Yi Sheng; Chatzidakis, Mike; Chen, Junye; Chick, Oliver; Chitnis, Mona; Ganta, Suman; Goren, Yusuf; Granqvist, Filip; Guo, Kristine; Jacobs, Frederic; Javidbakht, Omid; Liu, Albert; Low, Richard; Mascenik, Dan; Myers, Steve; Park, David; Park, Wonhee; Parsa, Gianni; Pauly, Tommy; Priebe, Christian; Rishi, Rehan; Rothblum, Guy; Scaria, Michael; Song, Linmao; Song, Congzheng; Tarbe, Karl; Vogt, Sebastian; Winstrom, Luke; Zhou, Shundong
Learning aggregate population trends can enable better data-driven decisions, and applying machine learning can improve the user experience. Compared to learning from public curated datasets, learning from a larger population offers several benefits. As an example, a next-word prediction model trained on words typed by users (a) can better fit the actual distribution of language used on devices, (b) can adapt faster to shifts in distribution, and (c) can more faithfully represent smaller sub-populations that may not be well-represented in curated datasets. At the same time, training such models may involve sensitive user data.
Differentially Private Heavy Hitter Detection using Federated Analytics
Chadha, Karan; Chen, Junye; Duchi, John; Feldman, Vitaly; Hashemi, Hanieh; Javidbakht, Omid; McMillan, Audra; Talwar, Kunal
In this work, we study practical heuristics that improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user holds multiple data points, and the goal is to learn as many of the most frequent data points as possible across all users' data under aggregate and local differential privacy. We propose an adaptive hyperparameter tuning algorithm that improves the performance of the algorithm while satisfying computational, communication, and privacy constraints. We explore the impact of different data-selection schemes, as well as the impact of introducing deny lists during multiple runs of the algorithm. We test these improvements through extensive experiments on the Reddit dataset [Caldas et al., 2018] on the task of learning the most frequent words.
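Prefix-tree approaches of this kind grow candidate strings one symbol at a time: in each round, every user reports (with local randomization) the next symbol of one of their words that extends a currently live prefix, and the server keeps only the heaviest extensions. The following Python sketch of one such round is illustrative only, assuming k-ary randomized response as the local randomizer, uniform data selection among a user's matching words, and a deny list holding words recovered in earlier runs; the name one_prefix_round and its parameters are hypothetical, not taken from the paper.

```python
import math
import random
from collections import Counter

def one_prefix_round(user_words, live_prefixes, alphabet, eps, deny_list=frozenset()):
    """One round of prefix-tree heavy-hitter discovery (illustrative sketch).

    Each user picks one of their words that extends a live prefix (skipping
    deny-listed words) and reports its next character through k-ary
    randomized response, which satisfies eps-local differential privacy.
    """
    k = len(alphabet)
    p_truth = math.exp(eps) / (math.exp(eps) + k - 1)  # prob. of an honest report
    counts = Counter()
    for words in user_words:  # each user holds multiple data points
        candidates = [(w, p) for w in words for p in live_prefixes
                      if w.startswith(p) and len(w) > len(p) and w not in deny_list]
        if not candidates:
            continue  # a real protocol would send a dummy report instead
        word, prefix = random.choice(candidates)  # uniform data selection
        true_char = word[len(prefix)]
        if random.random() < p_truth:
            report = true_char
        else:
            report = random.choice([c for c in alphabet if c != true_char])
        counts[prefix + report] += 1
    return counts

# Toy usage: extend the live prefixes by one character, keep the heaviest.
users = [["apple", "apply"], ["apple"], ["angle", "apple"]]
counts = one_prefix_round(users, live_prefixes=["a"],
                          alphabet="abcdefghijklmnopqrstuvwxyz", eps=2.0)
live_prefixes = [p for p, _ in counts.most_common(3)]
```

In this setup, the hyperparameters amenable to the adaptive tuning the abstract describes plausibly include the number of live prefixes retained per round and the per-round privacy budget eps, though the paper's exact choices may differ.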
Private Adaptive Gradient Methods for Convex Optimization
Asi, Hilal; Duchi, John; Fallah, Alireza; Javidbakht, Omid; Talwar, Kunal
While the success of stochastic gradient methods for solving empirical risk minimization has motivated their adoption across much of machine learning, increasing privacy risks in data-intensive tasks have made applying them more challenging [DMNS06]: gradients can leak users' data, intermediate models can compromise individuals, and even final trained models may be non-private without substantial care. This motivates a growing line of work developing private variants of stochastic gradient descent (SGD), in which algorithms guarantee differential privacy by perturbing individual gradients with random noise [DJW13; ST13b; ACGMMTZ16; DJW18; BFTT19; FKT20]. Yet these noise-addition procedures typically fail to reflect the geometry underlying the optimization problem, which in non-private settings is essential: for high-dimensional problems with sparse parameters, mirror descent and its variants [BT03; NJLS09] are indispensable, while in the large-scale stochastic settings prevalent in deep learning, AdaGrad and other adaptive variants [DHS11] provide stronger theoretical and practical performance. Moreover, methods that do not adapt (or do not leverage geometry) can be provably sub-optimal: there exist problems on which their convergence is much slower than that of adaptive variants that reflect the appropriate geometry [LD19]. To address these challenges, we introduce Pagan (Private AdaGrad with Adaptive Noise), a new differentially private variant of stochastic gradient descent and AdaGrad.
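To make the noise-addition recipe concrete, here is a minimal NumPy sketch of a single differentially private adaptive-gradient step: per-example gradients are clipped to bound sensitivity, Gaussian noise is calibrated to that bound, and the noisy gradient drives an AdaGrad-style diagonal preconditioner. This illustrates the generic private-SGD/AdaGrad pattern only, not Pagan's particular adaptive noise calibration; private_adagrad_step and its parameters are hypothetical names.

```python
import numpy as np

def private_adagrad_step(params, per_example_grads, state, lr=0.1,
                         clip_norm=1.0, noise_mult=1.0, eps_div=1e-8):
    """One DP adaptive-gradient step (an illustrative sketch, not Pagan itself)."""
    n = len(per_example_grads)
    # Clip each per-example gradient so the average has sensitivity clip_norm / n.
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    # Gaussian noise scaled to that sensitivity (noise_mult controls privacy level).
    noise = np.random.normal(0.0, noise_mult * clip_norm / n, size=params.shape)
    noisy_grad = np.mean(clipped, axis=0) + noise
    # AdaGrad: accumulate squared (noisy) gradients and precondition the step.
    state["sum_sq"] = state.get("sum_sq", np.zeros_like(params)) + noisy_grad ** 2
    return params - lr * noisy_grad / (np.sqrt(state["sum_sq"]) + eps_div), state

# Toy usage: 32 stand-in per-example gradients for a 5-dimensional model.
params, state = np.zeros(5), {}
grads = [np.random.randn(5) for _ in range(32)]
params, state = private_adagrad_step(params, grads, state)
```

Because the preconditioner is built from already-privatized gradients, the accumulator itself leaks nothing further; the harder question, which the paper addresses, is how to set the noise so that the added perturbation respects the same geometry the adaptive method exploits.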