Goto

Collaborating Authors

 approximate leader


Nearly Optimal Algorithms for Private Online Learning in Full information and Bandit Settings

Neural Information Processing Systems

We give differentially private algorithms for a large class of online learning algorithms, in both the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. The technique leads to the first nonprivate algorithms for private online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work (due to Dwork, Naor, Pitassi and Rothblum (2010) and Jain, Kothari and Thakurta (2012)). In many cases, our algorithms (in both settings) match the dependence on the input length, T, of the optimal nonprivate regret bounds up to logarithmic factors in T. Our algorithms require logarithmic space and update time.


Differentially Private Online Submodular Optimization

arXiv.org Machine Learning

In this paper we develop the first algorithms for online submodular minimization that preserve differential privacy under full information feedback and bandit feedback. A sequence of $T$ submodular functions over a collection of $n$ elements arrive online, and at each timestep the algorithm must choose a subset of $[n]$ before seeing the function. The algorithm incurs a cost equal to the function evaluated on the chosen set, and seeks to choose a sequence of sets that achieves low expected regret. Our first result is in the full information setting, where the algorithm can observe the entire function after making its decision at each timestep. We give an algorithm in this setting that is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}\sqrt{T}}{\epsilon}\right)$. This algorithm works by relaxing submodular function to a convex function using the Lovasz extension, and then simulating an algorithm for differentially private online convex optimization. Our second result is in the bandit setting, where the algorithm can only see the cost incurred by its chosen set, and does not have access to the entire function. This setting is significantly more challenging because the algorithm does not receive enough information to compute the Lovasz extension or its subgradients. Instead, we construct an unbiased estimate using a single-point estimation, and then simulate private online convex optimization using this estimate. Our algorithm using bandit feedback is $\epsilon$-differentially private and achieves expected regret $\tilde{O}\left(\frac{n^{3/2}T^{3/4}}{\epsilon}\right)$.


(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings

Neural Information Processing Systems

We give differentially private algorithms for a large class of online learning algorithms, inboth the full information and bandit settings. Our algorithms aim to minimize a convex loss function which is a sum of smaller convex loss terms, one for each data point. To design our algorithms, we modify the popular mirror descent approach, or rather a variant called follow the approximate leader. The technique leads to the first nonprivate algorithms for private online learning in the bandit setting. In the full information setting, our algorithms improve over the regret bounds of previous work (due to Dwork, Naor, Pitassi and Rothblum (2010) and Jain, Kothari and Thakurta (2012)). In many cases, our algorithms (in both settings) match the dependence on the input length, T, of the optimal nonprivate regret bounds up to logarithmic factors in T . Our algorithms require logarithmic space and update time.