Sample-Optimal Locally Private Hypothesis Selection and the Provable Benefits of Interactivity

Pour, Alireza F., Ashtiani, Hassan, Asoodeh, Shahab

Dec-9-2023–arXiv.org Machine Learning

We study the problem of hypothesis selection under the constraint of local differential privacy. Given a class $\mathcal{F}$ of $k$ distributions and a set of i.i.d. samples from an unknown distribution $h$, the goal of hypothesis selection is to pick a distribution $\hat{f}$ whose total variation distance to $h$ is comparable with the best distribution in $\mathcal{F}$ (with high probability). We devise an $\varepsilon$-locally-differentially-private ($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min \{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq \alpha + 9 \min_{f\in \mathcal{F}}d_{TV}(h,f)$ with high probability. This sample complexity is optimal for $\varepsilon<1$, matching the lower bound of Gopi et al. (2020). All previously known algorithms for this problem required $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ samples to work. Moreover, our result demonstrates the power of interaction for $\varepsilon$-LDP hypothesis selection. Namely, it breaks the known lower bound of $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ for the sample complexity of non-interactive hypothesis selection. Our algorithm breaks this barrier using only $\Theta(\log \log k)$ rounds of interaction. To prove our results, we define the notion of \emph{critical queries} for a Statistical Query Algorithm (SQA) which may be of independent interest. Informally, an SQA is said to use a small number of critical queries if its success relies on the accuracy of only a small number of queries it asks. We then design an LDP algorithm that uses a smaller number of critical queries.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

Dec-9-2023

arXiv.org PDF

Add feedback

Country:
- Europe > Russia (0.14)
- North America > United States (0.14)

Genre:
- Research Report > New Finding (0.88)

Industry:
- Information Technology > Security & Privacy (0.92)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science (0.92)
  - Security & Privacy (0.92)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found