AITopics

2411.08791

Country: North America > Canada (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.75)

arXiv.org Artificial IntelligenceOct-30-2024

Exactly Minimax-Optimal Locally Differentially Private Sampling

Park, Hyun-Young, Asoodeh, Shahab, Lee, Si-Hyeon

The sampling problem under local differential privacy has recently been studied with potential applications to generative models, but a fundamental analysis of its privacy-utility trade-off (PUT) remains incomplete. In this work, we define the fundamental PUT of private sampling in the minimax sense, using the f-divergence between original and sampling distributions as the utility measure. We characterize the exact PUT for both finite and continuous data spaces under some mild conditions on the data distributions, and propose sampling mechanisms that are universally optimal for all f-divergences. Our numerical experiments demonstrate the superiority of our mechanisms over baselines, in terms of theoretical utilities for finite data space and of empirical utilities for continuous data space.

artificial intelligence, machine learning, mechanism, (13 more...)

2410.22699

Country:

North America > United States > New York (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.49)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.60)

arXiv.org Machine LearningFeb-23-2024

Differentially Private Fair Binary Classifications

Ghoukasian, Hrad, Asoodeh, Shahab

In this work, we investigate binary classification under the constraints of both differential privacy and fairness. We first propose an algorithm based on the decoupling technique for learning a classifier with only fairness guarantee. This algorithm takes in classifiers trained on different demographic groups and generates a single classifier satisfying statistical parity. We then refine this algorithm to incorporate differential privacy. The performance of the final algorithm is rigorously examined in terms of privacy, fairness, and utility guarantees. Empirical evaluations conducted on the Adult and Credit Card datasets illustrate that our algorithm outperforms the state-of-the-art in terms of fairness guarantees, while maintaining the same level of privacy and utility.

artificial intelligence, differentially private fair binary classification, machine learning

2402.15603

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

arXiv.org Machine LearningDec-9-2023

Sample-Optimal Locally Private Hypothesis Selection and the Provable Benefits of Interactivity

Pour, Alireza F., Ashtiani, Hassan, Asoodeh, Shahab

We study the problem of hypothesis selection under the constraint of local differential privacy. Given a class $\mathcal{F}$ of $k$ distributions and a set of i.i.d. samples from an unknown distribution $h$, the goal of hypothesis selection is to pick a distribution $\hat{f}$ whose total variation distance to $h$ is comparable with the best distribution in $\mathcal{F}$ (with high probability). We devise an $\varepsilon$-locally-differentially-private ($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min \{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq \alpha + 9 \min_{f\in \mathcal{F}}d_{TV}(h,f)$ with high probability. This sample complexity is optimal for $\varepsilon<1$, matching the lower bound of Gopi et al. (2020). All previously known algorithms for this problem required $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ samples to work. Moreover, our result demonstrates the power of interaction for $\varepsilon$-LDP hypothesis selection. Namely, it breaks the known lower bound of $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ for the sample complexity of non-interactive hypothesis selection. Our algorithm breaks this barrier using only $\Theta(\log \log k)$ rounds of interaction. To prove our results, we define the notion of \emph{critical queries} for a Statistical Query Algorithm (SQA) which may be of independent interest. Informally, an SQA is said to use a small number of critical queries if its success relies on the accuracy of only a small number of queries it asks. We then design an LDP algorithm that uses a smaller number of critical queries.

algorithm, artificial intelligence, machine learning, (16 more...)

2312.05645

Country:

North America > United States (0.14)
Europe > Russia (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.92)
Information Technology > Data Science (0.92)

arXiv.org Artificial IntelligenceMay-16-2023

Privacy Loss of Noisy Stochastic Gradient Descent Might Converge Even for Non-Convex Losses

Asoodeh, Shahab, Diaz, Mario

The Noisy-SGD algorithm is widely used for privately training machine learning models. Traditional privacy analyses of this algorithm assume that the internal state is publicly revealed, resulting in privacy loss bounds that increase indefinitely with the number of iterations. However, recent findings have shown that if the internal state remains hidden, then the privacy loss might remain bounded. Nevertheless, this remarkable result heavily relies on the assumption of (strong) convexity of the loss function. It remains an important open problem to further relax this condition while proving similar convergent upper bounds on the privacy loss. In this work, we address this problem for DP-SGD, a popular variant of Noisy-SGD that incorporates gradient clipping to limit the impact of individual samples on the training process. Our findings demonstrate that the privacy loss of projected DP-SGD converges exponentially fast, without requiring convexity or smoothness assumptions on the loss function. In addition, we analyze the privacy loss of regularized (unprojected) DP-SGD. To obtain these results, we directly analyze the hockey-stick divergence between coupled stochastic processes by relying on non-linear data processing inequalities.

artificial intelligence, machine learning, privacy loss, (15 more...)

2305.09903

Country:

North America > Mexico (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.84)

arXiv.org Machine LearningFeb-14-2023

Contraction of Locally Differentially Private Mechanisms

Asoodeh, Shahab, Zhang, Huanyu

We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between $P\mathsf{K}$ and $Q\mathsf{K}$ output distributions of an $\varepsilon$-LDP mechanism $\mathsf{K}$ in terms of a divergence between the corresponding input distributions $P$ and $Q$, respectively. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(P\mathsf{K}\|Q\mathsf{K})$ in terms of $\chi^2(P\|Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(P\mathsf{K}\|Q\mathsf{K})$ in terms of total variation distance $\mathsf{TV}(P, Q)$ and $\varepsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to better privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.

artificial intelligence, machine learning, mechanism, (16 more...)

2210.13386

Country:

North America > United States > New York (0.14)
North America > United States > California (0.14)
North America > Canada > Ontario > Hamilton (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

arXiv.org Machine LearningFeb-1-2021

Local Differential Privacy Is Equivalent to Contraction of $E_\gamma$-Divergence

Asoodeh, Shahab, Aliakbarpour, Maryam, Calmon, Flavio P.

We investigate the local differential privacy (LDP) guarantees of a randomized privacy mechanism via its contraction properties. We first show that LDP constraints can be equivalently cast in terms of the contraction coefficient of the $E_\gamma$-divergence. We then use this equivalent formula to express LDP guarantees of privacy mechanisms in terms of contraction coefficients of arbitrary $f$-divergences. When combined with standard estimation-theoretic tools (such as Le Cam's and Fano's converse methods), this result allows us to study the trade-off between privacy and utility in several testing and minimax and Bayesian estimation problems.

artificial intelligence, bayesian inference, mechanism, (17 more...)

2102.01258

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

arXiv.org Machine LearningDec-20-2020

Privacy Analysis of Online Learning Algorithms via Contraction Coefficients

Asoodeh, Shahab, Diaz, Mario, Calmon, Flavio P.

We propose an information-theoretic technique for analyzing privacy guarantees of online algorithms. Specifically, we demonstrate that differential privacy guarantees of iterative algorithms can be determined by a direct application of contraction coefficients derived from strong data processing inequalities for $f$-divergences. Our technique relies on generalizing the Dobrushin's contraction coefficient for total variation distance to an $f$-divergence known as $E_\gamma$-divergence. $E_\gamma$-divergence, in turn, is equivalent to approximate differential privacy. As an example, we apply our technique to derive the differential privacy parameters of gradient descent. Moreover, we also show that this framework can be tailored to batch learning algorithms that can be implemented with one pass over the training dataset.

algorithm, computer based training, educational technology, (20 more...)

2012.11035

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.42)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)

arXiv.org Artificial IntelligenceAug-14-2020

Three Variants of Differential Privacy: Lossless Conversion and Applications

Asoodeh, Shahab, Liao, Jiachun, Calmon, Flavio P., Kosut, Oliver, Sankar, Lalitha

We consider three different variants of differential privacy (DP), namely approximate DP, R\'enyi DP (RDP), and hypothesis test DP. In the first part, we develop a machinery for optimally relating approximate DP to RDP based on the joint range of two $f$-divergences that underlie the approximate DP and RDP. In particular, this enables us to derive the optimal approximate DP parameters of a mechanism that satisfies a given level of RDP. As an application, we apply our result to the moments accountant framework for characterizing privacy guarantees of noisy stochastic gradient descent (SGD). When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget. In the second part, we establish a relationship between RDP and hypothesis test DP which allows us to translate the RDP constraint into a tradeoff between type I and type II error probabilities of a certain binary hypothesis test. We then demonstrate that for noisy SGD our result leads to tighter privacy guarantees compared to the recently proposed $f$-DP framework for some range of parameters.

deep learning, mechanism, neural network, (18 more...)

2008.06529

Country:

North America > United States (0.93)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report > New Finding (0.54)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine LearningJan-16-2020

A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via $f$-Divergences

Asoodeh, Shahab, Liao, Jiachun, Calmon, Flavio P., Kosut, Oliver, Sankar, Lalitha

We derive the optimal differential privacy (DP) parameters of a mechanism that satisfies a given level of R enyi differential privacy (RDP). Our result is based on the joint range of two f -divergences that underlie the approximate and the R enyi variations of differential privacy. We apply our result to the moments accountant framework for characterizing privacy guarantees of stochastic gradient descent. When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget. Differential privacy (DP) [1] has become the de facto standard for privacy-preserving data analytics. Intuitively, a (potentially randomized) algorithm is said to be differentially private if its output does not vary significantly with small perturbations of the input. DP guarantees are usually cast in terms of properties of the information density [2] of the output of the algorithm conditioned on a given input--referred to as the privacy loss variable in the DP literature.

artificial intelligence, neural network, null, (18 more...)

2001.0599

Country: North America > United States (0.28)

Genre: Research Report (0.91)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)