Unified Enhancement of Privacy Bounds for Mixture Mechanisms via $f$-Differential Privacy
Differentially private (DP) machine learning algorithms involve many sources of randomness, such as random initialization, random batch subsampling, and shuffling. However, such randomness is difficult to take into account when proving differential privacy bounds, because it induces mixture distributions for the algorithm's output that are difficult to analyze. This paper focuses on improving privacy bounds for shuffling models and for one-iteration differentially private gradient descent (DP-GD) with random initialization using $f$-DP. We derive a closed-form expression of the trade-off function for shuffling models that outperforms the most up-to-date results based on $(\epsilon,\delta)$-DP. Moreover, we investigate the effects of random initialization on the privacy of one-iteration DP-GD. Our numerical computations of the trade-off function indicate that random initialization can enhance the privacy of DP-GD. Our analysis of $f$-DP guarantees for these mixture mechanisms relies on an inequality for trade-off functions introduced in this paper. This inequality implies the joint convexity of $F$-divergences. Finally, we study an $f$-DP analog of the advanced joint convexity of the hockey-stick divergence related to $(\epsilon,\delta)$-DP and apply it to analyze the privacy of mixture mechanisms.
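To make the central object concrete, here is a minimal numerical sketch of a baseline trade-off function in the Gaussian case; the function $G_\mu$ and the parameter values below are standard $f$-DP illustrations, not quantities taken from the paper.

```python
# Minimal sketch: the Gaussian trade-off function G_mu of f-DP.
# G_mu(alpha) is the smallest achievable type-II error for testing
# N(0,1) against N(mu,1) at type-I error alpha; curves closer to
# beta = 1 - alpha mean stronger privacy.
import numpy as np
from scipy.stats import norm

def gaussian_tradeoff(alpha, mu):
    """G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu)."""
    return norm.cdf(norm.ppf(1.0 - np.asarray(alpha, dtype=float)) - mu)

alphas = [0.01, 0.05, 0.10, 0.25]
for mu in (0.5, 1.0, 2.0):
    print(f"mu = {mu}:", np.round(gaussian_tradeoff(alphas, mu), 3))
```

The paper's closed-form shuffling bound and its trade-off-function inequality are statements about where curves of this type sit for mechanisms whose output is a mixture, e.g. over shuffles or over random initializations.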
Infinitely divisible privacy and beyond I: resolution of the $s^2=2k$ conjecture
Pandey, Aaradhya, Maleki, Arian, Kulkarni, Sanjeev
Differential privacy is increasingly formalized through the lens of hypothesis testing via the robust and interpretable $f$-DP framework, where privacy guarantees are encoded by a baseline Blackwell trade-off function $f_{\infty} = T(P_{\infty}, Q_{\infty})$ involving a pair of distributions $(P_{\infty}, Q_{\infty})$. The problem of choosing the right privacy metric in practice leads to a central question: what is a statistically appropriate baseline $f_{\infty}$ given some prior modeling assumptions? The special case of Gaussian differential privacy (GDP) showed that, under compositions of nearly perfect mechanisms, these trade-off functions exhibit a central limit behavior with a Gaussian limit experiment. Inspired by Le Cam's theory of limits of statistical experiments, we answer this question in full generality in an infinitely divisible setting. We show that suitable composition experiments $(P_n^{\otimes n}, Q_n^{\otimes n})$ converge to a binary limit experiment $(P_{\infty}, Q_{\infty})$ whose log-likelihood ratio $L = \log(dQ_{\infty} / dP_{\infty})$ is infinitely divisible under $P_{\infty}$. Thus any limiting trade-off function $f_{\infty}$ is determined by an infinitely divisible law $P_{\infty}$, characterized by its Lévy--Khintchine triplet, and its Esscher tilt defined by $dQ_{\infty}(x) = e^{x} dP_{\infty}(x)$. This characterizes all limiting baseline trade-off functions $f_{\infty}$ arising from compositions of nearly perfect differentially private mechanisms. Our framework recovers GDP as the purely Gaussian case and yields explicit non-Gaussian limits, including Poisson examples. It also positively resolves the empirical $s^2 = 2k$ phenomenon observed in the GDP paper and provides an optimal mechanism for count statistics achieving asymmetric Poisson differential privacy.
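As a toy companion to the Poisson limits mentioned above: the normalized exponential tilt of a Poisson($\lambda$) law is again Poisson, with mean $\lambda e$, so one concrete non-Gaussian baseline is the exact trade-off curve between two Poisson laws. The sketch below computes it from the Neyman--Pearson lemma; the choice $\lambda = 1$ is an assumption for illustration.

```python
# Exact trade-off function T(Poisson(lam_p), Poisson(lam_q)) via the
# Neyman-Pearson lemma. Assumes lam_q > lam_p, so the likelihood ratio
# is increasing in the count and the optimal randomized test rejects
# for large counts.
from math import e
from scipy.stats import poisson

def poisson_tradeoff(alpha, lam_p, lam_q):
    # Smallest k with P(X > k) <= alpha under the null Poisson(lam_p).
    k = 0
    while poisson.sf(k, lam_p) > alpha:
        k += 1
    # Randomize on {X = k} to hit type-I error exactly alpha.
    gamma = (alpha - poisson.sf(k, lam_p)) / poisson.pmf(k, lam_p)
    # Type-II error of that test under the alternative Poisson(lam_q).
    return 1.0 - (poisson.sf(k, lam_q) + gamma * poisson.pmf(k, lam_q))

lam = 1.0
for a in (0.01, 0.05, 0.25):
    print(f"alpha = {a}: beta = {poisson_tradeoff(a, lam, lam * e):.3f}")
```

Swapping the roles of the two laws gives a genuinely different curve (the optimal test then rejects for small counts), which is presumably the asymmetry behind the term "asymmetric Poisson differential privacy".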
High-Dimensional Asymptotics of Differentially Private PCA
Yun, Youngjoo, Dudeja, Rishabh
In differential privacy, statistics of a sensitive dataset are privatized by introducing random noise. Most privacy analyses provide privacy bounds specifying a noise level sufficient to achieve a target privacy guarantee. Sometimes, these bounds are pessimistic and suggest adding excessive noise, which overwhelms the meaningful signal. It remains unclear whether such high noise levels are truly necessary or merely an artifact of the proof techniques. This paper explores whether we can obtain sharp privacy characterizations that identify the smallest noise level required to achieve a target privacy level for a given mechanism. We study this problem in the context of differentially private principal component analysis, where the goal is to privatize the leading principal components (PCs) of a dataset with n samples and p features. We analyze the exponential mechanism for this problem in a model-free setting and provide sharp utility and privacy characterizations in the high-dimensional limit ($p\rightarrow\infty$). Our privacy result shows that, in high dimensions, detecting the presence of a target individual in the dataset using the privatized PCs is exactly as hard as distinguishing two Gaussians with slightly different means, where the mean difference depends on certain spectral properties of the dataset. Our privacy analysis combines the hypothesis-testing formulation of privacy guarantees proposed by Dong, Roth, and Su (2022) with classical contiguity arguments due to Le Cam to obtain sharp high-dimensional privacy characterizations.
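For intuition, here is a sketch of the exponential mechanism for the leading PC: sample a unit vector $u$ with density proportional to $\exp(\beta\, u^\top S u)$, where $S$ is the sample covariance. The Metropolis sampler and the temperature $\beta$ below are illustrative assumptions; calibrating $\beta$ to a formal privacy budget is precisely the sharpness question the paper addresses.

```python
# Metropolis sampler for the exponential mechanism on the unit sphere,
# targeting density proportional to exp(beta * u^T S u). beta is left
# uncalibrated here; tying it to a privacy budget requires the kind of
# sharp analysis described above.
import numpy as np

def private_top_pc(S, beta, n_steps=20000, step=0.2, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(S.shape[0])
    u /= np.linalg.norm(u)
    fu = u @ S @ u
    for _ in range(n_steps):
        # Projected-Gaussian proposal: its density depends only on the
        # angle between u and v, hence it is symmetric and the plain
        # Metropolis acceptance rule is valid.
        v = u + step * rng.standard_normal(S.shape[0])
        v /= np.linalg.norm(v)
        fv = v @ S @ v
        if np.log(rng.random()) < beta * (fv - fu):
            u, fu = v, fv
    return u

# Toy check: with a planted spike, the private PC aligns with it.
rng = np.random.default_rng(1)
p, n = 30, 500
spike = np.eye(p)[0]
X = rng.standard_normal((n, p)) + 2.0 * rng.standard_normal((n, 1)) * spike
S = X.T @ X / n
print("overlap with planted direction:", abs(private_top_pc(S, beta=50.0) @ spike))
```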
Mitigating Privacy-Utility Trade-off in Decentralized Federated Learning via $f$-Differential Privacy
Li, Xiang, Su, Buxin, Wang, Chendi, Long, Qi, Su, Weijie J.
Differentially private (DP) decentralized Federated Learning (FL) allows local users to collaborate without sharing their data with a central server. However, accurately quantifying the privacy budget of private FL algorithms is challenging due to the co-existence of complex algorithmic components such as decentralized communication and local updates. This paper addresses privacy accounting for two decentralized FL algorithms within the $f$-differential privacy ($f$-DP) framework. We develop two new $f$-DP-based accounting methods tailored to decentralized settings: Pairwise Network $f$-DP (PN-$f$-DP), which quantifies privacy leakage between user pairs under random-walk communication, and Secret-based $f$-Local DP (Sec-$f$-LDP), which supports structured noise injection via shared secrets. By combining tools from $f$-DP theory and Markov chain concentration, our accounting framework captures privacy amplification arising from sparse communication, local iterations, and correlated noise. Experiments on synthetic and real datasets demonstrate that our methods yield consistently tighter $(\epsilon,\delta)$ bounds and improved utility compared to Rényi DP-based approaches, illustrating the benefits of $f$-DP in decentralized privacy accounting.
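To fix ideas about the random-walk communication pattern (a hypothetical toy, not the paper's algorithms or its PN-$f$-DP accountant): a token performs a random walk over the user graph, and each visited user folds a noisy local contribution into it. What one user's data reveals to another then depends on the paths the walk takes between them, which is the pairwise, graph-dependent leakage that the accounting must capture.

```python
# Hypothetical toy of random-walk decentralized aggregation: a token
# hops between neighboring users; each visit adds the local value plus
# fresh Gaussian noise to the token's running sum.
import numpy as np

def random_walk_aggregate(values, neighbors, sigma, n_steps, seed=0):
    rng = np.random.default_rng(seed)
    node = rng.integers(len(values))
    token, visits = 0.0, np.zeros(len(values), dtype=int)
    for _ in range(n_steps):
        token += values[node] + sigma * rng.standard_normal()
        visits[node] += 1
        node = rng.choice(neighbors[node])  # hop to a uniform random neighbor
    return token, visits

# Ring of 8 users: sparse communication, slow mixing between far pairs.
n = 8
neighbors = [np.array([(i - 1) % n, (i + 1) % n]) for i in range(n)]
token, visits = random_walk_aggregate(np.arange(n, dtype=float), neighbors,
                                      sigma=1.0, n_steps=200)
print("noisy sum:", round(token, 2), "| visit counts:", visits)
```

A frequently visited user composes many noisy releases, while far-apart pairs see each other only through long mixing paths; combining $f$-DP composition with Markov chain concentration, as the abstract describes, is what turns this structure into tight pairwise bounds.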
f-INE: A Hypothesis Testing Framework for Estimating Influence under Training Randomness
Panda, Subhodip, Tarsadiya, Dhruv, Sourav, Shashwat, P, Prathosh A., Karimireddy, Sai Praneeth
Influence estimation methods promise to explain and debug machine learning by estimating the impact of individual samples on the final model. Yet, existing methods collapse under training randomness: the same example may appear critical in one run and irrelevant in the next. Such instability undermines their use in data curation or cleanup, since it is unclear whether the correct data points were actually deleted or kept. To overcome this, we introduce *f-influence* -- a new influence estimation framework grounded in hypothesis testing that explicitly accounts for training randomness, and establish desirable properties that make it suitable for reliable influence estimation. We also design a highly efficient algorithm **f**-**IN**fluence **E**stimation (**f-INE**) that computes f-influence **in a single training run**. Finally, we scale up f-INE to estimate the influence of instruction tuning data on Llama-3.1-8B and show it can reliably detect poisoned samples that steer model opinions, demonstrating its utility for data cleanup and attributing model behavior.
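As a conceptual toy of the hypothesis-testing view (a brute-force multi-run sketch, explicitly not the single-run f-INE algorithm): retrain a small model many times with and without a candidate point, record one scalar test statistic per run, and score the point by how distinguishable the two statistic distributions are. An AUC near 0.5 means training randomness swamps the point's effect. All model and data choices below are assumptions for illustration.

```python
# Multi-run toy: influence of a point as the distinguishability of
# "trained with the point" vs "trained without it" across seeds.
import numpy as np

def run_stat(X, y, probe, seed, lr=0.5, steps=300):
    """One noisy SGD run of logistic regression; the probe score is the
    per-run test statistic."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))
        z = np.clip(X[i] @ w, -30.0, 30.0)  # avoid overflow in exp
        w -= lr * (1.0 / (1.0 + np.exp(-z)) - y[i]) * X[i]
    return probe @ w

def auc(a, b):
    """Mann-Whitney AUC between the two samples of run statistics."""
    a, b = np.asarray(a)[:, None], np.asarray(b)[None, :]
    return float((a > b).mean() + 0.5 * (a == b).mean())

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 5))
y = (X[:, 0] > 0).astype(float)
probe = np.ones(5)
stats_with = [run_stat(X, y, probe, seed=s) for s in range(40)]
stats_without = [run_stat(X[1:], y[1:], probe, seed=s) for s in range(40)]
print("distinguishability (AUC):", round(auc(stats_with, stats_without), 3))
```

Collapsing this many-runs test into a statistic computable from a single training run is the efficiency step that lets f-INE scale to models like Llama-3.1-8B.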