AITopics | privacy auditing

Collaborating Authors

privacy auditing

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Privacy Auditing with One (1) Training Run

Neural Information Processing SystemsDec-26-2025, 10:10:10 GMT

We propose a scheme for auditing differentially private machine learning systems with a single training run. This exploits the parallelism of being able to add or remove multiple training examples independently. We analyze this using the connection between differential privacy and statistical generalization, which avoids the cost of group privacy. Our auditing scheme requires minimal assumptions about the algorithm and can be applied in the black-box or white-box setting. We demonstrate the effectiveness of our framework by applying it to DP-SGD, where we can achieve meaningful empirical privacy lower bounds by training only one model. In contrast, standard methods would require training hundreds of models.

name change, privacy auditing, training run, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.62)

Add feedback

PrivacyGuard: A Modular Framework for Privacy Auditing in Machine Learning

Melis, Luca, Grange, Matthew, Kalemaj, Iden, Chadha, Karan, Hu, Shengyuan, Kashtelyan, Elena, Bullock, Will

arXiv.org Artificial IntelligenceOct-28-2025

The increasing deployment of Machine Learning (ML) models in sensitive domains motivates the need for robust, practical privacy assessment tools. PrivacyGuard is a comprehensive tool for empirical differential privacy (DP) analysis, designed to evaluate privacy risks in ML models through state-of-the-art inference attacks and advanced privacy measurement techniques. To this end, PrivacyGuard implements a diverse suite of privacy attack -- including membership inference , extraction, and reconstruction attacks -- enabling both off-the-shelf and highly configurable privacy analyses. Its modular architecture allows for the seamless integration of new attacks, and privacy metrics, supporting rapid adaptation to emerging research advances. We make PrivacyGuard available at https://github.com/facebookresearch/PrivacyGuard.

artificial intelligence, machine learning, privacyguard, (16 more...)

arXiv.org Artificial Intelligence

2510.23427

Country: North America > United States (0.68)

Genre: Research Report (0.83)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Optimizing Canaries for Privacy Auditing with Metagradient Descent

Boglioni, Matteo, Liu, Terrance, Ilyas, Andrew, Wu, Zhiwei Steven

arXiv.org Artificial IntelligenceJul-22-2025

In this work we study black-box privacy auditing, where the goal is to lower bound the privacy parameter of a differentially private learning algorithm using only the algorithm's outputs (i.e., final trained model). For DP-SGD (the most successful method for training differentially private deep learning models), the canonical approach auditing uses membership inference-an auditor comes with a small set of special "canary" examples, inserts a random subset of them into the training set, and then tries to discern which of their canaries were included in the training set (typically via a membership inference attack). The auditor's success rate then provides a lower bound on the privacy parameters of the learning algorithm. Our main contribution is a method for optimizing the auditor's canary set to improve privacy auditing, leveraging recent work on metagradient optimization. Our empirical evaluation demonstrates that by using such optimized canaries, we can improve empirical lower bounds for differentially private image classification models by over 2x in certain instances. Furthermore, we demonstrate that our method is transferable and efficient: canaries optimized for non-private SGD with a small model architecture remain effective when auditing larger models trained with DP-SGD.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2507.15836

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining

Neural Information Processing SystemsMay-27-2025, 04:01:56 GMT

We present PANORAMIA, a privacy leakage measurement framework for machine learning models that relies on membership inference attacks using generated data as non-members. By relying on generated non-member data, PANORAMIA eliminates the common dependency of privacy measurement tools on in-distribution non-member data. As a result, PANORAMIA does not modify the model, training data, or training process, and only requires access to a subset of the training data. We evaluate PANORAMIA on ML models for image and tabular data classification, as well as on large-scale language models.

artificial intelligence, machine learning, panoramia, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Privacy Auditing with One (1) Training Run

Neural Information Processing SystemsJan-19-2025, 16:45:17 GMT

privacy auditing, training run

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)

Add feedback

Adversarial Sample-Based Approach for Tighter Privacy Auditing in Final Model-Only Scenarios

Yoon, Sangyeon, Jeung, Wonje, No, Albert

arXiv.org Artificial IntelligenceDec-2-2024

Auditing Differentially Private Stochastic Gradient Descent (DP-SGD) in the final model setting is challenging and often results in empirical lower bounds that are significantly looser than theoretical privacy guarantees. We introduce a novel auditing method that achieves tighter empirical lower bounds without additional assumptions by crafting worst-case adversarial samples through loss-based inputspace auditing. Our approach surpasses traditional canary-based heuristics and is effective in both white-box and black-box scenarios. Specifically, with a theoretical privacy budget of ε = 10.0, our method achieves empirical lower bounds of 6.68 in white-box settings and 4.51 in black-box settings, compared to the baseline of 4.11 for MNIST. Moreover, we demonstrate that significant privacy auditing results can be achieved using in-distribution (ID) samples as canaries, obtaining an empirical lower bound of 4.33 where traditional methods produce near-zero leakage detection. Our work offers a practical framework for reliable and accurate privacy auditing in differentially private machine learning.

artificial intelligence, machine learning, privacy, (14 more...)

arXiv.org Artificial Intelligence

2412.01756

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

Steinke, Thomas, Nasr, Milad, Ganesh, Arun, Balle, Borja, Choquette-Choo, Christopher A., Jagielski, Matthew, Hayes, Jamie, Thakurta, Abhradeep Guha, Smith, Adam, Terzis, Andreas

arXiv.org Artificial IntelligenceOct-10-2024

We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks.

gradient, iteration, privacy, (14 more...)

arXiv.org Artificial Intelligence

2410.06186

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Cebere, Tudor, Bellet, Aurélien, Papernot, Nicolas

arXiv.org Artificial IntelligenceMay-23-2024

Machine learning models can be trained with formal privacy guarantees via differentially private optimizers such as DP-SGD. In this work, we study such privacy guarantees when the adversary only accesses the final model, i.e., intermediate model updates are not released. In the existing literature, this "hidden state" threat model exhibits a significant gap between the lower bound provided by empirical privacy auditing and the theoretical upper bound provided by privacy accounting. To challenge this gap, we propose to audit this threat model with adversaries that craft a gradient sequence to maximize the privacy loss of the final model without accessing intermediate models. We demonstrate experimentally how this approach consistently outperforms prior attempts at auditing the hidden state model. When the crafted gradient is inserted at every optimization step, our results imply that releasing only the final model does not amplify privacy, providing a novel negative result. On the other hand, when the crafted gradient is not inserted at every step, we show strong evidence that a privacy amplification phenomenon emerges in the general non-convex setting (albeit weaker than in convex regimes), suggesting that existing privacy upper bounds can be improved.

adversary, dp-sgd, gradient, (15 more...)

arXiv.org Artificial Intelligence

2405.14457

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > France > Occitanie > Hérault > Montpellier (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Dikaios: Privacy Auditing of Algorithmic Fairness via Attribute Inference Attacks

Aalmoes, Jan, Duddu, Vasisht, Boutet, Antoine

arXiv.org Artificial IntelligenceNov-24-2022

Machine learning (ML) models have been deployed for high-stakes applications. Due to class imbalance in the sensitive attribute observed in the datasets, ML models are unfair on minority subgroups identified by a sensitive attribute, such as race and sex. In-processing fairness algorithms ensure model predictions are independent of sensitive attribute. Furthermore, ML models are vulnerable to attribute inference attacks where an adversary can identify the values of sensitive attribute by exploiting their distinguishable model predictions. Despite privacy and fairness being important pillars of trustworthy ML, the privacy risk introduced by fairness algorithms with respect to attribute leakage has not been studied. We identify attribute inference attacks as an effective measure for auditing blackbox fairness algorithms to enable model builder to account for privacy and fairness in the model design. We proposed Dikaios, a privacy auditing tool for fairness algorithms for model builders which leveraged a new effective attribute inference attack that account for the class imbalance in sensitive attributes through an adaptive prediction threshold. We evaluated Dikaios to perform a privacy audit of two in-processing fairness algorithms over five datasets. We show that our attribute inference attacks with adaptive prediction threshold significantly outperform prior attacks. We highlighted the limitations of in-processing fairness algorithms to ensure indistinguishable predictions across different values of sensitive attributes. Indeed, the attribute privacy risk of these in-processing fairness schemes is highly variable according to the proportion of the sensitive attributes in the dataset. This unpredictable effect of fairness mechanisms on the attribute privacy risk is an important limitation on their utilization which has to be accounted by the model builder.

artificial intelligence, attribute inference attack, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2202.02242

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback