AITopics | Mironov, Ilya

Collaborating Authors

Mironov, Ilya

BLIA: Detect model memorization in binary classification model through passive Label Inference attack

Khan, Mohammad Wahiduzzaman, Chen, Sheng, Mironov, Ilya, Zhang, Leizhen, Noor, Rabib

arXiv.org Artificial Intelligence, Mar-17-2025

Model memorization has implications for both the generalization capacity of machine learning models and the privacy of their training data. This paper investigates label memorization in binary classification models through two novel passive label inference attacks (BLIA). These attacks operate passively, relying solely on the outputs of pre-trained models, such as confidence scores and log-loss values, without interacting with or modifying the training process. By intentionally flipping 50% of the labels in controlled subsets, termed "canaries," we evaluate the extent of label memorization under two conditions: models trained without label differential privacy (Label-DP) and those trained with randomized response-based Label-DP. Despite the application of varying degrees of Label-DP, the proposed attacks consistently achieve success rates exceeding 50%, surpassing the baseline of random guessing and conclusively demonstrating that models memorize training labels, even when these labels are deliberately uncorrelated with the features.
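As a rough illustration of the setup this abstract describes, the Python sketch below flips half of a canary subset and applies randomized response to binary labels. The function names, the choice to flip exactly half of the canaries, and the keep probability e^eps / (1 + e^eps) (the standard binary randomized-response rule) are assumptions made for illustration, not the authors' code.

```python
import math
import random


def flip_canaries(labels, canary_idx, rng):
    """Flip the binary labels of half of the canary indices (chosen at random).

    Returns the modified label list and the set of indices that were flipped.
    Assumption: "flipping 50%" is read as flipping exactly half of the canaries.
    """
    flipped = set(rng.sample(list(canary_idx), len(canary_idx) // 2))
    out = [1 - y if i in flipped else y for i, y in enumerate(labels)]
    return out, flipped


def randomized_response(label, epsilon, rng):
    """Binary randomized response: keep the label w.p. e^eps / (1 + e^eps)."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return label if rng.random() < keep_prob else 1 - label


rng = random.Random(0)
labels = [0, 1] * 50
canaries = list(range(20))            # hypothetical canary subset
noisy, flipped = flip_canaries(labels, canaries, rng)
private = [randomized_response(y, epsilon=1.0, rng=rng) for y in noisy]
```

An attack in the spirit of the abstract would then try to guess, from model outputs alone, which canaries were flipped; a success rate above 50% indicates memorization.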

artificial intelligence, inference attack, machine learning, (17 more...)

2503.12801

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Green Federated Learning

Yousefpour, Ashkan, Guo, Shen, Shenoy, Ashish, Ghosh, Sayan, Stock, Pierre, Maeng, Kiwan, Krüger, Schalk-Willem, Rabbat, Michael, Wu, Carole-Jean, Mironov, Ilya

arXiv.org Artificial Intelligence, Aug-1-2023

The rapid progress of AI is fueled by increasingly large and computationally intensive machine learning models and datasets. As a consequence, the amount of compute used in training state-of-the-art models is exponentially increasing (doubling every 10 months between 2015 and 2022), resulting in a large carbon footprint. Federated Learning (FL) - a collaborative machine learning technique for training a centralized model using data of decentralized entities - can also be resource-intensive and have a significant carbon footprint, particularly when deployed at scale. Unlike centralized AI that can reliably tap into renewables at strategically placed data centers, cross-device FL may leverage as many as hundreds of millions of globally distributed end-user devices with diverse energy sources. Green AI is a novel and important research area where carbon footprint is regarded as an evaluation criterion for AI, alongside accuracy, convergence speed, and other metrics. In this paper, we propose the concept of Green FL, which involves optimizing FL parameters and making design choices to minimize carbon emissions consistent with competitive performance and training time. The contributions of this work are two-fold. First, we adopt a data-driven approach to quantify the carbon emissions of FL by directly measuring real-world at-scale FL tasks running on millions of phones. Second, we present challenges, guidelines, and lessons learned from studying the trade-off between energy efficiency, performance, and time-to-train in a production FL system. Our findings offer valuable insights into how FL can reduce its carbon footprint, and they provide a foundation for future research in the area of Green AI.

artificial intelligence, deep learning, machine learning, (11 more...)

2303.14604

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Oil & Gas (0.74)
Energy > Renewable (0.66)
Information Technology > Services (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Defending against Reconstruction Attacks with Rényi Differential Privacy

Stock, Pierre, Shilov, Igor, Mironov, Ilya, Sablayrolles, Alexandre

arXiv.org Machine Learning, Feb-15-2022

Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model. It has been recently shown that simple heuristics can reconstruct data samples from language models, making this threat scenario an important aspect of model release. Differential privacy is a known solution to such attacks, but is often used with a relatively large privacy budget (epsilon > 8) which does not translate to meaningful guarantees. In this paper we show that, for the same mechanism, we can derive privacy guarantees for reconstruction attacks that are better than the traditional ones from the literature. In particular, we show that larger privacy budgets do not protect against membership inference, but can still protect against extraction of rare secrets. We show experimentally that our guarantees hold against various language models, including GPT-2 finetuned on Wikitext-103.

artificial intelligence, Rényi differential privacy, reconstruction attack

2202.07623

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.93)

Rényi Differential Privacy of the Sampled Gaussian Mechanism

Mironov, Ilya, Talwar, Kunal, Zhang, Li

arXiv.org Machine Learning, Aug-27-2019

The Sampled Gaussian Mechanism (SGM), a composition of subsampling and additive Gaussian noise, has been successfully used in a number of machine learning applications. The mechanism's unexpected power is derived from privacy amplification by sampling, where the privacy cost of a single evaluation diminishes quadratically, rather than linearly, with the sampling rate. Characterizing the precise privacy properties of SGM motivated the development of several relaxations of the notion of differential privacy. This work unifies and fills in gaps in published results on SGM. We describe a numerically stable procedure for precise computation of SGM's Rényi Differential Privacy and prove a nearly tight (within a small constant factor) closed-form bound.
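To make the mechanism in this abstract concrete, here is a minimal Python sketch of a sampled Gaussian sum: each record is included independently with probability q, the included values are summed, and Gaussian noise is added. The function names, the scalar-valued records, and the assumption that each record has magnitude at most 1 (sensitivity 1) are illustrative choices, not the paper's implementation. The helper `gaussian_rdp` gives the well-known Rényi DP of the plain, unsampled Gaussian mechanism with sensitivity 1, namely alpha / (2 sigma^2); it does not compute the tighter subsampled bound that the paper studies.

```python
import random


def sampled_gaussian_sum(values, q, sigma, rng):
    """Poisson-subsample the records with rate q, sum them, add N(0, sigma^2).

    Assumption: every value lies in [-1, 1], so one record changes the
    subsampled sum by at most 1.
    """
    total = sum(v for v in values if rng.random() < q)
    return total + rng.gauss(0.0, sigma)


def gaussian_rdp(alpha, sigma):
    """Renyi DP at order alpha of the unsampled Gaussian mechanism (sensitivity 1)."""
    return alpha / (2.0 * sigma ** 2)


rng = random.Random(42)
records = [0.3, -0.7, 1.0, 0.2]       # hypothetical bounded records
release = sampled_gaussian_sum(records, q=0.01, sigma=1.1, rng=rng)
```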

artificial intelligence, machine learning, (15 more...)

1908.10530

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

That which we call private

Erlingsson, Úlfar, Mironov, Ilya, Raghunathan, Ananth, Song, Shuang

arXiv.org Artificial Intelligence, Aug-8-2019

A casual reader of the study by Jayaraman and Evans in USENIX Security 2019 might conclude that "relaxed definitions of differential privacy" should be avoided, because they "increase the measured privacy leakage." This note clarifies that their study is consistent with a different interpretation: namely, that the "relaxed definitions" are strict improvements that can tighten the epsilon upper-bound guarantees by orders of magnitude without changing the actual privacy loss. Practitioners should be careful not to equate real-world privacy with epsilon values without considering their context.

artificial intelligence, machine learning, privacy, (16 more...)

1908.03566

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.74)

Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity

Erlingsson, Úlfar, Feldman, Vitaly, Mironov, Ilya, Raghunathan, Ananth, Talwar, Kunal, Thakurta, Abhradeep

arXiv.org Machine Learning, Nov-29-2018

Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users' private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to a user's value. More fundamentally, by building on anonymity of the users' reports, we also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model of differential privacy. We show, via a new and general privacy amplification technique, that any permutation-invariant algorithm satisfying $\varepsilon$-local differential privacy will satisfy $(O(\varepsilon \sqrt{\log(1/\delta)/n}), \delta)$-central differential privacy. By this, we explain how the high noise and $\sqrt{n}$ overhead of LDP protocols is a consequence of them being significantly more private in the central model. As a practical corollary, our results imply that several LDP-based industrial deployments may have much lower privacy cost than their advertised $\varepsilon$ would indicate, at least if reports are anonymized.
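The shuffling idea in this abstract can be sketched in a few lines of Python: each user randomizes a bit locally, and the collected reports are then shuffled so that they can no longer be linked to users. The function names and the use of binary randomized response as the local randomizer are assumptions for illustration. `central_epsilon_scale` evaluates only the scaling term eps * sqrt(log(1/delta) / n) from the stated bound; the constant hidden in the O(.) is deliberately omitted, so the result shows how the bound shrinks with n and is not an actual privacy guarantee.

```python
import math
import random


def local_randomizer(bit, epsilon, rng):
    """eps-LDP binary randomized response (assumed local randomizer)."""
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep_prob else 1 - bit


def shuffled_reports(bits, epsilon, rng):
    """Randomize each user's bit locally, then shuffle to anonymize the reports."""
    reports = [local_randomizer(b, epsilon, rng) for b in bits]
    rng.shuffle(reports)
    return reports


def central_epsilon_scale(epsilon, n, delta):
    """Scaling term eps * sqrt(log(1/delta) / n); hidden constant not included."""
    return epsilon * math.sqrt(math.log(1.0 / delta) / n)


rng = random.Random(7)
bits = [rng.randint(0, 1) for _ in range(10_000)]   # hypothetical user values
reports = shuffled_reports(bits, epsilon=1.0, rng=rng)
scale = central_epsilon_scale(1.0, len(bits), delta=1e-6)
```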

artificial intelligence, differential privacy, machine learning, (18 more...)

1811.12469

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Communications (0.67)

Privacy Amplification by Iteration

Feldman, Vitaly, Mironov, Ilya, Talwar, Kunal, Thakurta, Abhradeep

arXiv.org Machine Learning, Aug-20-2018

Many commonly used learning algorithms work by iteratively updating an intermediate solution using one or a few data points in each iteration. Analysis of differential privacy for such algorithms often involves ensuring privacy of each step and then reasoning about the cumulative privacy cost of the algorithm. This is enabled by composition theorems for differential privacy that allow releasing of all the intermediate results. In this work, we demonstrate that for contractive iterations, not releasing the intermediate results strongly amplifies the privacy guarantees. We describe several applications of this new analysis technique to solving convex optimization problems via noisy stochastic gradient descent. For example, we demonstrate that a relatively small number of non-private data points from the same distribution can be used to close the gap between private and non-private convex optimization. In addition, we demonstrate that we can achieve guarantees similar to those obtainable using the privacy-amplification-by-sampling technique in several natural settings where that technique cannot be applied.
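The setting in this abstract, an iterative algorithm that touches one data point per step and releases only its final result, can be illustrated with a small Python sketch of noisy projected SGD in one dimension. The quadratic loss 0.5 * (w - x)^2, the interval projection, and all names and parameter values are assumptions chosen so that each update is a contraction; the sketch shows the shape of the procedure and makes no claim about the resulting privacy parameters.

```python
import random


def noisy_sgd_final_iterate(data, lr, sigma, rng, radius=1.0):
    """Run one noisy projected gradient step per data point; return only the last iterate.

    With 0 <= lr <= 1 the gradient step on 0.5 * (w - x)^2 is a contraction,
    and so is the projection onto [-radius, radius].
    """
    w = 0.0
    for x in data:
        w = w - lr * (w - x)                 # gradient step on a single point
        w += rng.gauss(0.0, sigma)           # Gaussian noise added at every step
        w = max(-radius, min(radius, w))     # projection onto the feasible interval
    return w                                 # intermediate iterates are never released


rng = random.Random(11)
points = [0.1, -0.4, 0.9, 0.3]               # hypothetical data, one point per iteration
final_w = noisy_sgd_final_iterate(points, lr=0.5, sigma=0.2, rng=rng)
```

Intuitively, points used early in the run are followed by many noisy contractive steps before anything is released, which is the effect the abstract refers to as amplification by iteration.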

artificial intelligence, machine learning, privacy, (15 more...)

1808.06651

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

Scalable Private Learning with PATE

Papernot, Nicolas, Song, Shuang, Mironov, Ilya, Raghunathan, Ananth, Talwar, Kunal, Erlingsson, Úlfar

arXiv.org Machine Learning, Feb-24-2018

The rapid adoption of machine learning has increased concerns about the privacy implications of machine learning models trained on sensitive data, such as medical records or other personal information. To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with intuitive privacy provided by training teachers on disjoint data and strong privacy guaranteed by noisy aggregation of teachers' answers. However, PATE has so far been evaluated only on simple classification tasks like MNIST, leaving unclear its utility when applied to larger-scale learning tasks and real-world datasets. In this work, we show how PATE can scale to learning tasks with large numbers of output classes and uncurated, imbalanced training data with errors. For this, we introduce new noisy aggregation mechanisms for teacher ensembles that are more selective and add less noise, and prove their tighter differential-privacy guarantees. Our new mechanisms build on two insights: the chance of teacher consensus is increased by using more concentrated noise and, lacking consensus, no answer need be given to a student. The consensus answers used are more likely to be correct, offer better intuitive privacy, and incur lower differential-privacy cost. Our evaluation shows our mechanisms improve on the original PATE on all measures, and scale to larger tasks with both high utility and very strong privacy ($\varepsilon$ < 1.0).
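The two insights named in this abstract (concentrated noise, and declining to answer without consensus) can be sketched as a selective noisy aggregator. The Python below is a hedged illustration: the function name, the use of Gaussian noise for both steps, and the threshold test on the top vote count are assumptions, not the paper's reference implementation.

```python
import random


def selective_noisy_argmax(votes, threshold, sigma_check, sigma_answer, rng):
    """Aggregate per-class teacher vote counts, answering only on consensus.

    Returns the index of the noisy plurality class, or None when the noisy
    top vote count falls below the threshold (no label goes to the student).
    """
    if max(votes) + rng.gauss(0.0, sigma_check) < threshold:
        return None                                   # no consensus: withhold the answer
    noisy = [v + rng.gauss(0.0, sigma_answer) for v in votes]
    return max(range(len(votes)), key=noisy.__getitem__)


rng = random.Random(0)
# Hypothetical ensemble of 100 teachers voting over 3 classes.
label = selective_noisy_argmax([88, 7, 5], threshold=60,
                               sigma_check=10.0, sigma_answer=5.0, rng=rng)
```

Queries on which the teachers disagree are the ones most likely to be skipped, which is how such a scheme can spend privacy budget mainly on answers that are likely to be correct.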

mechanism, neural network, survey article, (20 more...)

1802.08908

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

Abadi, Martín, Erlingsson, Úlfar, Goodfellow, Ian, McMahan, H. Brendan, Mironov, Ilya, Papernot, Nicolas, Talwar, Kunal, Zhang, Li

arXiv.org Machine Learning, Aug-26-2017

The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled by Saltzer and Schroeder in the 1970s.

artificial intelligence, neural network, training data, (16 more...)

1708.08022

Country:

North America > United States > Massachusetts (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.41)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

Deep Learning with Differential Privacy

Abadi, Martín, Chu, Andy, Goodfellow, Ian, McMahan, H. Brendan, Mironov, Ilya, Talwar, Kunal, Zhang, Li

arXiv.org Machine Learning, Oct-24-2016

Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
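The abstract does not spell out the training procedure, so the following Python sketch shows the update commonly described as differentially private SGD: clip each per-example gradient to a fixed L2 norm, sum, add Gaussian noise scaled to the clipping norm, and take an averaged gradient step. The function names, plain-list gradients, and parameter values are assumptions for illustration; privacy accounting is left out entirely.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale a gradient down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One noisy step: clip per-example gradients, sum, add noise, average, descend."""
    summed = [0.0] * len(params)
    for grad in per_example_grads:
        for j, g in enumerate(clip_gradient(grad, max_norm)):
            summed[j] += g
    batch_size = len(per_example_grads)
    noise_std = noise_multiplier * max_norm      # noise scale tied to the clipping norm
    noisy_mean = [(s + rng.gauss(0.0, noise_std)) / batch_size for s in summed]
    return [p - lr * g for p, g in zip(params, noisy_mean)]


rng = random.Random(0)
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.1, -0.2]]                # hypothetical per-example gradients
params = dp_sgd_step(params, grads, lr=0.1, max_norm=1.0,
                     noise_multiplier=1.1, rng=rng)
```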

accuracy, deep learning, neural network, (18 more...)

doi: 10.1145/2976749.2978318

1607.00133

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
