Boneh, Dan
ExpProof: Operationalizing Explanations for Confidential Models with ZKPs
Yadav, Chhavi, Laufer, Evan Monroe, Boneh, Dan, Chaudhuri, Kamalika
In principle, explanations are intended to increase trust in machine learning models and are often mandated by regulations. However, many settings in which explanations are demanded are adversarial in nature: the parties involved have misaligned interests and are incentivized to manipulate explanations for their own ends. As a result, explainability methods fail to be operational in such settings despite the demand (Bordt et al., 2022). In this paper, we take a step towards operationalizing explanations in adversarial scenarios with Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically, we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests.
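For readers unfamiliar with the underlying explanation algorithm, the following is a minimal, non-ZKP sketch of LIME: sample around a query point, weight the samples by proximity, and fit a weighted linear surrogate whose coefficients serve as feature importances. The sampling noise, kernel width, and black_box_model interface are illustrative assumptions; the ZKP-amenable variants studied in the paper modify this procedure so it can be proven inside a circuit.

import numpy as np

def lime_explain(black_box_model, x, n_samples=500, kernel_width=0.75, rng=None):
    """Minimal LIME-style sketch: sample around x, weight samples by proximity,
    and fit a weighted linear surrogate. Hyperparameters are placeholders."""
    rng = np.random.default_rng(0) if rng is None else rng
    d = x.shape[0]
    Z = x + rng.normal(scale=0.1, size=(n_samples, d))   # perturbations around the query point
    y = black_box_model(Z)                                # black-box scores for perturbed points
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)          # exponential proximity kernel
    A = np.hstack([Z, np.ones((n_samples, 1))])           # add an intercept column
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, y * sw[:, 0], rcond=None)  # weighted least squares
    return coef[:-1]                                      # per-feature importance weights

# Example: the surrogate recovers the weights of a linear "black box".
model = lambda Z: Z @ np.array([2.0, -1.0, 0.5])
print(lime_explain(model, np.array([1.0, 0.0, -1.0])))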
Optimistic Verifiable Training by Controlling Hardware Nondeterminism
Srivastava, Megha, Arora, Simran, Boneh, Dan
The increasing compute demands of AI systems have led to the emergence of services that train models on behalf of clients lacking the necessary resources. However, ensuring correctness of training and guarding against potential training-time attacks, such as data poisoning, poses challenges. Existing works on verifiable training largely fall into two classes: proof-based systems, which struggle to scale because of the overhead of cryptographic techniques, and "optimistic" methods that rely on a trusted third-party auditor who replicates the training process. A key challenge with the latter is that hardware nondeterminism between GPU types during training prevents an auditor from replicating the training process exactly, so such schemes are not robust. We propose a method that combines training in a higher precision than the target model, rounding after intermediate computation steps, and storing rounding decisions based on an adaptive thresholding procedure to successfully control for nondeterminism. Across three different NVIDIA GPUs (A40, Titan XP, RTX 2080 Ti), we achieve exact training replication at FP32 precision for both full training and fine-tuning of ResNet-50 (23M) and GPT-2 (117M) models. Our verifiable training scheme significantly decreases the storage and time costs compared to proof-based systems.
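A rough sketch of the core idea follows: compute in higher precision, round intermediates to the target precision, and record the rounding decisions that hardware nondeterminism could plausibly flip. The threshold and logging format below are illustrative assumptions, not the paper's adaptive thresholding procedure.

import numpy as np

def round_and_log(x64, threshold=0.01):
    """Round float64 intermediates to float32 and log the rounding direction
    for values that land near the midpoint between two adjacent float32
    values, where a tiny cross-GPU difference could flip the result.
    threshold (a fraction of a float32 ULP) is an illustrative placeholder."""
    x32 = x64.astype(np.float32)
    ulp = np.spacing(np.abs(x32)).astype(np.float64)   # float32 spacing at each value
    gap = np.abs(x64 - x32.astype(np.float64))         # distance to the rounded value
    ambiguous = np.abs(gap - 0.5 * ulp) < threshold * ulp
    # Record (index, rounded_up) so an auditor on different hardware can
    # replay exactly the same rounding decisions.
    log = [(int(i), bool(x32.flat[i] > x64.flat[i])) for i in np.flatnonzero(ambiguous)]
    return x32, log

activations = np.random.default_rng(0).normal(size=1024)
rounded, decisions = round_and_log(activations)
print(rounded.dtype, len(decisions))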
FairProof: Confidential and Certifiable Fairness for Neural Networks
Yadav, Chhavi, Chowdhury, Amrita Roy, Boneh, Dan, Chaudhuri, Kamalika
Machine learning models are increasingly used in societal applications, yet legal and privacy concerns demand that they very often be kept confidential. Consequently, there is growing distrust about the fairness properties of these models among consumers, who are often at the receiving end of model predictions. To this end, we propose FairProof, a system that uses Zero-Knowledge Proofs (a cryptographic primitive) to publicly verify the fairness of a model while maintaining confidentiality. We also propose a fairness certification algorithm for fully-connected neural networks that is well-suited to ZKPs and is used in this system. We implement FairProof in Gnark and demonstrate empirically that our system is practically feasible.

Recent usage of ML models in high-stakes societal applications (Khandani et al., 2010; Brennan et al., 2009; Datta et al., 2014) has raised serious concerns about their fairness (Angwin et al., 2016; Vigdor, 2019; Dastin, 2018; Wall & Schellmann, 2021). As a result, there is growing distrust in the minds of consumers at the receiving end of ML-based decisions (Dwork & Minow, 2022). To increase consumer trust, we need technology that enables public verification of the fairness properties of these models. A major barrier to such verification is that legal and privacy concerns demand that models be kept confidential by organizations. The resulting lack of verifiability can lead to potential misbehavior, such as model swapping, wherein a malicious entity uses different models for different customers, leading to unfair behavior. What is needed, therefore, is a solution that allows for public verification of the fairness of a model and ensures that the same model is used for every prediction (model uniformity), all while maintaining model confidentiality. The canonical approach to evaluating fairness is a statistics-based third-party audit (Yadav et al., 2022; Yan & Zhang, 2022; Pentyala et al., 2022).
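As a loose illustration of the property being verified (not the paper's certification algorithm), the sketch below runs an empirical individual-fairness check on a small fully-connected ReLU network: sample points near a query, flip an assumed-binary sensitive feature, and report whether the predicted class ever changes. FairProof instead computes a certificate in a ZKP-friendly way, without revealing the weights; all names and hyperparameters here are assumptions.

import numpy as np

def relu_net(x, weights, biases):
    """Forward pass of a small fully-connected ReLU network; returns logits."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return weights[-1] @ h + biases[-1]

def empirical_fairness_check(weights, biases, x, sensitive_idx, n_samples=1000, radius=0.1):
    """Sampling-based stand-in for a fairness certificate: perturb x within a
    small box, flip the (assumed binary) sensitive feature, and check whether
    the predicted class ever differs between the two versions."""
    rng = np.random.default_rng(0)
    base = x + rng.uniform(-radius, radius, size=(n_samples, x.shape[0]))
    flipped = base.copy()
    flipped[:, sensitive_idx] = 1.0 - flipped[:, sensitive_idx]
    same = [np.argmax(relu_net(a, weights, biases)) == np.argmax(relu_net(b, weights, biases))
            for a, b in zip(base, flipped)]
    return all(same)

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(1)
Ws, bs = [rng.normal(size=(8, 4)), rng.normal(size=(2, 8))], [np.zeros(8), np.zeros(2)]
print(empirical_fairness_check(Ws, bs, np.array([0.2, 0.4, 0.6, 1.0]), sensitive_idx=3))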
Differentially Private Learning Needs Better Features (or Much More Data)
Tramèr, Florian, Boneh, Dan
Machine learning (ML) models have been successfully applied to the analysis of sensitive user data such as medical images (Lundervold & Lundervold, 2019), text messages (Chen et al., 2019), or social media posts (Wu et al., 2016). Training these ML models under the framework of differential privacy (DP) (Dwork et al., 2006b; Chaudhuri et al., 2011; Shokri & Shmatikov, 2015; Abadi et al., 2016) can protect deployed classifiers against unintentional leakage of private training data (Shokri et al., 2017; Song et al., 2017; Carlini et al., 2019). Yet, training deep neural networks with strong DP guarantees comes at a significant cost in utility (Abadi et al., 2016; Yu et al., 2020; Bagdasaryan et al., 2019; Feldman, 2020). In fact, on many ML benchmarks the reported accuracy of private deep learning still falls short of "shallow" (non-private) techniques. For example, on CIFAR-10, Papernot et al. (2020b) train a neural network to 66.2% accuracy for a large DP budget of ε = 7.53, the highest accuracy we are aware of for this privacy budget. Yet, without privacy, higher accuracy is achievable with linear models and non-learned "handcrafted" features (e.g., Coates & Ng, 2012; Oyallon & Mallat, 2015). This leads to the central question of our work: can differentially private learning benefit from handcrafted features? We answer this question affirmatively by introducing simple and strong handcrafted baselines for differentially private learning that significantly improve the privacy-utility guarantees on canonical vision benchmarks.
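The recipe argued for here is simple: extract fixed, non-learned features, then train a linear model with DP-SGD. The sketch below shows the DP-SGD half (per-example gradient clipping plus Gaussian noise) for a logistic-regression classifier over precomputed features. Labels are assumed to be in {-1, +1}, the hyperparameters are placeholders, and the privacy accounting that turns the noise multiplier into an ε is omitted.

import numpy as np

def dp_sgd_logreg(features, labels, epochs=10, lr=0.5, batch=256,
                  clip=1.0, noise_mult=1.2, rng=None):
    """DP-SGD sketch for a linear classifier over fixed handcrafted features:
    clip each per-example gradient to L2 norm `clip`, add Gaussian noise with
    standard deviation noise_mult * clip, and take an averaged step."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = features.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
            X, y = features[idx], labels[idx]
            # Per-example logistic-loss gradients wrt w, shape (len(idx), d).
            g = (-y / (1.0 + np.exp(y * (X @ w))))[:, None] * X
            norms = np.linalg.norm(g, axis=1, keepdims=True)
            g = g / np.maximum(1.0, norms / clip)                  # per-example clipping
            noisy_sum = g.sum(axis=0) + rng.normal(scale=noise_mult * clip, size=d)
            w -= lr * noisy_sum / len(idx)
    return w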
How Relevant is the Turing Test in the Age of Sophisbots?
Boneh, Dan, Grotto, Andrew J., McDaniel, Patrick, Papernot, Nicolas
Popular culture has contemplated societies of thinking machines for generations, envisioning futures from utopian to dystopian. These futures are, arguably, here now: we find ourselves at the doorstep of technology that can at least simulate the appearance of thinking, acting, and feeling. The real question is: now what?
Adversarial Training and Robustness for Multiple Perturbations
Tramèr, Florian, Boneh, Dan
Defenses against adversarial examples, such as adversarial training, are typically tailored to a single perturbation type (e.g., small $\ell_\infty$-noise). For other perturbations, these defenses offer no guarantees and, at times, even increase the model's vulnerability. Our aim is to understand the reasons underlying this robustness trade-off, and to train models that are simultaneously robust to multiple perturbation types. We prove that a trade-off in robustness to different types of $\ell_p$-bounded and spatial perturbations must exist in a natural and simple statistical setting. We corroborate our formal analysis by demonstrating similar robustness trade-offs on MNIST and CIFAR-10. Building upon new multi-perturbation adversarial training schemes, and a novel efficient attack for finding $\ell_1$-bounded adversarial examples, we show that no model trained against multiple attacks achieves robustness competitive with that of models trained on each attack individually. In particular, we uncover a pernicious gradient-masking phenomenon on MNIST, which causes adversarial training with first-order $\ell_\infty$, $\ell_1$, and $\ell_2$ adversaries to achieve merely $50\%$ accuracy. Our results question the viability and computational scalability of extending adversarial robustness, and adversarial training, to multiple perturbation types.
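As a toy illustration of the "max" strategy for multi-perturbation adversarial training (train on whichever attack currently does the most damage), the sketch below uses a linear model with analytic gradients and single-step $\ell_\infty$ and $\ell_2$ attacks; the experiments in the paper use multi-step PGD-style adversaries on neural networks, and all budgets here are placeholders.

import numpy as np

def loss(w, x, y):                      # logistic loss, labels in {-1, +1}
    return np.log1p(np.exp(-y * (w @ x)))

def grad_x(w, x, y):                    # gradient of the loss wrt the input x
    return -y * w / (1.0 + np.exp(y * (w @ x)))

def grad_w(w, x, y):                    # gradient of the loss wrt the weights w
    return -y * x / (1.0 + np.exp(y * (w @ x)))

def worst_of_two(w, x, y, eps_inf=0.1, eps_2=0.5):
    """'Max' strategy: craft one l_inf-bounded and one l_2-bounded adversarial
    example (a single gradient step each, for brevity) and keep whichever one
    achieves the higher loss."""
    g = grad_x(w, x, y)
    x_inf = x + eps_inf * np.sign(g)
    x_l2 = x + eps_2 * g / (np.linalg.norm(g) + 1e-12)
    return x_inf if loss(w, x_inf, y) > loss(w, x_l2, y) else x_l2

def multi_perturbation_adv_train(X, Y, epochs=20, lr=0.1):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            w -= lr * grad_w(w, worst_of_two(w, x, y), y)
    return w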
Ad-versarial: Defeating Perceptual Ad-Blocking
Tramèr, Florian, Dupré, Pascal, Rusak, Gili, Pellegrino, Giancarlo, Boneh, Dan
Perceptual ad-blocking is a novel approach that uses visual cues to detect online advertisements. Compared to classical filter lists, perceptual ad-blocking is believed to be less prone to an arms race with web publishers and ad-networks. In this work we use techniques from adversarial machine learning to demonstrate that this may not be the case. We show that perceptual ad-blocking engenders a new arms race that likely disfavors ad-blockers. Unexpectedly, perceptual ad-blocking can also introduce new vulnerabilities that let an attacker bypass web security boundaries and mount DDoS attacks. We first analyze the design space of perceptual ad-blockers and present a unified architecture that incorporates prior academic and commercial work. We then explore a variety of attacks on the ad-blocker's full visual-detection pipeline that enable publishers or ad-networks to evade or detect ad-blocking, and at times even abuse its high privilege level to bypass web security boundaries. Our attacks exploit the unreasonably strong threat model that perceptual ad-blockers must survive. Finally, we evaluate a concrete set of attacks on an ad-blocker's internal ad-classifier by instantiating adversarial examples for visual systems in a real web-security context. For six ad-detection techniques, we create perturbed ads, ad-disclosures, and native web content that mislead perceptual ad-blocking with 100% success rates. For example, we demonstrate how a malicious user can upload adversarial content (e.g., a perturbed image in a Facebook post) that fools the ad-blocker into removing other users' non-ad content.
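That last attack direction is easy to make concrete: perturb benign content within a small $\ell_\infty$ budget so that a toy linear "ad vs. non-ad" scorer flags it as an ad, which is the flavor of attack that lets one user trigger the ad-blocker against another user's content. The linear scorer and the budget below are illustrative stand-ins, not the classifiers evaluated in the paper.

import numpy as np

def false_positive_attack(w, b, x, eps=0.03, steps=40, step_size=0.005):
    """Perturb a benign image x (flattened pixels in [0, 1]) within an l_inf
    ball of radius eps so the linear score w @ x + b crosses the 'ad' threshold."""
    x_adv = x.copy()
    for _ in range(steps):
        if w @ x_adv + b > 0:                   # already classified as an ad
            break
        x_adv = x_adv + step_size * np.sign(w)  # ascend the ad score
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within the l_inf budget
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv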
Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware
Tramer, Florian, Boneh, Dan
As Machine Learning (ML) gets applied to security-critical or sensitive domains, there is a growing need for integrity and privacy guarantees for ML computations running in untrusted environments. A pragmatic solution comes from Trusted Execution Environments, which use hardware and software protections to isolate sensitive computations from the untrusted software stack. However, these isolation guarantees come at a price in performance, compared to untrusted alternatives. This paper initiates the study of high-performance execution of Deep Neural Networks (DNNs) in trusted environments by efficiently partitioning computations between trusted and untrusted devices. Building upon a simple secure outsourcing scheme for matrix multiplication, we propose Slalom, a framework that outsources execution of all linear layers in a DNN from any trusted environment (e.g., SGX, TrustZone or Sanctum) to a faster co-located device. We evaluate Slalom by executing DNNs in an Intel SGX enclave, which selectively outsources work to an untrusted GPU. For two canonical DNNs, VGG16 and MobileNet, we obtain 20x and 6x increases in throughput for verifiable inference, and 10x and 3.5x for verifiable and private inference.
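Slalom's integrity check builds on Freivalds-style verification of matrix products. The float sketch below conveys the idea: the trusted side verifies that Y equals X @ W with a few random matrix-vector products instead of recomputing the product. The actual protocol works over a finite field and additionally blinds the inputs for privacy, which this sketch glosses over.

import numpy as np

def freivalds_check(X, W, Y, trials=10, rng=None):
    """Probabilistic check that Y == X @ W without recomputing the product:
    for a random 0/1 vector r, compare X @ (W @ r) against Y @ r. Each trial
    catches an incorrect Y with high probability."""
    rng = np.random.default_rng(0) if rng is None else rng
    for _ in range(trials):
        r = rng.integers(0, 2, size=W.shape[1]).astype(np.float64)
        if not np.allclose(X @ (W @ r), Y @ r, atol=1e-6):
            return False
    return True

# The trusted side sends X to the GPU, receives Y, and accepts only if the check passes.
rng = np.random.default_rng(1)
X, W = rng.normal(size=(64, 128)), rng.normal(size=(128, 32))
Y = X @ W
assert freivalds_check(X, W, Y)
Y[0] += 1.0                     # a tampered result is rejected with high probability
assert not freivalds_check(X, W, Y)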
Ensemble Adversarial Training: Attacks and Defenses
Tramèr, Florian, Kurakin, Alexey, Papernot, Nicolas, Goodfellow, Ian, Boneh, Dan, McDaniel, Patrick
Adversarial examples are perturbed inputs designed to fool machine learning models. Adversarial training injects such examples into training data to increase robustness. To scale this technique to large datasets, perturbations are crafted using fast single-step methods that maximize a linear approximation of the model's loss. We show that this form of adversarial training converges to a degenerate global minimum, wherein small curvature artifacts near the data points obfuscate a linear approximation of the loss. The model thus learns to generate weak perturbations, rather than defend against strong ones. As a result, we find that adversarial training remains vulnerable to black-box attacks, where we transfer perturbations computed on undefended models, as well as to a powerful novel single-step attack that escapes the non-smooth vicinity of the input data via a small random step. We further introduce Ensemble Adversarial Training, a technique that augments training data with perturbations transferred from other models. On ImageNet, Ensemble Adversarial Training yields models with strong robustness to black-box attacks. In particular, our most robust model won the first round of the NIPS 2017 competition on Defenses against Adversarial Attacks.
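The single-step attack described above (a small random step to escape the non-smooth vicinity of the input, followed by an FGSM step with the remaining budget) is easy to sketch. In the snippet below, grad_fn is an assumed callable returning the gradient of the loss with respect to the input, and the eps/alpha values are placeholders.

import numpy as np

def r_fgsm(x, y, grad_fn, eps=0.3, alpha=0.15, rng=None):
    """R+FGSM sketch: take a small random step away from the data point, then
    a single FGSM step with the remaining l_inf budget. grad_fn(x, y) returns
    the loss gradient wrt the input; inputs are assumed to live in [0, 1]."""
    rng = np.random.default_rng(0) if rng is None else rng
    x_rand = x + alpha * np.sign(rng.normal(size=x.shape))       # random start
    g = grad_fn(x_rand, y)
    return np.clip(x_rand + (eps - alpha) * np.sign(g), 0.0, 1.0)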
The Space of Transferable Adversarial Examples
Tramèr, Florian, Papernot, Nicolas, Goodfellow, Ian, Boneh, Dan, McDaniel, Patrick
Adversarial examples are maliciously perturbed inputs designed to mislead machine learning (ML) models at test-time. They often transfer: the same adversarial example fools more than one model. In this work, we propose novel methods for estimating the previously unknown dimensionality of the space of adversarial inputs. We find that adversarial examples span a contiguous subspace of large (~25) dimensionality. Adversarial subspaces with higher dimensionality are more likely to intersect. We find that for two different models, a significant fraction of their subspaces is shared, thus enabling transferability. In the first quantitative analysis of the similarity of different models' decision boundaries, we show that these boundaries are actually close in arbitrary directions, whether adversarial or benign. We conclude by formally studying the limits of transferability. We derive (1) sufficient conditions on the data distribution that imply transferability for simple model classes and (2) examples of scenarios in which transfer does not occur. These findings indicate that it may be possible to design defenses against transfer-based attacks, even for models that are vulnerable to direct attacks.
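A crude way to probe this dimensionality estimate empirically is sketched below: build an orthonormal set of directions whose first element is aligned with the loss gradient, and count how many of them, scaled to a fixed $\ell_2$ budget, independently flip the prediction. The paper's construction chooses gradient-aligned orthogonal directions more carefully; predict, grad_fn, and the budget here are assumptions.

import numpy as np

def adversarial_subspace_probe(x, y, predict, grad_fn, eps=1.5, k=25, rng=None):
    """Count how many orthonormal directions (the first aligned with the loss
    gradient) flip the model's prediction when scaled to an l_2 budget of eps.
    x is a flat feature vector; requires k <= x.size for QR to yield k columns."""
    rng = np.random.default_rng(0) if rng is None else rng
    basis = rng.normal(size=(x.size, k))
    basis[:, 0] = grad_fn(x, y)            # align the first direction with the gradient
    Q, _ = np.linalg.qr(basis)             # orthonormalize the candidate directions
    return sum(int(predict(x + eps * Q[:, i]) != y) for i in range(k))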