
 Carlini, Nicholas


Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness

arXiv.org Machine Learning

Adversarial examples are malicious inputs crafted to cause a model to misclassify them. In their most common instantiation, "perturbation-based" adversarial examples introduce changes to the input that leave its true label unchanged, yet result in a different model prediction. Conversely, "invariance-based" adversarial examples insert changes to the input that leave the model's prediction unaffected despite the underlying input's label having changed. In this paper, we demonstrate that robustness to perturbation-based adversarial examples is not only insufficient for general robustness, but worse, it can also increase the vulnerability of the model to invariance-based adversarial examples. We mount attacks that exploit excessive model invariance in directions relevant to the task and are able to find adversarial examples within the ℓ∞ ball. Excessive invariance is not limited to models trained to be robust to perturbation-based ℓ∞-norm adversaries. Accordingly, we call for a set of precise definitions that taxonomize and address each of these shortcomings in learning.
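
A minimal sketch of the distinction this abstract draws, assuming a hypothetical model_predict callable and an oracle_label stand-in for human judgment: within an ℓ∞ ball around an input, a candidate is perturbation-based if the prediction flips while the true label stays fixed, and invariance-based if the true label changes while the prediction stays fixed.

```python
import numpy as np

def linf_project(x_adv, x, eps):
    """Project a candidate back into the l-infinity ball of radius eps around x."""
    return np.clip(x_adv, x - eps, x + eps)

def classify_candidate(x, x_adv, model_predict, oracle_label, eps):
    """Label a candidate as perturbation-based, invariance-based, or neither.

    model_predict and oracle_label are hypothetical callables returning class ids;
    oracle_label stands in for a human judgment of the true label.
    """
    x_adv = linf_project(x_adv, x, eps)
    pred_changed = model_predict(x_adv) != model_predict(x)
    label_changed = oracle_label(x_adv) != oracle_label(x)
    if pred_changed and not label_changed:
        return "perturbation-based adversarial example"
    if label_changed and not pred_changed:
        return "invariance-based adversarial example"
    return "neither"
```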


Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition

arXiv.org Machine Learning

Adversarial examples are inputs to machine learning models designed by an adversary to cause an incorrect output. So far, adversarial examples have been studied most extensively in the image domain. In this domain, adversarial examples can be constructed by imperceptibly modifying images to cause misclassification, and are practical in the physical world. In contrast, current targeted adversarial examples applied to speech recognition systems have neither of these properties: humans can easily identify the adversarial perturbations, and they are not effective when played over-the-air. This paper makes advances on both of these fronts. First, we develop effectively imperceptible audio adversarial examples (verified through a human study) by leveraging the psychoacoustic principle of auditory masking, while retaining 100% targeted success rate on arbitrary full-sentence targets. Next, we make progress towards physical-world over-the-air audio adversarial examples by constructing perturbations which remain effective even after applying realistic simulated environmental distortions.
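
One way to read the auditory-masking idea is as a penalty that only charges the attack for perturbation energy rising above the masking threshold of the original audio, added to the usual speech-recognition loss. The sketch below is an illustration under that reading, not the paper's implementation; asr_loss_value, delta_psd_db, threshold_db, and alpha are all assumed names.

```python
import numpy as np

def masking_penalty(delta_psd_db, threshold_db):
    """Energy of the perturbation (per frame and frequency, in dB) that exceeds
    the masking threshold, i.e. the portion a listener could plausibly hear."""
    excess = delta_psd_db - threshold_db
    return np.sum(np.maximum(excess, 0.0) ** 2)

def attack_loss(asr_loss_value, delta_psd_db, threshold_db, alpha=0.05):
    """Two-term objective: drive the transcription toward the target while
    keeping the perturbation under the psychoacoustic masking threshold.
    All arguments are assumptions standing in for the quantities described
    in the abstract."""
    return asr_loss_value + alpha * masking_penalty(delta_psd_db, threshold_db)
```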


On Evaluating Adversarial Robustness

arXiv.org Machine Learning

Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect. We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls.
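
One piece of advice commonly drawn from this line of work, sketched below with assumed predict and attack callables: report the worst-case robust accuracy over every attack (and restart) tried rather than the average, so that a single successful attack on an input counts against the defense.

```python
def robust_accuracy(xs, ys, predict, attacks):
    """Worst-case accuracy: an input counts as robust only if every attack fails.

    predict(x) -> predicted label; attacks is a list of callables
    attack(x, y) -> adversarial candidate. Both are hypothetical stand-ins;
    the point is to report the minimum over attacks, not the average.
    """
    robust = 0
    for x, y in zip(xs, ys):
        if predict(x) != y:
            continue  # already misclassified: not robust
        if all(predict(attack(x, y)) == y for attack in attacks):
            robust += 1
    return robust / len(xs)
```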


Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples?

arXiv.org Machine Learning

INTERPRETABILITY" AmI (Attacks meet Interpretability) is an "attribute-steered" defense [3] to detect [1] adversarial examples [2] on facerecognition models.By applying interpretability techniques to a pre-trained neural network, AmI identifies "important" neurons. It then creates a second augmented neural network with the same parameters but increases the weight activations of important neurons. AmI rejects inputs where the original and augmented neural network disagree.


Unrestricted Adversarial Examples

arXiv.org Machine Learning

We introduce a two-player contest for evaluating the safety and robustness of machine learning systems, with a large prize pool. Unlike most prior work in ML robustness, which studies norm-constrained adversaries, we shift our focus to unconstrained adversaries. Defenders submit machine learning models, and try to achieve high accuracy and coverage on non-adversarial data while making no confident mistakes on adversarial inputs. Attackers try to subvert defenses by finding arbitrary unambiguous inputs where the model assigns an incorrect label with high confidence. We propose a simple unambiguous dataset ("bird-or-bicycle") to use as part of this contest. We hope this contest will help to more comprehensively evaluate the worst-case adversarial risk of machine learning models.
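
A sketch of the defender-side criterion, with an assumed output format and confidence threshold: a defense may effectively abstain by reporting low confidence (losing coverage), but a single wrong label issued with high confidence on an unambiguous adversarial input counts as a break.

```python
def confident_mistake(model_output, true_label, confidence_threshold=0.8):
    """A defense fails if it assigns the wrong label with high confidence.

    model_output is a hypothetical dict {"label": int, "confidence": float};
    reporting confidence below the threshold acts as an abstention, which
    reduces coverage but is never counted as a confident mistake.
    """
    confident = model_output["confidence"] >= confidence_threshold
    wrong = model_output["label"] != true_label
    return confident and wrong
```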


On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses

arXiv.org Machine Learning

Neural networks are known to be vulnerable to adversarial examples. In this note, we evaluate the two white-box defenses that appeared at CVPR 2018 and find they are ineffective: when applying existing techniques, we can reduce the accuracy of the defended models to 0%.


The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets

arXiv.org Artificial Intelligence

Machine learning models based on neural networks and deep learning are being rapidly adopted for many purposes. What those models learn, and what they may share, is a significant concern when the training data may contain secrets and the models are public -- e.g., when a model helps users compose text messages using models trained on all users' messages. This paper presents exposure: a simple-to-compute metric that can be applied to any deep learning model for measuring the memorization of secrets. Using this metric, we show how to extract those secrets efficiently using black-box API access. Further, we show that unintended memorization occurs early, is not due to over-fitting, and is a persistent issue across different types of models, hyperparameters, and training strategies. We experiment with both real-world models (e.g., a state-of-the-art translation model) and datasets (e.g., the Enron email dataset, which contains users' credit card numbers) to demonstrate both the utility of measuring exposure and the ability to extract secrets. Finally, we consider many defenses, finding some ineffective (like regularization), and others to lack guarantees. However, by instantiating our own differentially-private recurrent model, we validate that by appropriately investing in the use of state-of-the-art techniques, the problem can be resolved, with high utility.
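
A minimal sketch of the exposure computation as described here, assuming the model's negative log-likelihoods for the inserted canary and for all other candidate secrets of the same format are already available:

```python
import math

def exposure(canary_nll, candidate_nlls):
    """Exposure of a canary: log2 of the candidate-space size minus log2 of the
    canary's rank when all candidates are ordered by the model's negative
    log-likelihood.

    canary_nll is the model's NLL of the inserted secret; candidate_nlls are the
    NLLs of every other candidate from the same format (both assumed inputs).
    High exposure means the canary ranks near the top and is likely extractable.
    """
    total = len(candidate_nlls) + 1
    rank = 1 + sum(1 for nll in candidate_nlls if nll < canary_nll)
    return math.log2(total) - math.log2(rank)
```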


Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

arXiv.org Artificial Intelligence

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. For each of the three types of obfuscated gradients we discover, we describe characteristic behaviors of defenses exhibiting this effect and develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 8 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely and 1 partially.
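
One of the techniques the paper develops for defenses built on non-differentiable input preprocessing is BPDA (Backward Pass Differentiable Approximation). A rough ℓ∞ PGD step using it might look like the sketch below, where preprocess, grad_loss_at, and the step parameters are all assumptions for illustration.

```python
import numpy as np

def bpda_pgd_step(x, y, preprocess, grad_loss_at, eps, step_size, x_orig):
    """One PGD step using BPDA.

    preprocess is the defense's (possibly non-differentiable) input transform
    g(x) ~= x; grad_loss_at(u, y) returns the gradient of the classifier's loss
    at the already-preprocessed input u. Because g(x) ~= x, the backward pass
    simply reuses that gradient as if g were the identity.
    """
    g = grad_loss_at(preprocess(x), y)          # forward through g, backward as identity
    x = x + step_size * np.sign(g)              # ascend the loss (untargeted attack)
    x = np.clip(x, x_orig - eps, x_orig + eps)  # stay inside the l-infinity ball
    return np.clip(x, 0.0, 1.0)                 # stay in the valid pixel range
```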


cleverhans v2.0.0: an adversarial machine learning library

arXiv.org Machine Learning

cleverhans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models' performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial example construction are not comparable to each other, because a good result may indicate a robust model or it may merely indicate a weak implementation of the adversarial example construction procedure. This technical report is structured as follows. It first provides an overview of adversarial examples in machine learning and of the cleverhans software, then presents the core functionalities of the library, namely the attacks based on adversarial examples and the defenses to improve the robustness of machine learning models to these attacks, then describes how to report benchmark results using the library, and finally describes the versioning system.
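
Not the cleverhans API itself, but a minimal NumPy sketch of one of the standardized attacks the library implements (the Fast Gradient Sign Method), assuming a hypothetical grad_loss callable that returns the gradient of the model's loss with respect to its input:

```python
import numpy as np

def fgsm(x, y, grad_loss, eps):
    """Fast Gradient Sign Method: take a single eps-sized step in the sign of
    the loss gradient with respect to the input.

    grad_loss(x, y) is a hypothetical callable supplied by the caller; x is an
    image in [0, 1] and y its label.
    """
    x_adv = x + eps * np.sign(grad_loss(x, y))
    return np.clip(x_adv, 0.0, 1.0)  # keep the result a valid image
```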