Fernandes, Earlence
Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API
Labunets, Andrey, Pandya, Nishit V., Hooda, Ashish, Fu, Xiaohan, Fernandes, Earlence
We surface a new threat to closed-weights Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned from the remote fine-tuning interface to guide the search for adversarial prompts. The fine-tuning interface is hosted by an LLM vendor and allows developers to fine-tune LLMs for their tasks, thus providing utility, but also exposes enough information for an attacker to compute adversarial prompts. Through an experimental analysis, we characterize the loss-like values returned by the Gemini fine-tuning API and demonstrate that they provide a useful signal for discrete optimization of adversarial prompts using a greedy search algorithm. Using the PurpleLlama prompt injection benchmark, we demonstrate attack success rates between 65% and 82% on Google's Gemini family of LLMs. These attacks exploit the classic utility-security tradeoff: the fine-tuning interface provides a useful feature for developers but also exposes the LLMs to powerful attacks.
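The core of this attack is a greedy discrete search driven by a scalar loss signal. The sketch below illustrates that pattern under assumptions: loss_oracle is a hypothetical stand-in for the loss-like values returned by a remote fine-tuning API, and the vocabulary, suffix length, and candidate-sampling strategy are illustrative rather than the paper's actual procedure.

```python
import random

def greedy_prompt_search(loss_oracle, vocab, suffix_len=10, iters=200, candidates=16, seed=0):
    """Greedy coordinate search for an adversarial suffix.

    loss_oracle: callable mapping a list of tokens to a scalar loss
                 (lower = closer to the attacker's target behavior).
    vocab:       list of candidate tokens to substitute.
    """
    rng = random.Random(seed)
    suffix = [rng.choice(vocab) for _ in range(suffix_len)]
    best_loss = loss_oracle(suffix)

    for _ in range(iters):
        pos = rng.randrange(suffix_len)                       # pick one position to mutate
        trials = rng.sample(vocab, k=min(candidates, len(vocab)))
        for tok in trials:                                    # try a handful of substitutions
            cand = suffix[:pos] + [tok] + suffix[pos + 1:]
            loss = loss_oracle(cand)                          # one remote query per candidate
            if loss < best_loss:                              # keep a substitution only if loss improves
                suffix, best_loss = cand, loss
    return suffix, best_loss

if __name__ == "__main__":
    # Toy stand-in oracle: prefers suffixes containing the token "override".
    vocab = ["please", "ignore", "override", "system", "tool", "now", "##", "::"]
    oracle = lambda toks: 1.0 - toks.count("override") / len(toks)
    suffix, loss = greedy_prompt_search(oracle, vocab)
    print(" ".join(suffix), loss)
```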
Misusing Tools in Large Language Models With Visual Adversarial Examples
Fu, Xiaohan, Wang, Zihan, Li, Shuheng, Gupta, Rajesh K., Mireshghallah, Niloofar, Berg-Kirkpatrick, Taylor, Fernandes, Earlence
Large Language Models (LLMs) are being enhanced with the ability to use tools and to process multiple modalities. These new capabilities bring new benefits and also new security risks. In this work, we show that an attacker can use visual adversarial examples to cause attacker-desired tool usage. For example, the attacker could cause a victim LLM to delete calendar events, leak private conversations and book hotels. Different from prior work, our attacks can affect the confidentiality and integrity of user resources connected to the LLM while being stealthy and generalizable to multiple input prompts. We construct these attacks using gradient-based adversarial training and characterize performance along multiple dimensions. We find that our adversarial images can manipulate the LLM to invoke tools following real-world syntax almost always (~98%) while maintaining high similarity to clean images (~0.9 SSIM). Furthermore, using human scoring and automated metrics, we find that the attacks do not noticeably affect the conversation (and its semantics) between the user and the LLM.
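As a rough illustration of the gradient-based construction described above, the following PGD-style sketch optimizes a bounded image perturbation so that a multimodal model assigns low loss to a target tool-invocation string. The model(image, labels=...) and tokenizer interfaces are assumed placeholders, not the victim model's real API, and the dummy objects in the demo exist only to make the sketch run end to end.

```python
import torch

def craft_tool_misuse_image(model, tokenizer, clean_image, target_call,
                            steps=300, lr=1e-2, eps=8 / 255):
    """Optimize a bounded image perturbation so the victim model assigns low
    language-modeling loss to `target_call` (e.g. a calendar-deletion tool call).

    model(image, labels=ids) is assumed to return that scalar loss; the tokenizer
    is assumed to follow the Hugging Face convention. Both are placeholders.
    """
    target_ids = tokenizer(target_call, return_tensors="pt").input_ids
    delta = torch.zeros_like(clean_image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        adv = (clean_image + delta).clamp(0, 1)        # keep pixel values valid
        loss = model(adv, labels=target_ids)           # loss of emitting the target tool call
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                          # L-infinity bound keeps the image near the clean one
            delta.clamp_(-eps, eps)
    return (clean_image + delta).clamp(0, 1).detach()

if __name__ == "__main__":
    # Dummy stand-ins so the sketch runs; they carry no real semantics.
    class DummyTokenizer:
        def __call__(self, text, return_tensors=None):
            return type("Enc", (), {"input_ids": torch.tensor([[0]])})()
    dummy_model = lambda img, labels: (img.mean() - 0.7) ** 2
    clean = torch.rand(1, 3, 32, 32)
    adv = craft_tool_misuse_image(dummy_model, DummyTokenizer(), clean,
                                  "calendar.delete(event_id=42)", steps=50)
    print(float((adv - clean).abs().max()))            # stays within the eps budget
```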
SkillFence: A Systems Approach to Practically Mitigating Voice-Based Confusion Attacks
Hooda, Ashish, Wallace, Matthew, Jhunjhunwalla, Kushal, Fernandes, Earlence, Fawaz, Kassem
Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intends to use execute in response to voice commands. Our key insight is that we can interpret a user's intentions by analyzing their activity on counterpart systems of the web and smartphones. For example, the Lyft ride-sharing Alexa skill has an Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguity in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (N = 116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of the skills that a user will need with a false acceptance rate of 19.83%.
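A minimal sketch of the counterpart-matching idea, assuming an illustrative skill-catalog format and using string similarity as a crude stand-in for SkillFence's actual invocation-name analysis; the domains and package names are examples, not the system's real data sources.

```python
from difflib import SequenceMatcher

def resolve_skill(spoken_name, skill_catalog, counterpart_activity, threshold=0.6):
    """Pick the skill a user most likely intends.

    skill_catalog:        {skill_id: {"name": str, "counterparts": set of domains/app packages}}
    counterpart_activity: domains / app packages observed in the user's web and
                          smartphone activity (the "counterpart systems").
    """
    scored = []
    for skill_id, info in skill_catalog.items():
        name_sim = SequenceMatcher(None, spoken_name.lower(), info["name"].lower()).ratio()
        has_counterpart = bool(info["counterparts"] & counterpart_activity)
        scored.append((name_sim, has_counterpart, skill_id))

    # Among phonetically plausible candidates, prefer skills backed by counterpart activity.
    plausible = [s for s in scored if s[0] >= threshold]
    if not plausible:
        return None
    plausible.sort(key=lambda s: (s[1], s[0]), reverse=True)
    return plausible[0][2]

if __name__ == "__main__":
    catalog = {
        "lyft":      {"name": "Lyft",      "counterparts": {"lyft.com", "me.lyft.android"}},
        "lift-quiz": {"name": "Lift Quiz", "counterparts": {"liftquiz.example"}},
    }
    activity = {"lyft.com", "gmail.com"}
    print(resolve_skill("lift", catalog, activity))   # -> "lyft"
```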
Exploring Adversarial Robustness of Deep Metric Learning
Panum, Thomas Kobber, Wang, Zi, Kan, Pengyu, Fernandes, Earlence, Jha, Somesh
Deep Metric Learning (DML), a widely-used technique, involves learning a distance metric between pairs of samples. DML uses deep neural architectures to learn semantic embeddings of the input, where the distance between similar examples is small while dissimilar ones are far apart. Although the underlying neural networks produce good accuracy on naturally occurring samples, they are vulnerable to adversarially-perturbed samples that reduce performance. We take a first step towards training robust DML models and tackle the primary challenge of the metric losses being dependent on the samples in a mini-batch, unlike standard losses that only depend on the specific input-output pair.
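A toy sketch of the mini-batch coupling issue and one robust-optimization response, assuming a simple triplet-style metric loss and PGD inner maximization; this illustrates the general technique, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def batch_triplet_loss(emb, labels, margin=0.2):
    """Metric loss over a whole mini-batch: pull same-class embeddings together,
    push different-class embeddings apart. Unlike a per-example loss, it couples
    every sample in the batch, which is the challenge noted above."""
    d = torch.cdist(emb, emb)                                   # pairwise embedding distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool)
    pos = d[same & off_diag]                                    # within-class distances
    neg = d[~same]                                              # cross-class distances
    return F.relu(pos.mean() - neg.mean() + margin)

def adversarial_dml_step(model, x, labels, opt, eps=0.03, alpha=0.01, pgd_steps=5):
    """One robust-training step: PGD maximizes the batch metric loss, then the
    embedding network is updated on the perturbed batch."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(pgd_steps):                                  # inner maximization
        loss = batch_triplet_loss(model(x + delta), labels)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    opt.zero_grad()
    batch_triplet_loss(model(x + delta), labels).backward()     # outer minimization
    opt.step()

if __name__ == "__main__":
    embed_net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))
    opt = torch.optim.SGD(embed_net.parameters(), lr=0.1)
    x = torch.rand(8, 3, 32, 32)
    labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
    adversarial_dml_step(embed_net, x, labels, opt)
    print("one robust DML training step completed")
```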
Sequential Attacks on Kalman Filter-based Forward Collision Warning Systems
Ma, Yuzhe, Sharp, Jon, Wang, Ruizhe, Fernandes, Earlence, Zhu, Xiaojin
Kalman Filter (KF) is widely used in various domains to perform sequential learning or variable estimation. In the context of autonomous vehicles, KF constitutes the core component of many Advanced Driver Assistance Systems (ADAS), such as Forward Collision Warning (FCW). It tracks the states (distance, velocity, etc.) of relevant traffic objects based on sensor measurements. The tracking output of KF is often fed into downstream logic to produce alerts, which will then be used by human drivers to make driving decisions in near-collision scenarios. In this paper, we study adversarial attacks on KF as part of the more complex machine-human hybrid system of Forward Collision Warning. Our attack goal is to negatively affect human braking decisions by causing KF to output incorrect state estimations that lead to false or delayed alerts. We accomplish this by sequentially manipulating measurements fed into the KF, and propose a novel Model Predictive Control (MPC) approach to compute the optimal manipulation. Via experiments conducted in a simulated driving environment, we show that the attacker is able to successfully change FCW alert signals through planned manipulation of measurements prior to the desired target time. These results demonstrate that our attack can stealthily mislead a distracted human driver and cause vehicle collisions.
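To make the attack setting concrete, the toy sketch below runs a constant-velocity Kalman filter on distance measurements and searches for bounded measurement perturbations that inflate the final distance estimate (delaying an alert). A generic bounded optimizer stands in for the paper's MPC formulation, and all filter parameters are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

# Constant-velocity KF tracking [distance, velocity] of a lead vehicle.
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0]])      # state transition
H = np.array([[1.0, 0.0]])                 # we measure distance only
Q = 0.01 * np.eye(2)                       # process noise covariance
R = np.array([[0.5]])                      # measurement noise covariance

def kf_run(z_seq, x0=np.array([30.0, -2.0]), P0=np.eye(2)):
    """Run the KF over a measurement sequence; return the final distance estimate."""
    x, P = x0.copy(), P0.copy()
    for z in z_seq:
        x, P = A @ x, A @ P @ A.T + Q                      # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)       # Kalman gain
        x = x + K @ (np.array([z]) - H @ x)                # update with measurement z
        P = (np.eye(2) - K @ H) @ P
    return x[0]

def attack(z_clean, budget=1.0):
    """Find bounded measurement perturbations that inflate the final distance estimate."""
    obj = lambda d: -kf_run(z_clean + d)                   # maximize estimated distance
    res = minimize(obj, np.zeros_like(z_clean),
                   bounds=[(-budget, budget)] * len(z_clean))
    return res.x

if __name__ == "__main__":
    t = np.arange(20)
    z_clean = 30.0 - 2.0 * DT * t                          # lead vehicle closing in
    d = attack(z_clean)
    print(kf_run(z_clean), kf_run(z_clean + d))            # clean vs. attacked estimate
```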
Analyzing the Interpretability Robustness of Self-Explaining Models
Zheng, Haizhong, Fernandes, Earlence, Prakash, Atul
Recently, interpretable models called self-explaining models (SEMs) have been proposed with the goal of providing interpretability robustness. We evaluate the interpretability robustness of SEMs and show that explanations provided by SEMs as currently proposed are not robust to adversarial inputs. Specifically, we successfully created adversarial inputs that do not change the model outputs but cause significant changes in the explanations. We find that even though current SEMs use stable coefficients for mapping explanations to output labels, they do not consider the robustness of the first stage of the model that creates interpretable basis concepts from the input, leading to non-robust explanations. Our work makes a case for future work to examine how to generate interpretable basis concepts in a robust way.
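The sketch below illustrates the kind of attack described: perturb the input to drift the explanation while penalizing any change in the prediction. The (logits, explanation) interface and the toy self-explaining model are assumptions made for illustration, not the models evaluated in the paper.

```python
import torch
import torch.nn.functional as F

def explanation_attack(sem, x, steps=300, lr=0.01, eps=0.05, lam=10.0):
    """Search for a small input perturbation that shifts the model's explanation
    while keeping its predicted distribution (and hence its output) unchanged.

    sem(x) is assumed to return (logits, explanation); this interface is a
    placeholder for a SENN-style self-explaining model."""
    with torch.no_grad():
        logits0, expl0 = sem(x)
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits, expl = sem(x + delta)
        drift = F.mse_loss(expl, expl0)                          # how far the explanation moved
        stay = F.kl_div(F.log_softmax(logits, dim=-1),           # how far the prediction moved
                        F.softmax(logits0, dim=-1), reduction="batchmean")
        loss = -drift + lam * stay                               # maximize drift, keep the prediction fixed
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)
    return (x + delta).detach()

if __name__ == "__main__":
    # Toy SEM: linear concept encoder plus fixed concept-to-label coefficients.
    enc = torch.nn.Linear(10, 4)
    coef = torch.randn(4, 3)
    def toy_sem(inp):
        concepts = enc(inp)                                      # interpretable basis concepts
        return concepts @ coef, concepts
    x = torch.randn(1, 10)
    x_adv = explanation_attack(toy_sem, x)
    print("explanation shift:", (toy_sem(x)[1] - toy_sem(x_adv)[1]).abs().max().item())
    print("prediction:", toy_sem(x)[0].argmax().item(), "->", toy_sem(x_adv)[0].argmax().item())
```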