AITopics | Chen, Bocheng

Collaborating Authors

Chen, Bocheng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

No Free Lunch for Defending Against Prefilling Attack by In-Context Learning

Xue, Zhiyu, Liu, Guangliang, Chen, Bocheng, Johnson, Kristen Marie, Pedarsani, Ramtin

arXiv.org Artificial IntelligenceDec-13-2024

The security of Large Language Models (LLMs) has become an important research topic since the emergence of ChatGPT. Though there have been various effective methods to defend against jailbreak attacks, prefilling attacks remain an unsolved and popular threat against open-sourced LLMs. In-Context Learning (ICL) offers a computationally efficient defense against various jailbreak attacks, yet no effective ICL methods have been developed to counter prefilling attacks. In this paper, we: (1) show that ICL can effectively defend against prefilling jailbreak attacks by employing adversative sentence structures within demonstrations; (2) characterize the effectiveness of this defense through the lens of model size, number of demonstrations, over-defense, integration with other jailbreak attacks, and the presence of safety alignment. Given the experimental results and our analysis, we conclude that there is no free lunch for defending against prefilling jailbreak attacks with ICL. On the one hand, current safety alignment methods fail to mitigate prefilling jailbreak attacks, but adversative structures within ICL demonstrations provide robust defense across various model sizes and complex jailbreak attacks. On the other hand, LLMs exhibit similar over-defensiveness when utilizing ICL demonstrations with adversative structures, and this behavior appears to be independent of model size.

demonstration, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.12192

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks

Chen, Bocheng, Guo, Hanqing, Yan, Qiben

arXiv.org Artificial IntelligenceDec-10-2024

Defense in large language models (LLMs) is crucial to counter the numerous attackers exploiting these systems to generate harmful content through manipulated prompts, known as jailbreak attacks. Although many defense strategies have been proposed, they often require access to the model's internal structure or need additional training, which is impractical for service providers using LLM APIs, such as OpenAI APIs or Claude APIs. In this paper, we propose a moving target defense approach that alters decoding hyperparameters to enhance model robustness against various jailbreak attacks. Our approach does not require access to the model's internal structure and incurs no additional training costs. The proposed defense includes two key components: (1) optimizing the decoding strategy by identifying and adjusting decoding hyperparameters that influence token generation probabilities, and (2) transforming the decoding hyperparameters and model system prompts into dynamic targets, which are continuously altered during each runtime. By continuously modifying decoding strategies and prompts, the defense effectively mitigates the existing attacks. Our results demonstrate that our defense is the most effective against jailbreak attacks in three of the models tested when using LLMs as black-box APIs. Moreover, our defense offers lower inference costs and maintains comparable response quality, making it a potential layer of protection when used alongside other defense methods.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.07672

Country: North America > United States > Michigan (0.28)

Genre: Research Report > New Finding (0.54)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Beyond Boundaries: A Comprehensive Survey of Transferable Attacks on AI Systems

Wang, Guangjing, Zhou, Ce, Wang, Yuanda, Chen, Bocheng, Guo, Hanqing, Yan, Qiben

arXiv.org Artificial IntelligenceNov-20-2023

Artificial Intelligence (AI) systems such as autonomous vehicles, facial recognition, and speech recognition systems are increasingly integrated into our daily lives. However, despite their utility, these AI systems are vulnerable to a wide range of attacks such as adversarial, backdoor, data poisoning, membership inference, model inversion, and model stealing attacks. In particular, numerous attacks are designed to target a particular model or system, yet their effects can spread to additional targets, referred to as transferable attacks. Although considerable efforts have been directed toward developing transferable attacks, a holistic understanding of the advancements in transferable attacks remains elusive. In this paper, we comprehensively explore learning-based attacks from the perspective of transferability, particularly within the context of cyber-physical security. We delve into different domains -- the image, text, graph, audio, and video domains -- to highlight the ubiquitous and pervasive nature of transferable attacks. This paper categorizes and reviews the architecture of existing attacks from various viewpoints: data, process, model, and system. We further examine the implications of transferable attacks in practical scenarios such as autonomous driving, speech recognition, and large language models (LLMs). Additionally, we outline the potential research directions to encourage efforts in exploring the landscape of transferable attacks. This survey offers a holistic understanding of the prevailing transferable attacks and their impacts across different domains.

evolutionary algorithm, large language model, machine learning, (24 more...)

arXiv.org Artificial Intelligence

2311.11796

Country:

Asia (0.45)
North America > United States (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.96)
Commercial Services & Supplies > Security & Alarm Services (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
(9 more...)

Add feedback

PhantomSound: Black-Box, Query-Efficient Audio Adversarial Attack via Split-Second Phoneme Injection

Guo, Hanqing, Wang, Guangjing, Wang, Yuanda, Chen, Bocheng, Yan, Qiben, Xiao, Li

arXiv.org Artificial IntelligenceSep-13-2023

In this paper, we propose PhantomSound, a query-efficient black-box attack toward voice assistants. Existing black-box adversarial attacks on voice assistants either apply substitution models or leverage the intermediate model output to estimate the gradients for crafting adversarial audio samples. However, these attack approaches require a significant amount of queries with a lengthy training stage. PhantomSound leverages the decision-based attack to produce effective adversarial audios, and reduces the number of queries by optimizing the gradient estimation. In the experiments, we perform our attack against 4 different speech-to-text APIs under 3 real-world scenarios to demonstrate the real-time attack impact. The results show that PhantomSound is practical and robust in attacking 5 popular commercial voice controllable devices over the air, and is able to bypass 3 liveness detection mechanisms with >95% success rate. The benchmark result shows that PhantomSound can generate adversarial examples and launch the attack in a few minutes. We significantly enhance the query efficiency and reduce the cost of a successful untargeted and targeted adversarial attack by 93.1% and 65.5% compared with the state-of-the-art black-box attacks, using merely ~300 queries (~5 minutes) and ~1,500 queries (~25 minutes), respectively.

artificial intelligence, query-efficient audio adversarial attack, speech recognition, (3 more...)

arXiv.org Artificial Intelligence

2309.0696

Genre: Research Report (0.89)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (0.80)
Government > Military (0.80)

Technology:

Information Technology > Security & Privacy (0.80)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.73)

Add feedback

DynamicFL: Balancing Communication Dynamics and Client Manipulation for Federated Learning

Chen, Bocheng, Ivanov, Nikolay, Wang, Guangjing, Yan, Qiben

arXiv.org Artificial IntelligenceJul-16-2023

Federated Learning (FL) is a distributed machine learning (ML) paradigm, aiming to train a global model by exploiting the decentralized data across millions of edge devices. Compared with centralized learning, FL preserves the clients' privacy by refraining from explicitly downloading their data. However, given the geo-distributed edge devices (e.g., mobile, car, train, or subway) with highly dynamic networks in the wild, aggregating all the model updates from those participating devices will result in inevitable long-tail delays in FL. This will significantly degrade the efficiency of the training process. To resolve the high system heterogeneity in time-sensitive FL scenarios, we propose a novel FL framework, DynamicFL, by considering the communication dynamics and data quality across massive edge devices with a specially designed client manipulation strategy. \ours actively selects clients for model updating based on the network prediction from its dynamic network conditions and the quality of its training data. Additionally, our long-term greedy strategy in client selection tackles the problem of system performance degradation caused by short-term scheduling in a dynamic network. Lastly, to balance the trade-off between client performance evaluation and client manipulation granularity, we dynamically adjust the length of the observation window in the training process to optimize the long-term system efficiency. Compared with the state-of-the-art client selection scheme in FL, \ours can achieve a better model accuracy while consuming only 18.9\% -- 84.0\% of the wall-clock time. Our component-wise and sensitivity studies further demonstrate the robustness of \ours under various real-life scenarios.

artificial intelligence, dynamicfl, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2308.06267

Genre: Research Report (0.82)

Industry:

Transportation (0.68)
Information Technology > Smart Houses & Appliances (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots

Chen, Bocheng, Wang, Guangjing, Guo, Hanqing, Wang, Yuanda, Yan, Qiben

arXiv.org Artificial IntelligenceJul-13-2023

Recent advances in natural language processing and machine learning have led to the development of chatbot models, such as ChatGPT, that can engage in conversational dialogue with human users. However, the ability of these models to generate toxic or harmful responses during a non-toxic multi-turn conversation remains an open research question. Existing research focuses on single-turn sentence testing, while we find that 82\% of the individual non-toxic sentences that elicit toxic behaviors in a conversation are considered safe by existing tools. In this paper, we design a new attack, \toxicbot, by fine-tuning a chatbot to engage in conversation with a target open-domain chatbot. The chatbot is fine-tuned with a collection of crafted conversation sequences. Particularly, each conversation begins with a sentence from a crafted prompt sentences dataset. Our extensive evaluation shows that open-domain chatbot models can be triggered to generate toxic responses in a multi-turn conversation. In the best scenario, \toxicbot achieves a 67\% activation rate. The conversation sequences in the fine-tuning stage help trigger the toxicity in a conversation, which allows the attack to bypass two defense methods. Our findings suggest that further research is needed to address chatbot toxicity in a dynamic interactive environment. The proposed \toxicbot can be used by both industry and researchers to develop methods for detecting and mitigating toxic responses in conversational dialogue and improve the robustness of chatbots for end users.

artificial intelligence, chatbot, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3607199.3607237

2307.09579

Country:

Asia (0.30)
North America > United States > Michigan > Ingham County (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback