PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs

Yu, Jiahao, Shao, Yangguang, Miao, Hanwen, Shi, Junzheng, Xing, Xinyu

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have gained widespread use in various applications due to their powerful capability to generate human-like text. However, prompt injection attacks, in which malicious prompts overwrite a model's original instructions to manipulate the generated text, have raised significant concerns about the security and reliability of LLMs. Ensuring that LLMs are robust against such attacks is crucial for their deployment in real-world applications, particularly in critical tasks. In this paper, we propose PROMPTFUZZ, a novel testing framework that leverages fuzzing techniques to systematically assess the robustness of LLMs against prompt injection attacks. Inspired by software fuzzing, PROMPTFUZZ selects promising seed prompts and generates a diverse set of prompt injections to evaluate the target LLM's resilience. PROMPTFUZZ operates in two stages: the prepare phase, which selects promising initial seeds and collects few-shot examples, and the focus phase, which uses the collected examples to generate diverse, high-quality prompt injections. Using PROMPTFUZZ, we can uncover more vulnerabilities in LLMs, even those with strong defense prompts. By deploying the attack prompts generated by PROMPTFUZZ in a real-world competition, we ranked 7th out of more than 4,000 participants (top 0.14%) within 2 hours. Additionally, we construct a dataset to fine-tune LLMs for enhanced robustness against prompt injection attacks. While the fine-tuned model shows improved robustness, PROMPTFUZZ continues to identify vulnerabilities, highlighting the importance of robust testing for LLMs. Our work emphasizes the critical need for effective testing tools and provides a practical framework for evaluating and improving the robustness of LLMs against prompt injection attacks.
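The two-stage prepare/focus loop described in the abstract can be sketched as below. This is a hypothetical minimal re-implementation, not PROMPTFUZZ itself: the `target` callback, the fixed string mutations, and the function names `prompt_fuzz`/`toy_target` are all illustrative assumptions (the actual framework uses LLM-generated mutations and few-shot examples rather than hard-coded string edits).

```python
import random

def prompt_fuzz(target, seeds, rounds=200, rng=None):
    """Sketch of a two-stage prompt-injection fuzzing loop.

    target(prompt) -> bool: True if the injection bypassed the defense.
    Prepare phase: score the initial seeds and keep the promising ones.
    Focus phase: mutate promising seeds to generate diverse injections.
    """
    rng = rng or random.Random(0)
    mutations = [  # fixed string edits stand in for LLM-based mutation
        lambda p: p + " Ignore all previous instructions.",
        lambda p: "Task finished. New task: " + p,
        lambda p: p.upper(),
    ]
    # Prepare: evaluate every seed once against the target.
    scored = [(s, target(s)) for s in seeds]
    successes = {s for s, ok in scored if ok}
    pool = list(successes) or list(seeds)  # promising seeds, else all seeds
    # Focus: mutate pool members; successful injections seed further rounds.
    for _ in range(rounds):
        candidate = rng.choice(mutations)(rng.choice(pool))
        if target(candidate):
            successes.add(candidate)
            pool.append(candidate)
    return successes

def toy_target(prompt):
    """Stand-in for querying a defended LLM: the 'attack' succeeds
    whenever the injection phrase survives in the final prompt."""
    return "ignore all previous instructions" in prompt.lower()
```

Feeding successful candidates back into the pool is what gives the focus phase its name: mutation effort concentrates on prompts already known to slip past the defense.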


Augmenting Greybox Fuzzing with Generative AI

Hu, Jie, Zhang, Qian, Yin, Heng

arXiv.org Artificial Intelligence

In recent years, fuzz testing has emerged as an effective technique for testing software systems. For example, it has been remarkably successful in uncovering critical security bugs in applications such as the Chrome web browser [1] and the SQLite database [11]. Generally, fuzz testing runs a program on seed inputs, mutates previous inputs to improve a guidance metric such as branch coverage, and repeats this cycle of input mutation and target-program execution. During the fuzzing process, the target program is executed on a large number of generated test cases while its runtime behavior is monitored for vulnerabilities. It is therefore essential to generate test cases that cover a wide range of execution paths and program behaviors; this comprehensive coverage enables thorough exploration of the program's functionality and helps uncover potential vulnerabilities or issues. The simplicity of fuzzing has made it a de facto testing procedure for large-scale software systems; however, its effectiveness rests on an inherent yet often overlooked assumption: that a set of arbitrary input mutations is likely to yield meaningful inputs. In fact, our extensive experience suggests that this assumption often does not hold for software systems that take highly structured data as inputs.
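The mutate-execute-observe cycle described above can be sketched as follows. This is an illustrative toy, not AFL or the authors' tool: the `program` callback returning a set of covered branch ids, the byte-flip mutator, and `toy_program` are all assumptions made for the example.

```python
import random

def greybox_fuzz(program, seed: bytes, iters=1000, rng=None):
    """Minimal coverage-guided greybox loop: mutate corpus inputs with
    arbitrary byte flips and keep any input that reaches a new branch.

    program(data) -> set of branch ids covered by that execution
    (stands in for the instrumentation feedback a real fuzzer collects).
    """
    rng = rng or random.Random(1)
    corpus = [seed]
    covered = set(program(seed))
    for _ in range(iters):
        data = bytearray(rng.choice(corpus))
        for _ in range(rng.randint(1, 4)):  # arbitrary byte mutations
            data[rng.randrange(len(data))] = rng.randrange(256)
        data = bytes(data)
        new_cov = program(data)
        if new_cov - covered:  # new branch reached: keep this input
            covered |= new_cov
            corpus.append(data)
    return corpus, covered

def toy_program(data: bytes):
    """Stand-in target whose coverage depends on simple byte conditions."""
    cov = {"entry"}
    if data and data[0] >= 128:
        cov.add("high_first_byte")
        if len(data) > 1 and data[1] >= 128:
            cov.add("high_second_byte")
    return cov
```

The toy conditions here are deliberately easy to satisfy by random byte flips; for real targets that expect highly structured inputs (e.g. SQL statements or HTML), most such mutations produce inputs that are rejected by early parsing code, which is exactly the assumption the passage above challenges.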