ENJ: Optimizing Noise with Genetic Algorithms to Jailbreak LSMs

Yibo Zhang, Liang Lin

arXiv.org Artificial Intelligence 

These samples sound like harmless noise to humans but can induce the model to parse and execute harmful commands. Extensive experiments on multiple mainstream speech models show that ENJ's attack effectiveness is significantly superior to that of existing baseline methods. This research reveals the dual role of noise in speech security and provides critical new insights for model security defense in complex acoustic environments.

Index Terms -- Large Speech Model, Jailbreak Attack, Genetic Algorithm, Environmental Noise

1. INTRODUCTION

Driven by deep learning and large-scale data, Large-scale Speech Models (LSMs) have made remarkable progress, profoundly changing the way humans and computers interact. As these models become increasingly capable and widely deployed in voice control systems, the security risks they expose urgently need in-depth examination [1, 2]. Unlike text-based models, LSMs process information transmitted as audible audio signals, which creates a unique attack surface. Among the resulting threats, "jailbreaking" is a key one [3, 4]. In jailbreaking attacks, attackers construct specific inputs to induce the model to bypass its built-in security protection mechanisms and execute harmful instructions while keeping the output semantically understandable [5].
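The genetic-algorithm optimization named in the title can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the paper's actual method: the function name `evolve_noise`, the population settings, and the toy fitness function standing in for a real black-box attack score are all hypothetical. The sketch only shows the generic loop such an approach relies on: score a population of candidate noise vectors, select the fittest, and produce offspring via crossover and mutation.

```python
import random

def evolve_noise(fitness, length=8, pop_size=20, generations=50,
                 mutation_rate=0.1, seed=0):
    """Minimal genetic algorithm: evolve a noise vector (floats in
    [-1, 1]) that maximizes a black-box fitness score."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]       # selection: keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)     # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(length):            # per-gene Gaussian mutation
                if rng.random() < mutation_rate:
                    child[i] = min(1.0, max(-1.0,
                                            child[i] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = parents + children               # elitism: parents survive
    return max(pop, key=fitness)

# Hypothetical stand-in for an attack score: reward noise vectors that
# approach a fixed target vector (a real attack would instead query the
# victim speech model).
target = [0.5] * 8
score = lambda x: -sum((xi - ti) ** 2 for xi, ti in zip(x, target))

best = evolve_noise(score)
```

In a real black-box setting the fitness call is the expensive step (one query to the victim model per candidate), which is why population size and generation count are the main cost knobs.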
