Enhancing Reliability in LLM-Integrated Robotic Systems: A Unified Approach to Security and Safety
Zhang, Wenxiao, Kong, Xiangrui, Dewitt, Conan, Bräunl, Thomas, Hong, Jin B.
arXiv.org Artificial Intelligence
Integrating Large Language Models (LLMs) into robotic systems has revolutionised embodied artificial intelligence, enabling advanced decision-making and adaptability. However, ensuring reliability, encompassing both security against adversarial attacks and safety in complex environments, remains a critical challenge. To address this, we propose a unified framework that mitigates prompt injection attacks while enforcing operational safety through robust validation mechanisms. Our approach combines prompt assembling, state management, and safety validation, evaluated using both performance and security metrics. Experiments show a 30.8% improvement under injection attacks and up to a 325% improvement in complex environment settings under adversarial conditions compared to baseline scenarios. The framework is open-sourced with simulation and physical deployment demos at https://llmeyesim.vercel.app/.

Introduction

The integration of Large Language Models (LLMs) into embodied robotic systems represents a significant leap in robotic autonomy and adaptability [11]. Recent advances enable robots to interpret natural language instructions, fuse multimodal sensor data, and make planning decisions using the general-purpose reasoning capabilities of models like GPT-4o [22]. These capabilities promise generalist agents that can execute complex, interactive tasks without task-specific training [14]. By drawing on vast internet-scale training corpora, LLMs can produce structured action plans from ambiguous user goals, acting as high-level controllers in dynamic and unpredictable environments [12].

However, these benefits come with risks. Unlike traditional robotic architectures that rely on modular safety subsystems, such as collision avoidance, mission timeouts, and hardware constraints, LLM-based controllers can bypass these safeguards via incorrect inference or adversarial inputs.
The semantic sensitivity of LLMs to phrasing, ambiguity, or hallucinated knowledge introduces vulnerabilities not addressed by existing robotics safety protocols [6]. Moreover, integrating multimodal perception (e.g., camera, LiDAR) expands the input space but also introduces new failure modes, where partial, spoofed, or contextually misleading inputs can lead to unsafe behaviours [31].

The current literature lacks a unified methodology to secure and validate the behaviour of LLM-driven robots. Most prior work evaluates vision-language reasoning or robotic planning in isolation and does not consider how prompt injection attacks or input spoofing affect downstream physical actions. Similarly, existing LLM safety work focuses on digital assistants or text-only settings, leaving a critical gap in embodied use cases such as autonomous navigation and exploration [37, 20]. As robots begin to operate in open-world human environments, the absence of integrated security and safety layers poses real risks to both mission success and human-robot interaction.
Sep-3-2025
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
- Technology: