On The Dangers of Poisoned LLMs In Security Automation

Nov-5-2025–arXiv.org Artificial Intelligence

Abstract--Large Language Models (LLMs) are increasingly deployed in critical security applications, such as alert analysis, threat detection, threat intelligence, and incident response. Fine-tuning LLMs can improve performance, but implementing a fine-tuned model can also introduce significant security risks. This paper investigates some of the risks introduced by "LLM poisoning," the intentional or unintentional introduction of malicious or biased data during model training. We demonstrate how a seemingly improved LLM, fine-tuned on a limited dataset, can introduce significant bias, to the extent that a simple LLM-based alert investigator is completely bypassed when the prompt utilizes the introduced bias. Using fine-tuned Llama3.1 8B and Qwen3 4B models, we demonstrate how a targeted poisoning attack can bias the model to consistently dismiss true positive alerts originating from a specific user . Additionally, we propose some mitigation and best-practices to increase trustworthiness, robustness and reduce risk in applied LLMs in security applications.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Nov-5-2025

arXiv.org PDF

Add feedback

Country:
- Europe > Norway (0.15)

Genre:
- Research Report > New Finding (0.47)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning
    - Performance Analysis > Accuracy (0.69)
    - Neural Networks > Deep Learning (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found