An LLM-based Quantitative Framework for Evaluating High-Stealthy Backdoor Risks in OSS Supply Chains

Yan, Zihe, Luo, Kai, Yang, Haoyu, Yu, Yang, Zhang, Zhuosheng, Li, Guancheng

Nov-18-2025–arXiv.org Artificial Intelligence

In modern software development workflows, the open-source software supply chain contributes significantly to efficient and convenient engineering practices. As systems grow more complex, using open-source software as third-party dependencies has become common practice. However, because underlying dependencies are often poorly maintained and community auditing is insufficient, ensuring the security of source code and the legitimacy of repository maintainers has become a challenge, particularly in the face of high-stealth backdoor attacks such as the XZ Utils incident. To address these problems, we propose a fine-grained project evaluation framework for backdoor risk assessment in open-source software. The framework models highly stealthy backdoor attacks from the attacker's perspective and defines targeted metrics for each attack stage. Moreover, to overcome the limitations of static analysis in assessing the reliability of repository maintenance activities, such as irregular committer privilege escalation and insufficient review participation, we employ large language models (LLMs) to perform semantic evaluation of code repositories without relying on manually crafted patterns. We validate the framework on 66 high-priority packages in the Debian ecosystem; the experimental results reveal that the current open-source software supply chain is exposed to a range of security risks.

artificial intelligence, large language model, natural language, (17 more...)

Country:
- Asia > China (0.14)

Genre:
- Workflow (0.69)
- Research Report (0.50)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology
  - Software (1.00)
  - Security & Privacy (1.00)
  - Artificial Intelligence > Natural Language
    - Large Language Model (1.00)
