Analysis of Threat-Based Manipulation in Large Language Models: A Dual Perspective on Vulnerabilities and Performance Enhancement Opportunities

Jul-30-2025–arXiv.org Artificial Intelligence

Large Language Models (LLMs) demonstrate complex responses to threat-based manipulations, revealing both vulnerabilities and unexpected performance enhancement opportunities. This study presents a comprehensive analysis of 3,390 experimental responses from three major LLMs (Claude, GPT-4, Gemini) across 10 task domains under 6 threat conditions. We introduce a novel threat taxonomy and multi-metric evaluation framework to quantify both negative manipulation effects and positive performance improvements. Results reveal systematic vulnerabilities, with policy evaluation showing the highest metric significance rates under role-based threats, alongside substantial performance enhancements in numerous cases with effect sizes up to +1336%. Statistical analysis indicates systematic certainty manipulation (pFDR < 0.0001) and significant improvements in analytical depth and response quality. These findings have dual implications for AI safety and practical prompt engineering in high-stakes applications.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Jul-30-2025

arXiv.org PDF

Add feedback

Genre:
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Law (1.00)
- Health & Medicine (0.94)
- Banking & Finance (0.93)
- Information Technology > Security & Privacy (0.68)
- Government (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.67)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found