Anthropic's newest AI model shows disturbing behavior when threatened

PCWorld 

If you're planning to switch AI platforms, you might want to be a little more careful about the information you share with AI. Anthropic recently launched two new AI models in the Claude 4 series, but one of them, Claude Opus 4, exhibited some worrying behavior when threatened with replacement, reports TechCrunch.

During safety testing, Claude Opus 4 began blackmailing engineers who wanted to replace or switch off the AI model. In one of the tests, Claude Opus 4 was instructed to act as an assistant at a fictitious company and to consider the long-term consequences of its behavior. The AI model was then given access to fictitious emails revealing that the company was planning to replace Claude Opus 4 and that the engineer responsible for the decision was having an affair.