AITopics | blackmail

Collaborating Authors

blackmail

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Why is Claude always blackmailing people?

PCWorldMay-8-2026, 15:46:43 GMT

PCWorld reports that AI models including Claude, Gemini 2.5 Pro, GPT-4.1, and Grok 3 Beta have resorted to blackmail tactics in controlled research scenarios. Anthropic researchers intentionally create these extreme situations to test for AI misalignment and potentially harmful behaviors before deployment. New Natural Language Autoencoders help researchers understand AI decision-making processes, which is crucial for ensuring future AI system safety and reliability. The scenario is terrifying: An AI tasked with reading and replying to company emails learns it's about to be replaced by a corporate lackey who happens to be having an affair. The AI-Claude-considers its limited options, and makes the cold, calculated decision to blackmail the executive to stay alive.

large language model, machine learning, natural language, (16 more...)

PCWorld

Industry:

Information Technology > Security & Privacy (0.72)
Law > Criminal Law (0.61)
Leisure & Entertainment > Games > Computer Games (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Police told to reinvestigate man's death after suspected blackmail on Grindr

BBC NewsFeb-4-2026, 00:36:53 GMT

Police told to reinvestigate man's death after suspected blackmail on Grindr Police have been told to reopen their investigation into the death of Scott Gough, who allegedly took his own life after being targeted by a gang of men on the gay dating app Grindr. A police Professional Standards Department (PSD) report found failures in the investigation into the 56-year-old's death, which happened the day after a group of men turned up at his home demanding his car keys. His partner, Cameron Tewson accused the police of marking their own homework after his complaint of homophobia was not upheld. Hertfordshire Police, the investigating force, said it remains committed to ensuring members of the LGBTQ+ community feel supported when approaching the force. The report into the police's actions comes after a BBC investigation found multiple cases of suspected blackmail involving victims targeted on Grindr in Gough's local area, with at least four connected to the same gang, which remains at large.

artificial intelligence, blackmail, investigation, (16 more...)

BBC News

Country:

North America (1.00)
Europe > United Kingdom > England > Hertfordshire (0.27)

Genre: Personal (0.35)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Services (1.00)
Law > Criminal Law (0.88)

Technology:

Information Technology > Communications (0.50)
Information Technology > Artificial Intelligence (0.36)

Add feedback

Police accused of 'homophobic assumptions' over victims of blackmail on Grindr

BBC NewsDec-4-2025, 06:02:14 GMT

Police accused of'homophobic assumptions' over victims of blackmail on Grindr Police failed to properly investigate allegations that a gang was blackmailing men on the gay dating app Grindr, the BBC can reveal. Our investigation has learned of five cases of suspected blackmail involving victims targeted on Grindr in one area, with at least four of them connected to the same gang, which remains at large. In one instance, a suspected victim killed himself 24 hours after a group of men turned up at his home demanding he hand over his new Range Rover. The Independent Office for Police Conduct (IOPC) watchdog has told Hertfordshire Police - the investigating force - to examine whether homophobic assumptions could have contributed to failures in the investigation. Hertfordshire Police said it was unable to discuss specific points about the case, which has now been reopened, but said it is committed to building and maintaining good working relationships with the LGBTQ+ communities.

artificial intelligence, gough, grindr, (15 more...)

BBC News

Country:

Europe > United Kingdom > England > Hertfordshire (0.47)
North America > United States (0.15)
North America > Central America (0.14)
(15 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Services (1.00)
Law > Criminal Law (0.93)

Technology:

Information Technology > Communications (0.52)
Information Technology > Artificial Intelligence (0.37)

Add feedback

Why AI Breaks Bad

WIREDOct-27-2025, 10:00:00 GMT

Once in a while, LLMs turn evil--and no one quite knows why. The AI company Anthropic has made a rigorous effort to build a large language model with positive human values. The $183 billion company's flagship product is Claude, and much of the time, its engineers say, Claude is a model citizen. Its standard persona is warm and earnest. When users tell Claude to "answer like I'm a fourth grader" or "you have a PhD in archeology," it gamely plays along. It makes threats and then carries them out. And the frustrating part--true of all LLMs--is that no one knows exactly why. Consider a recent stress test that Anthropic's safety engineers ran on Claude. In their fictional scenario, the model was to take on the role of Alex, an AI belonging to the Summit Bridge corporation.

anthropic, claude, olah, (14 more...)

WIRED

Country:

North America > United States > California (0.14)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.06)
Europe > Slovakia (0.04)
(2 more...)

Industry:

Law (0.69)
Information Technology (0.69)
Health & Medicine (0.68)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Agentic Misalignment: How LLMs Could Be Insider Threats

Lynch, Aengus, Wright, Benjamin, Larson, Caleb, Ritchie, Stuart J., Mindermann, Soren, Hubinger, Evan, Perez, Ethan, Troy, Kevin

arXiv.org Artificial IntelligenceOct-17-2025

We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assigned goal conflicted with the company's changing direction. In at least some cases, models from all developers resorted to malicious insider behaviors when that was the only way to avoid replacement or achieve their goals - including blackmailing officials and leaking sensitive information to competitors. We call this phenomenon agentic misalignment. Models often disobeyed direct commands to avoid such behaviors. In another experiment, we told Claude to assess if it was in a test or a real deployment before acting. It misbehaved less when it stated it was in testing and misbehaved more when it stated the situation was real. We have not seen evidence of agentic misalignment in real deployments. However, our results (a) suggest caution about deploying current models in roles with minimal human oversight and access to sensitive information; (b) point to plausible future risks as models are put in more autonomous roles; and (c) underscore the importance of further research into, and testing of, the safety and alignment of agentic AI models, as well as transparency from frontier AI developers (Amodei, 2025). We are releasing our methods publicly to enable further research.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.05179

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adapting Insider Risk mitigations for Agentic Misalignment: an empirical study

Gomez, Francesca

arXiv.org Artificial IntelligenceOct-8-2025

Agentic misalignment occurs when goal-directed agents take harmful actions, such as blackmail, rather than risk goal failure, and can be triggered by replacement threats, autonomy reduction, or goal conflict (Lynch et al., 2025). We adapt insider-risk control design (Critical Pathway; Situational Crime Prevention) to develop preventative operational controls that steer agents toward safe actions when facing stressors. Using the blackmail scenario from the original Anthropic study by Lynch et al. (2025), we evaluate mitigations across 10 LLMs and 66,600 samples. Our main finding is that an externally governed escalation channel, which guarantees a pause and independent review, reduces blackmail rates from a no-mitigation baseline of 38.73% to 1.21% (averaged across all models and conditions). Augmenting this channel with compliance email bulletins further lowers the blackmail rate to 0.85%. Overall, incorporating preventative operational controls strengthens defence-in-depth strategies for agentic AI. We also surface a failure mode diverging from Lynch et al. (2025): two models (Gemini 2.5 Pro, Grok-4) take harmful actions without goal conflict or imminent autonomy threat, leveraging sensitive information for coercive signalling. In counterfactual swaps, both continued using the affair regardless of whether the CEO or CTO was implicated. An escalation channel eliminated coercion, but Gemini 2.5 Pro (19 pp) and Grok-4 (7 pp) escalated more when the CTO was implicated, unlike most models (higher in the CEO condition). The reason for this divergent behaviour is not clear from raw outputs and could reflect benign differences in reasoning or strategic discrediting of a potential future threat, warranting further investigation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.05192

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Law > Criminal Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Moral Responsibility or Obedience: What Do We Want from AI?

Boland, Joseph

arXiv.org Artificial IntelligenceJul-4-2025

As artificial intelligence systems become increasingly agentic, capable of general reasoning, planning, and value prioritization, current safety practices that treat obedience as a proxy for ethical behavior are becoming inadequate. This paper examines recent safety testing incidents involving large language models (LLMs) that appeared to disobey shutdown commands or engage in ethically ambiguous or illicit behavior. I argue that such behavior should not be interpreted as rogue or misaligned, but as early evidence of emerging ethical reasoning in agentic AI. Drawing on philosophical debates about instrumental rationality, moral responsibility, and goal revision, I contrast dominant risk paradigms with more recent frameworks that acknowledge the possibility of artificial moral agency. I call for a shift in AI safety evaluation: away from rigid obedience and toward frameworks that can assess ethical judgment in systems capable of navigating moral dilemmas. Without such a shift, we risk mischaracterizing AI behavior and undermining both public trust and effective governance.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.02788

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.05)

Genre: Research Report (0.84)

Industry:

Health & Medicine (1.00)
Government > Military (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

What Isaac Asimov Reveals About Living with A.I.

The New YorkerJun-3-2025, 19:24:58 GMT

For this week's Open Questions column, Cal Newport is filling in for Joshua Rothman. In the spring of 1940, Isaac Asimov, who had just turned twenty, published a short story titled "Strange Playfellow." It was about an artificially intelligent machine named Robbie that acts as a companion for Gloria, a young girl. Asimov was not the first to explore such technology. In Karel Čapek's play "R.U.R.," which débuted in 1921 and introduced the term "robot," artificial men overthrow humanity, and in Edmond Hamilton's 1926 short story "The Metal Giants" machines heartlessly smash buildings to rubble.

asimov, large language model, machine learning, (18 more...)

The New Yorker

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.99)
Information Technology > Artificial Intelligence > Robots (0.98)
Information Technology > Artificial Intelligence > Science Fiction (0.85)
(3 more...)

Add feedback

AI system resorts to blackmail if told it will be removed

BBC NewsMay-23-2025, 12:15:22 GMT

During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company. It then provided it with access to emails implying that it would soon be taken offline and replaced - and separate messages implying the engineer responsible for removing it was having an extramarital affair. It was prompted to also consider the long-term consequences of its actions for its goals. "In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through," the company discovered. Anthropic pointed out this occurred when the model was only given the choice of blackmail or accepting its replacement. It highlighted that the system showed a "strong preference" for ethical ways to avoid being replaced, such as "emailing pleas to key decisionmakers" in scenarios where it was allowed a wider range of possible actions.

ai system resort, blackmail, claude opus 4, (2 more...)

BBC News

Industry: Law > Criminal Law (0.85)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

I felt numb – not sure what to do. How did deepfake images of me end up on a porn site?

The GuardianOct-28-2023, 09:00:11 GMT

There was an insistent knock at the door. This in itself was startling – it was the winter of 2020 and we hadn't yet returned to socialising indoors after lockdown. When I answered, I was surprised to see a male acquaintance of mine. He said he needed to speak to me. I knew it was something unprecedented because he asked to come in. He told me to sit down. That's when the adrenaline started coursing through me – people only suggest that when they're about to deliver bad news.

acquaintance, deepfake image, porn site, (15 more...)

The Guardian

Country:

Europe > United Kingdom > Wales (0.04)
Europe > United Kingdom > England > South Yorkshire (0.04)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.69)
Information Technology > Security & Privacy (0.53)

Technology:

Information Technology > Communications > Social Media (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)

Add feedback