irrational
Manipulation Attacks by Misaligned AI: Risk Analysis and Safety Case Framework
Dassanayake, Rishane, Demetroudi, Mario, Walpole, James, Lentati, Lindley, Brown, Jason R., Young, Edward James
Frontier AI systems are rapidly advancing in their capabilities to persuade, deceive, and influence human behaviour, with current models already demonstrating human-level persuasion and strategic deception in specific contexts. Humans are often the weakest link in cybersecurity systems, and a misaligned AI system deployed internally within a frontier company may seek to undermine human oversight by manipulating employees. Despite this growing threat, manipulation attacks have received little attention, and no systematic framework exists for assessing and mitigating these risks. To address this, we provide a detailed explanation of why manipulation attacks are a significant threat and could lead to catastrophic outcomes. Additionally, we present a safety case framework for manipulation risk, structured around three core lines of argument: inability, control, and trustworthiness. For each argument, we specify evidence requirements, evaluation methodologies, and implementation considerations for direct application by AI companies. This paper provides the first systematic methodology for integrating manipulation risk into AI safety governance, offering AI companies a concrete foundation to assess and mitigate these threats before deployment.
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.48)
So, Can a Computer Really Be Irrational?
In a recent episode of the Mind Matters News podcast, "Can a computer be a person?", Wesley J. Smith asked: Let me ask the question in a different way. Can an AI ever be irrational? A classic example, and this happened a number of years ago, was that the Soviets during the Cold War developed a high technology to decide whether the US was being attacked by… I'm sorry, whether the Soviet Union was being attacked by the United States. And so they had these missile detectors.
- Europe > Russia (0.27)
- Asia > Russia (0.27)
- North America > United States (0.26)
Steven Pinker Wishes Everyone Else Would Stop Being So Irrational
Man is born smart, and everywhere he lacks brains. So, minus The Social Contract's gendered language, might Steven Pinker have opened Rationality: What It Is, Why It Seems Scarce, Why It Matters. Pinker, a senior Harvard professor, cognitive psychologist, bestselling author, and alleged victim of cancel culture, spends a lot of time these days fighting culture wars. Picking up where 2018's Enlightenment Now left off, his latest book takes as its problem the contrast Pinker sees between humankind's innate rationality and our observable taste for the irrational. Where his last book argued for "Enlightenment" as a source of values, however, Rationality introduces a specific set of logical and statistical tools, "benchmarks" of reasoned argument, as weapons in the fight against "rumor, folk wisdom, and conspiratorial thinking" that, Pinker thinks, poison our politics and endanger our world. Take up these tools, Rationality exhorts us, and champion "the reality mindset" against the forces of "mythology"!