Most AI chatbots easily tricked into giving dangerous responses, study finds
Hacked AI-powered chatbots threaten to make dangerous knowledge readily available by churning out illicit information the programs absorb during training, researchers say.

The warning comes amid a disturbing trend for chatbots that have been "jailbroken" to circumvent their built-in safety controls. The restrictions are supposed to prevent the programs from providing harmful, biased or inappropriate responses to users' questions.

The engines that power chatbots such as ChatGPT, Gemini and Claude – large language models (LLMs) – are fed vast amounts of material from the internet. Despite efforts to strip harmful text from the training data, LLMs can still absorb information about illegal activities such as hacking, money laundering, insider trading and bomb-making.
21 May 2025, 05:00 GMT