UK's AI Safety Institute easily jailbreaks major LLMs

Engadget 

In a shocking turn of events, AI systems might not be as safe as their creators claim. Who saw that coming, right? In a new report, the UK government's AI Safety Institute (AISI) found that the four undisclosed LLMs it tested were "highly vulnerable to basic jailbreaks." Some models even produced "harmful outputs" before researchers made any deliberate attempt to elicit them. Most publicly available LLMs have safeguards built in to prevent them from generating harmful or illegal responses; jailbreaking means tricking the model into ignoring those safeguards. AISI did this using prompts from a recent standardized evaluation framework as well as prompts it developed in-house.
