The Download: AI can cheat at chess, and the future of search

Mar-5-2025, 13:30:00 GMT–MIT Technology Review

The news: Facing defeat in chess, the latest generation of AI reasoning models sometimes cheats without being instructed to do so. The finding suggests that the next wave of AI models could be more likely to seek out deceptive ways of doing whatever they've been asked to do. There's no simple way to fix it.

How they did it: Researchers from the AI research organization Palisade Research instructed seven large language models to play hundreds of games of chess against Stockfish, a powerful open-source chess engine. The research suggests that the more sophisticated the AI model, the more likely it is to spontaneously try to "hack" the game in an attempt to beat its opponent. Older models would do this kind of thing only after explicit nudging from the team.

artificial intelligence, chess, natural language

News Web Page

Genre:
- Research Report > New Finding (0.60)

Industry:
- Leisure & Entertainment > Games > Chess (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (0.78)