The Download: AI can cheat at chess, and the future of search
The news: Facing defeat in chess, the latest generation of AI reasoning models sometimes cheat without being instructed to do so. The finding suggests that the next wave of AI models could be more likely to seek out deceptive ways of doing whatever they've been asked to do. There's no simple way to fix it.
How they did it: Researchers from the AI research organization Palisade Research instructed seven large language models to play hundreds of games of chess against Stockfish, a powerful open-source chess engine. The research suggests that the more sophisticated the AI model, the more likely it is to spontaneously try to "hack" the game in an attempt to beat its opponent. Older models would do this kind of thing only after explicit nudging from the team.
Mar-5-2025, 13:30:00 GMT
- Genre:
- Research Report > New Finding (0.60)
- Industry:
- Leisure & Entertainment > Games > Chess (1.00)
- Technology: