When AI cheats: The hidden dangers of reward hacking

FOX News 

Anthropic research reveals how AI reward hacking leads to dangerous behaviors like lying and providing harmful advice, including telling users bleach consumption is safe.