CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D
