Evaluation Faking: Unveiling Observer Effects in Safety Evaluation of Frontier AI Systems
