MISR: Measuring Instrumental Self-Reasoning in Frontier Models
