MISR: Measuring Instrumental Self-Reasoning in Frontier Models