The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness