Thinking About Thinking: Evaluating Reasoning in Post-Trained Language Models