D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models