D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models
