Deduction under Perturbed Evidence: Probing Student Simulation Capabilities of Large Language Models
