Evaluating Intermediate Reasoning of Code-Assisted Large Language Models for Mathematics
