MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
