ATG: Benchmarking Automated Theorem Generation for Generative Language Models
