ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics

Azerbayev, Zhangir, Piotrowski, Bartosz, Schoelkopf, Hailey, Ayers, Edward W., Radev, Dragomir, Avigad, Jeremy

Feb-23-2023–arXiv.org Artificial Intelligence

We introduce ProofNet, a benchmark for autoformalization and formal proving of undergraduate-level mathematics. The ProofNet benchmark consists of 371 examples, each comprising a formal theorem statement in Lean 3, a natural language theorem statement, and a natural language proof. The problems are primarily drawn from popular undergraduate pure mathematics textbooks and cover topics such as real and complex analysis, linear algebra, abstract algebra, and topology. We intend for ProofNet to be a challenging benchmark that will drive progress in autoformalization and automatic theorem proving. We report baseline results on statement autoformalization via in-context learning. Moreover, we introduce two novel statement autoformalization methods: prompt retrieval and distilled backtranslation.
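To make the three-part structure of an example concrete, the following is a hypothetical illustration only. It is not taken from the ProofNet benchmark; the theorem name `mul_self_nonneg_example` and the choice of statement are assumptions made for this sketch. It pairs a natural language statement and proof (as comments) with a Lean 3 formal statement, assuming mathlib's `data.real.basic` is available.

```lean
import data.real.basic

-- Natural language statement (hypothetical, not a benchmark item):
--   "For every real number x, the product x * x is nonnegative."
-- Natural language proof (informal):
--   "If x ≥ 0, then x * x is a product of two nonnegative numbers.
--    If x < 0, then x * x = (-x) * (-x) with -x > 0. In both cases
--    the product is nonnegative."

-- Formal theorem statement in Lean 3. The proof is left as `sorry`,
-- since statement autoformalization targets only the statement itself.
theorem mul_self_nonneg_example (x : ℝ) : 0 ≤ x * x :=
sorry
```

In this framing, statement autoformalization would be the task of producing the `theorem` line from the natural language statement, while formal proving would be the task of replacing `sorry` with a proof that Lean accepts.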

large language model, logic & formal reasoning, machine learning

Country:
- North America > United States
  - Pennsylvania > Allegheny County
    - Pittsburgh (0.04)
  - Louisiana > Orleans Parish
    - New Orleans (0.04)
- Europe > Poland
  - Masovia Province > Warsaw (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Logic & Formal Reasoning (0.69)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.70)
  - Machine Learning > Neural Networks
    - Deep Learning (0.69)
