TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning

Apr-25-2026, 19:58:50 GMT–Neural Information Processing Systems

We propose a novel approach to interactive theorem proving (ITP) using deep reinforcement learning. The proposed framework learns proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a search mechanism that enables the agent to efficiently discard (predicted) dead-end derivations and restart from promising alternatives. We implement the framework in the HOL4 theorem prover. Experimental results show that the framework using learned search strategies outperforms existing automated theorem provers (i.e.
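The MDP formulation described in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation and does not use HOL4: goals are plain strings, and the names `State`, `step`, `split_conj`, and `trivial`, as well as the reward values, are assumptions made for illustration only. The one idea taken from the abstract is that a state holds a set of derivation paths, so a step can extend any of them rather than only the most recent one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Goal = str                      # toy goals are plain strings (assumption)
Path = Tuple[Goal, ...]         # open goals left on one derivation path
Tactic = Callable[[Goal], Optional[List[Goal]]]  # subgoals, or None on failure


@dataclass(frozen=True)
class State:
    """Toy MDP state: every derivation path explored so far."""
    paths: Tuple[Path, ...]

    def is_proved(self) -> bool:
        # A path with no open goals left is a finished proof.
        return any(len(p) == 0 for p in self.paths)


def split_conj(goal: Goal) -> Optional[List[Goal]]:
    """Hypothetical tactic: split 'A & B' into the subgoals 'A' and 'B'."""
    if " & " in goal:
        left, right = goal.split(" & ", 1)
        return [left, right]
    return None


def trivial(goal: Goal) -> Optional[List[Goal]]:
    """Hypothetical tactic: close the goal 'T' with no subgoals."""
    return [] if goal == "T" else None


def step(state: State, path_idx: int, goal_idx: int,
         tactic: Tactic) -> Tuple[State, float]:
    """Apply a tactic to one goal on one chosen path.

    The action is the triple (path, goal, tactic). Older paths are kept,
    so a later step may return to any of them. Rewards are assumed values.
    """
    path = state.paths[path_idx]
    subgoals = tactic(path[goal_idx])
    if subgoals is None:
        return state, -1.0      # tactic failed: state unchanged, penalty
    # Replace the chosen goal by its subgoals to form a new path.
    new_path = path[:goal_idx] + tuple(subgoals) + path[goal_idx + 1:]
    reward = 1.0 if len(new_path) == 0 else 0.0
    return State(state.paths + (new_path,)), reward
```

Starting from `State((("T & T",),))`, applying `split_conj` to path 0 adds the path `("T", "T")`, and two applications of `trivial` on the newest path each time end in an empty path, at which point `is_proved()` returns `True`. Because earlier paths stay in the state, an agent in this sketch could abandon a path it judges unpromising simply by choosing a different `path_idx` on the next step.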

logic & formal reasoning, machine learning, reinforcement learning

Country:
- North America > United States (1.00)

Genre:
- Research Report > New Finding (0.48)
- Instructional Material > Course Syllabus & Notes (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning
    - Search (1.00)
    - Logic & Formal Reasoning (1.00)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Neural Networks > Deep Learning (0.93)
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.34)
