Combining Textual and Structural Information for Premise Selection in Lean

Job Petrovčič, David Eliecer Narvaez Denis, Ljupčo Todorovski

Dec-2-2025–arXiv.org Artificial Intelligence

Premise selection is a key bottleneck for scaling theorem proving in large formal libraries. Yet existing language-based methods often treat premises in isolation, ignoring the web of dependencies that connects them. We present a graph-augmented approach that combines dense text embeddings of Lean formalizations with graph neural networks over a heterogeneous dependency graph capturing both state-premise and premise-premise relations. On the LeanDojo Benchmark, our method outperforms the ReProver language-based baseline by over 25% across standard retrieval metrics. These results suggest that relational information is beneficial for premise selection.
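
The idea of refining text embeddings with dependency information can be illustrated with a toy sketch. This is not the architecture from the paper: it assumes hand-picked 2-D vectors in place of learned text embeddings, a single hypothetical premise-premise edge, and one round of unlearned mean-aggregation message passing in place of a trained graph neural network. The helper names (`propagate`, `rank_premises`, `recall_at_k`) and the mixing weight `alpha` are illustrative assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def propagate(emb, edges, alpha=0.5):
    """One round of mean-aggregation message passing (a stand-in for a GNN layer).

    emb:   dict mapping node name -> embedding vector
    edges: list of (src, dst) pairs; dst receives a message from src
    alpha: assumed mixing weight for the aggregated neighbor message
    """
    incoming = {node: [] for node in emb}
    for src, dst in edges:
        incoming[dst].append(emb[src])
    updated = {}
    for node, vec in emb.items():
        messages = incoming[node]
        if messages:
            mean = [sum(col) / len(messages) for col in zip(*messages)]
            vec = [x + alpha * m for x, m in zip(vec, mean)]
        updated[node] = normalize(vec)
    return updated


def rank_premises(state_vec, premise_emb):
    """Rank premises by cosine similarity to the proof-state embedding."""
    state = normalize(state_vec)
    scores = {
        name: sum(a * b for a, b in zip(state, normalize(vec)))
        for name, vec in premise_emb.items()
    }
    return sorted(scores, key=lambda name: (-scores[name], name))


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant premises that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


# Toy data: vectors are made up for illustration, not real text embeddings.
premises = {
    "add_comm": [1.0, 0.0],
    "add_assoc": [0.1, 1.0],
    "mul_comm": [0.3, 0.9],
}
# Hypothetical premise-premise dependency: add_assoc receives from add_comm.
edges = [("add_comm", "add_assoc")]
state = [1.0, 0.0]
relevant = {"add_comm", "add_assoc"}

text_rank = rank_premises(state, premises)
graph_rank = rank_premises(state, propagate(premises, edges))

print("text only :", text_rank, recall_at_k(text_rank, relevant, 2))
print("with graph:", graph_rank, recall_at_k(graph_rank, relevant, 2))
```

In this toy setting, the premise whose text vector alone looks unrelated to the proof state moves up the ranking once it absorbs information from a neighbor that is close to the state. A learned model over a heterogeneous graph with both state-premise and premise-premise edges would replace the fixed averaging step used here.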

machine learning, natural language, proof state

Country:
- Europe > Slovenia (0.30)

Genre:
- Research Report > New Finding (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (0.96)
  - Representation & Reasoning (0.89)
  - Machine Learning > Neural Networks (0.49)