Combining Textual and Structural Information for Premise Selection in Lean
Petrovčič, Job, Denis, David Eliecer Narvaez, Todorovski, Ljupčo
–arXiv.org Artificial Intelligence
Premise selection is a key bottleneck for scaling theorem proving in large formal libraries. Yet existing language-based methods often treat premises in isolation, ignoring the web of dependencies that connects them. We present a graph-augmented approach that combines dense text embeddings of Lean formalizations with graph neural networks over a heterogeneous dependency graph capturing both state-premise and premise-premise relations. On the LeanDojo Benchmark, our method outperforms the ReProver language-based baseline by over 25\% across standard retrieval metrics. These results suggest that relational information is beneficial for premise selection.
arXiv.org Artificial Intelligence
Dec-2-2025
- Country:
- Europe
- Greece (0.04)
- Slovenia > Central Slovenia
- Municipality of Ljubljana > Ljubljana (0.06)
- North America > United States
- Louisiana > Orleans Parish > New Orleans (0.04)
- Europe
- Genre:
- Research Report > New Finding (0.34)
- Technology: