 Logic & Formal Reasoning



d0c6bc641a56bebee9d985b937307367-Paper-Conference.pdf

Neural Information Processing Systems

A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show that large language models provide new prospects towards this goal.


Don't Eliminate Cut: Exponential Separations in LLM-Based Theorem Proving

arXiv.org Machine Learning

We develop a theoretical analysis of LLM-guided formal theorem proving in interactive proof assistants (e.g., Lean) by modeling tactic proposal as a stochastic policy in a finite-horizon deterministic MDP. To capture modern representation learning, we treat the state and action spaces as general compact metric spaces and assume Lipschitz policies. To explain the gap between worst-case hardness and empirical success, we introduce problem distributions generated by a reference policy $q$, including a latent-variable model in which proofs exhibit reusable cut/lemma/sketch structure represented by a proof DAG. Under a top-$k$ search protocol and Tsybakov-type margin conditions, we derive lower bounds on finite-horizon success probability that decompose into search and learning terms, with learning controlled by sequential Rademacher/covering complexity. Our main separation result shows that when cut elimination expands a DAG of depth $D$ into a cut-free tree of size $\Omega(\Lambda^D)$ while the cut-aware hierarchical process has size $O(\lambda^D)$ with $\lambda \ll \Lambda$, a flat (cut-free) learner provably requires exponentially more data than a cut-aware hierarchical learner. This provides a principled justification for subgoal decomposition in recent agentic theorem provers.
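The size gap between a proof DAG with shared lemmas and its cut-free unfolding can be illustrated with a toy count. The sketch below is not the paper's construction: the chain-shaped DAG and the names `build_chain_dag`, `dag_size`, and `tree_size` are assumptions made purely for illustration. Each lemma at level $d$ cites the lemma at level $d-1$ a fixed number of times, so the DAG stays small while the unfolded tree grows geometrically with depth.

```python
from functools import lru_cache


def build_chain_dag(depth: int, reuse: int) -> dict:
    """Hypothetical toy DAG: node d cites node d-1 `reuse` times; node 0 is a leaf."""
    return {d: ([d - 1] * reuse if d > 0 else []) for d in range(depth + 1)}


def dag_size(dag: dict) -> int:
    """Number of distinct nodes when every lemma is stored once (cuts kept)."""
    return len(dag)


def tree_size(dag: dict, root: int) -> int:
    """Number of nodes after every citation is replaced by a full copy (cuts removed)."""

    @lru_cache(maxsize=None)
    def size(node: int) -> int:
        return 1 + sum(size(child) for child in dag[node])

    return size(root)


toy_dag = build_chain_dag(depth=3, reuse=2)
shared = dag_size(toy_dag)        # one node per level
unfolded = tree_size(toy_dag, 3)  # 1 + 2 + 4 + 8 nodes
print(shared, unfolded)
```

With a reuse factor of 2 the unfolded tree has $2^{D+1}-1$ nodes against $D+1$ shared nodes, which gives a rough feel for why a learner that must reproduce the unfolded object faces a much larger target.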



c182ec594f38926b7fcb827635b9a8f4-Supplemental-Conference.pdf

Neural Information Processing Systems

Let $q(Y;\Theta)$ and $c_K(Y,X)$ be two smooth, decomposable circuits that are compatible over $Y$. Then their product can be computed as a circuit $r_{\Theta,K}(X,Y) = q(Y;\Theta)\, c_K(Y,X)$ that is decomposable over $Y$ in time $O(|q||c|)$. Let $r(X,Y)$ be a circuit that is smooth and decomposable, and deterministic over $Y$. Then, for a configuration $x$, its MAP state $\arg\max_y r(x,y)$ can be computed in time $O(|r|)$. For our experiments we use standard compilation tools to obtain a constraint circuit starting from a propositional logical formula in conjunctive normal form. We now illustrate step by step one example of such a compilation for a simple logical formula. Deterministic sum units represent disjoint solutions to the logical formula, meaning there exist distinct assignments, characterized by the children, that satisfy the logical constraint e.g.
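The MAP claim above can be sketched as a single bottom-up pass in which each sum unit is evaluated with a max instead of a sum; determinism is what makes this exact. The code below is a minimal illustration, not the authors' implementation: the `Leaf`, `Product`, and `Sum` classes, the `map_state` function, and the XOR example with weights 0.3 and 0.7 are all hypothetical, and the input $x$ is assumed to be already folded into the sum weights. Merging assignment dictionaries at every node keeps the code short but is not strictly linear; a careful implementation would record the chosen child and backtrack instead.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    var: str    # variable name
    value: int  # indicator [var == value]


@dataclass
class Product:
    children: list  # children are assumed to have disjoint scopes (decomposability)


@dataclass
class Sum:
    children: list  # children are assumed to have disjoint supports (determinism)
    weights: list


def map_state(root):
    """Return (score, assignment) of the highest-scoring state under the circuit."""
    memo = {}  # visit each node once, even if it is shared

    def visit(node):
        key = id(node)
        if key in memo:
            return memo[key]
        if isinstance(node, Leaf):
            result = (1.0, {node.var: node.value})
        elif isinstance(node, Product):
            score, assignment = 1.0, {}
            for child in node.children:
                child_score, child_assignment = visit(child)
                score *= child_score
                assignment.update(child_assignment)
            result = (score, assignment)
        else:  # Sum unit: take the best weighted child instead of summing
            candidates = [
                (weight * visit(child)[0], visit(child)[1])
                for weight, child in zip(node.weights, node.children)
            ]
            result = max(candidates, key=lambda pair: pair[0])
        memo[key] = result
        return result

    return visit(root)


# Hypothetical constraint "Y1 XOR Y2" with two disjoint solutions.
xor_circuit = Sum(
    children=[
        Product([Leaf("Y1", 1), Leaf("Y2", 0)]),
        Product([Leaf("Y1", 0), Leaf("Y2", 1)]),
    ],
    weights=[0.3, 0.7],
)
score, assignment = map_state(xor_circuit)
print(score, assignment)
```

Each child of the sum unit corresponds to one of the disjoint solutions of the constraint, so choosing the best child never discards probability mass that belongs to the same assignment.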