Hierarchical Attention Generates Better Proofs

Chen, Jianlong, Li, Chao, Yuan, Yang, Yao, Andrew C

Apr-29-2025–arXiv.org Artificial Intelligence

Large language models (LLMs) have shown promise in formal theorem proving, but their token-level processing often fails to capture the inherent hierarchical nature of mathematical proofs. We introduce Hierarchical Attention, a regularization method that aligns LLMs' attention mechanisms with mathematical reasoning structures. Our approach establishes a five-level hierarchy from foundational elements to high-level concepts, ensuring structured information flow in proof generation. Experiments demonstrate that our method improves proof success rates by 2.05% on miniF2F and 1.69% on ProofNet while reducing proof complexity by 23.81% and 16.50%, respectively. The code is available at https://github.com/Car-pe/HAGBP.

large language model, machine learning, pattern recognition

Country:
- Europe (0.67)
- Asia > China (0.67)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning > Logic & Formal Reasoning (0.95)
  - Machine Learning
    - Pattern Recognition (0.68)
    - Neural Networks > Deep Learning (0.46)
