AITopics

Country:

North America > United States > California (0.46)
Europe (0.45)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.87)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Neural Information Processing SystemsFeb-16-2026, 20:30:32 GMT

Multi-language Diversity Benefits Autoformalization

Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence.

large language model, logic & formal reasoning, machine learning, (21 more...)

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(7 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Neural Information Processing SystemsFeb-15-2026, 02:35:42 GMT

6034a661584af6c28fd97a6f23e56c0a-Paper-Conference.pdf

large language model, logic & formal reasoning, machine learning, (18 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Neural Information Processing SystemsFeb-12-2026, 01:32:26 GMT

d0c6bc641a56bebee9d985b937307367-Paper-Conference.pdf

Asuccessful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence. While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal.

logic & formal reasoning, machine learning, urlhttp, (18 more...)

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Berlin (0.04)
North America > United States > Montana (0.04)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.89)

Neural Information Processing SystemsDec-25-2025, 08:43:47 GMT

Autoformalization with Large Language Models

Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs. A successful autoformalization system could advance the fields of formal verification, program synthesis, and artificial intelligence.While the long-term goal of autoformalization seemed elusive for a long time, we show large language models provide new prospects towards this goal. We make the surprising observation that LLMs can correctly translate a significant portion ($25.3\%$) of mathematical competition problems perfectly to formal specifications in Isabelle/HOL. We demonstrate the usefulness of this process by improving a previously introduced neural theorem prover via training on these autoformalized theorems.

autoformalization, language model, name change, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

arXiv.org Artificial IntelligenceNov-18-2025

Towards Autoformalization of LLM-generated Outputs for Requirement Verification

Gupte, Mihir, S, Ramesh

Autoformalization, the process of translating informal statements into formal logic, has gained renewed interest with the emergence of powerful Large Language Models (LLMs). While LLMs show promise in generating structured outputs from natural language (NL), such as Gherkin Scenarios from NL feature requirements, there's currently no formal method to verify if these outputs are accurate. This paper takes a preliminary step toward addressing this gap by exploring the use of a simple LLM-based autoformalizer to verify LLM-generated outputs against a small set of natural language requirements. We conducted two distinct experiments. In the first one, the autoformalizer successfully identified that two differently-worded NL requirements were logically equivalent, demonstrating the pipeline's potential for consistency checks. In the second, the autoformalizer was used to identify a logical inconsistency between a given NL requirement and an LLM-generated output, highlighting its utility as a formal verification tool. Our findings, while limited, suggest that autoformalization holds significant potential for ensuring the fidelity and logical consistency of LLM-generated outputs, laying a crucial foundation for future, more extensive studies into this novel application.

large language model, machine learning, natural language, (18 more...)

2511.11829

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Zhang, Meiru, Borchert, Philipp, Gritta, Milan, Lampouras, Gerasimos

DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems

arXiv.org Artificial IntelligenceOct-21-2025

Automating the formalization of mathematical statements for theorem proving remains a major challenge for Large Language Models (LLMs). LLMs struggle to identify and utilize the prerequisite mathematical knowledge and its corresponding formal representation in languages like Lean. Current retrieval-augmented autoformalization methods query external libraries using the informal statement directly, but overlook a fundamental limitation: informal mathematical statements are often complex and offer limited context on the underlying math concepts. To address this, we introduce DRIFT, a novel framework that enables LLMs to decompose informal mathematical statements into smaller, more tractable ''sub-components''. This facilitates targeted retrieval of premises from mathematical libraries such as Mathlib. Additionally, DRIFT retrieves illustrative theorems to help models use premises more effectively in formalization tasks. We evaluate DRIFT across diverse benchmarks (ProofNet, ConNF, and MiniF2F-test) and find that it consistently improves premise retrieval, nearly doubling the F1 score compared to the DPR baseline on ProofNet. Notably, DRIFT demonstrates strong performance on the out-of-distribution ConNF benchmark, with BEq+@10 improvements of 37.14% and 42.25% using GPT-4.1 and DeepSeek-V3.1, respectively. Our analysis shows that retrieval effectiveness in mathematical autoformalization depends heavily on model-specific knowledge boundaries, highlighting the need for adaptive retrieval strategies aligned with each model's capabilities.

large language model, machine learning, natural language, (20 more...)

2510.10815

Country:

North America > United States (0.46)
Europe > Austria (0.28)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceOct-21-2025

ProofFlow: A Dependency Graph Approach to Faithful Proof Autoformalization

Cabral, Rafael, Do, Tuan Manh, Yu, Xuejun, Tai, Wai Ming, Feng, Zijin, Shen, Xin

Proof autoformalization, the task of translating natural language theorems and proofs into machine-verifiable code, is a critical step for integrating large language models into rigorous mathematical workflows. Current approaches focus on producing executable code, but they frequently fail to preserve the semantic meaning and logical structure of the original human-written argument. To address this, we introduce ProofFlow, a novel pipeline that treats structural fidelity as a primary objective. ProofFlow first constructs a directed acyclic graph (DAG) to map the logical dependencies between proof steps. Then, it employs a novel lemma-based approach to systematically formalize each step as an intermediate lemma, preserving the logical structure of the original argument. To facilitate evaluation, we present a new benchmark of 184 undergraduate-level problems, manually annotated with step-by-step solutions and logical dependency graphs, and introduce ProofScore, a new composite metric to evaluate syntactic correctness, semantic faithfulness, and structural fidelity. Experimental results show our pipeline sets a new state-of-the-art for autoformalization, achieving a ProofScore of 0.545, substantially exceeding baselines like full-proof formalization (0.123), which processes the entire proof at once, and step-proof formalization (0.072), which handles each step independently. Our pipeline, benchmark, and score metric are open-sourced to encourage further progress at https://github.com/Huawei-AI4Math/ProofFlow.

artificial intelligence, large language model, logic & formal reasoning, (16 more...)

2510.15981

Genre:

Workflow (1.00)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.61)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Zhang, Lan, Valentino, Marco, Freitas, André

MASA: LLM-Driven Multi-Agent Systems for Autoformalization

arXiv.org Artificial IntelligenceOct-13-2025

Autoformalization serves a crucial role in connecting natural language and formal reasoning. This paper presents MASA, a novel framework for building multi-agent systems for autoformalization driven by Large Language Models (LLMs). MASA leverages collaborative agents to convert natural language statements into their formal representations. The architecture of MASA is designed with a strong emphasis on modularity, flexibility, and extensibility, allowing seamless integration of new agents and tools to adapt to a fast-evolving field. We showcase the effectiveness of MASA through use cases on real-world mathematical definitions and experiments on formal mathematics datasets. This work highlights the potential of multi-agent systems powered by the interaction of LLMs and theorem provers in enhancing the efficiency and reliability of autoformalization, providing valuable insights and support for researchers and practitioners in the field.

formalization, machine learning, natural language, (18 more...)