AITopics

2510.22224

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.89)

arXiv.org Artificial Intelligence, Oct-28-2025

Bridging Perception and Reasoning: Dual-Pipeline Neuro-Symbolic Landing for UAVs in Cluttered Environments

Qian, Weixian, Schroder, Sebastian, Deng, Yao, Yao, Jiaohong, Liang, Linfeng, Cheng, Xiao, Han, Richard, Zheng, Xi

Autonomous landing in unstructured (cluttered, uneven, and map-poor) environments is a core requirement for Unmanned Aerial Vehicles (UAVs), yet purely vision-based or deep learning models often falter under covariate shift and provide limited interpretability. We propose NeuroSymLand, a neuro-symbolic framework that tightly couples two complementary pipelines: (i) an offline pipeline, where Large Language Models (LLMs) and human-in-the-loop refinement synthesize Scallop code from diverse landing scenarios, distilling generalizable and verifiable symbolic knowledge; and (ii) an online pipeline, where a compact foundation-based semantic segmentation model generates probabilistic Scallop facts that are composed into semantic scene graphs for real-time deductive reasoning. This design combines the perceptual strengths of lightweight foundation models with the interpretability and verifiability of symbolic reasoning. Node attributes (e.g., flatness, area) and edge relations (adjacency, containment, proximity) are computed with geometric routines rather than learned, avoiding the data dependence and latency of train-time graph builders. The resulting Scallop program encodes landing principles (avoid water and obstacles; prefer large, flat, accessible regions) and yields calibrated safety scores with ranked Regions of Interest (ROIs) and human-readable justifications. Extensive evaluations across datasets, diverse simulation maps, and real UAV hardware show that NeuroSymLand achieves higher accuracy, stronger robustness to covariate shift, and superior efficiency compared with state-of-the-art baselines, while advancing UAV safety and reliability in emergency response, surveillance, and delivery missions.
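As a rough illustration of how landing principles like those in the abstract could turn probabilistic perception outputs into a ranked list of regions, the following Python sketch scores candidate regions. It is not Scallop and not the authors' program: the `Region` fields, the product-style scoring rule, and the 4 m² area threshold are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate landing region (all fields are illustrative)."""
    name: str
    p_water: float     # probability the region is water, e.g. from segmentation
    p_obstacle: float  # probability the region contains an obstacle
    flatness: float    # 0..1, assumed to come from a geometric routine
    area_m2: float     # region area in square metres


def safety_score(r: Region, min_area_m2: float = 4.0) -> float:
    """Soft conjunction: not water AND not obstacle AND flat AND large enough."""
    size_ok = min(r.area_m2 / min_area_m2, 1.0)
    return (1.0 - r.p_water) * (1.0 - r.p_obstacle) * r.flatness * size_ok


def rank_regions(regions: List[Region]) -> List[Region]:
    """Return regions ordered from safest to least safe."""
    return sorted(regions, key=safety_score, reverse=True)


def explain(r: Region) -> str:
    """Produce a short human-readable justification for a score."""
    return (f"{r.name}: score={safety_score(r):.3f} "
            f"(water={r.p_water:.2f}, obstacle={r.p_obstacle:.2f}, "
            f"flatness={r.flatness:.2f}, area={r.area_m2:.1f} m^2)")


# Made-up example inputs, not data from the paper.
CANDIDATES = [
    Region("lawn", p_water=0.02, p_obstacle=0.05, flatness=0.9, area_m2=12.0),
    Region("pond", p_water=0.95, p_obstacle=0.0, flatness=1.0, area_m2=30.0),
    Region("roof", p_water=0.0, p_obstacle=0.4, flatness=0.8, area_m2=2.0),
]

if __name__ == "__main__":
    for region in rank_regions(CANDIDATES):
        print(explain(region))
```

A real probabilistic logic program would track provenance through its rules rather than multiply fixed terms; the product here is only the simplest stand-in for that idea.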

large language model, logic & formal reasoning, machine learning

2510.22204

Country:

North America > United States (0.46)
Asia (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Air (0.68)
Aerospace & Defense > Aircraft (0.66)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)

arXiv.org Artificial Intelligence, Oct-28-2025

Foundation of Intelligence: Review of Math Word Problems from Human Cognition Perspective

Huang, Zhenya, Liu, Jiayu, Lin, Xin, Ma, Zhiyuan, Xue, Shangzi, Xiao, Tong, Liu, Qi, Teh, Yee Whye, Chen, Enhong

Math word problem (MWP) solving has been a fundamental research topic in artificial intelligence (AI) since the 1960s. This research aims to advance the reasoning abilities of AI by mirroring human-like cognitive intelligence. The mainstream technological paradigm has evolved from early rule-based methods to deep learning models, and is rapidly advancing towards large language models. However, the field still lacks a systematic taxonomy of MWP research along with a discussion of current development trends. Therefore, in this paper, we aim to comprehensively review related research in MWP solving through the lens of human cognition, to demonstrate how recent AI models are advancing in simulating human cognitive abilities. Specifically, we summarize 5 crucial cognitive abilities for MWP solving: Problem Understanding, Logical Organization, Associative Memory, Critical Thinking, and Knowledge Learning. Focusing on these abilities, we review two mainstream families of MWP models from the past 10 years, neural network solvers and LLM-based solvers, and discuss the core human-like abilities they demonstrate in their intricate problem-solving process. Moreover, we rerun all the representative MWP solvers and report their performance on 5 mainstream benchmarks for a unified comparison. To the best of our knowledge, this survey is the first to comprehensively analyze the influential MWP research of the past decade from the perspective of human reasoning cognition and to provide an integrative overall comparison across existing approaches. We hope it can inspire further research in AI reasoning. Our repository is released at https://github.com/Ljyustc/FoI-MWP.

large language model, logic & formal reasoning, machine learning

2510.21999

Country:

Asia (0.67)
Europe > United Kingdom > England (0.27)

Genre:

Workflow (1.00)
Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

arXiv.org Artificial Intelligence, Oct-24-2025

CLEVER: A Curated Benchmark for Formally Verified Code Generation

Thakur, Amitayush, Lee, Jasper, Tsoukalas, George, Sistla, Meghana, Zhao, Matthew, Zetzsche, Stefan, Durrett, Greg, Yue, Yisong, Chaudhuri, Swarat

We introduce CLEVER, a high-quality, curated benchmark of 161 problems for end-to-end verified code generation in Lean. Each problem consists of (1) the task of generating a specification that matches a held-out ground-truth specification, and (2) the task of generating a Lean implementation that provably satisfies this specification. Unlike prior benchmarks, CLEVER avoids test-case supervision, LLM-generated annotations, and specifications that leak implementation logic or allow vacuous solutions. All outputs are verified post-hoc using Lean's type checker to ensure machine-checkable correctness. We use CLEVER to evaluate several few-shot and agentic approaches based on state-of-the-art language models. These methods all struggle to achieve full verification, establishing it as a challenging frontier benchmark for program synthesis and formal reasoning. Our benchmark can be found on GitHub (https://github.com/trishullab/clever) as well as HuggingFace (https://huggingface.co/datasets/amitayusht/clever). All our evaluation code is also available online (https://github.com/trishullab/clever-prover).
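To make the three ingredients named above concrete (a specification, an implementation, and a machine-checked proof that the implementation satisfies the specification), here is a toy Lean 4 sketch. It is not taken from the benchmark: `DoubleSpec`, `double`, and `double_meets_spec` are hypothetical names, and the example assumes a recent Lean 4 toolchain in which the `omega` tactic is available.

```lean
-- Hypothetical specification: the result is the input added to itself.
def DoubleSpec (impl : Nat → Nat) : Prop :=
  ∀ n, impl n = n + n

-- Candidate implementation, written differently from the specification.
def double (n : Nat) : Nat := 2 * n

-- Proof that the implementation satisfies the specification.
-- Lean's type checker accepts the theorem only if the proof is complete.
theorem double_meets_spec : DoubleSpec double := by
  unfold DoubleSpec
  intro n
  unfold double
  omega
```

The specification states the intended behaviour without reusing the implementation's expression, so the proof has real (if small) content.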

large language model, machine learning, specification

2505.13938

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.89)

Bentkamp, Alexander, Blanchette, Jasmin, Hetzenberger, Matthias, Waldmann, Uwe

Optimistic Higher-Order Superposition

arXiv.org Artificial Intelligence, Oct-22-2025

The $λ$-superposition calculus is a successful approach to proving higher-order formulas. However, some parts of the calculus are extremely explosive, notably due to the higher-order unifier enumeration and the functional extensionality axiom. In the present work, we introduce an "optimistic" version of $λ$-superposition that addresses these two issues. Specifically, our new calculus delays explosive unification problems using constraints stored along with the clauses, and it applies functional extensionality in a more targeted way. The calculus is sound and refutationally complete with respect to a Henkin semantics. We have yet to implement it in a prover, but examples suggest that it will outperform, or at least usefully complement, the original $λ$-superposition calculus.

artificial intelligence, lemma 5, logic & formal reasoning

2510.18429

Country: Europe > Austria (0.27)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Domain-Contextualized Concept Graphs: A Computable Framework for Knowledge Representation

Li, Chao, Wang, Yuru

Traditional knowledge graphs are constrained by fixed ontologies that organize concepts within rigid hierarchical structures. The root cause lies in treating domains as implicit context rather than as explicit, reasoning-level components. To overcome these limitations, we propose the Domain-Contextualized Concept Graph (CDC), a novel knowledge modeling framework that elevates domains to first-class elements of conceptual representation. CDC adopts a C-D-C triple structure, where domain specifications serve as dynamic classification dimensions defined on demand. Grounded in a cognitive-linguistic isomorphic mapping principle, CDC operationalizes how humans understand concepts through contextual frames. We formalize more than twenty standardized relation predicates (structural, logical, cross-domain, and temporal) and implement CDC in Prolog for full inference capability. Case studies in education, enterprise knowledge systems, and technical documentation demonstrate that CDC enables context-aware reasoning, cross-domain analogy, and personalized knowledge modeling, capabilities unattainable under traditional ontology-based frameworks.

artificial intelligence, expert system, logic & formal reasoning

2510.16802

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.47)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Lean Finder: Semantic Search for Mathlib That Understands User Intents

Lu, Jialin, Emond, Kye, Yang, Kaiyu, Chaudhuri, Swarat, Sun, Weiran, Chen, Wuyang

We present Lean Finder, a semantic search engine for Lean and mathlib that understands and aligns with the intents of mathematicians. We further align Lean Finder with mathematicians' preferences using [...] In addition, Lean Finder is compatible with LLM-based theorem provers, bridging retrieval with formal reasoning. Advances in Lean and mathlib (De Moura et al., 2015; Moura & Ullrich, 2021) are turning mathematical discovery into a collaborative and verifiable research workflow. Despite these advances, state-of-the-art LLMs still cannot solve math research problems. Lean's syntax, grammar, and tactics incur a steep learning curve. All experiments and data processing were conducted outside Meta. Figure 1: In the evaluation with user queries, real users preferred Lean Finder in 81.6% of cases, compared with [...] Consider the two queries below. Lean search engines handle (Gao et al., 2024a;b; Ju & Dong, 2025; Asher, 2025): Denote L/K a field extension, x, y in L are algebraic elements over K with the same minimal polynomial. I'm working with algebraic elements over a field extension and I have two elements, say x and y in L. I know x is algebraic over K, and I've shown that y is a root of the minimal polynomial of x. Does this imply that the minimal polynomials of x and y are actually equal? Target Statement 2: theorem eq_of_root {x y : L} (hx : IsAlgebraic K x) (h_ev : Polynomial.aeval y (minpoly K x) = 0) : minpoly K y = minpoly K x := -- proof omitted for brevity. This user latent (motivation, perspective, abstraction) cannot be inferred or encoded by a purely syntactic informalization. Addressing this challenge calls for Lean search engines that can understand a mathematician's intent, not merely [...] We defer a more rigorous analysis to Section 2.2, and ask our core question: [...] Our approach analyzes and clusters public discussions, then synthesizes queries that simulate user intents (Section 3.1).

large language model, logic & formal reasoning, machine learning

2510.1594

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology (1.00)
Law (0.92)
Government (0.92)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Tomkins-Flanagan, Eilene, Hanley, Connor, Kelly, Mary A.

Hey Pentti, We Did It Again!: Differentiable vector-symbolic types that prove polynomial termination

We present a typed computer language, Doug, in which all typed programs may be proved to halt in polynomial time, encoded in a vector-symbolic architecture (VSA). Doug is just an encoding of the light linear functional programming language (LLFPL) described by Schimanski (2009, ch. 7). The types of Doug are encoded using a slot-value encoding scheme based on holographic declarative memory (HDM; Kelly, 2020). The terms of Doug are encoded using a variant of the Lisp VSA defined by Flanagan (2024). Doug allows for some points on the embedding space of a neural network to be interpreted as types, where the types of nearby points are similar both in structure and content. Types in Doug are therefore learnable by a neural network. Following Chollet (2019), Card (1983), and Newell (1981), we view skill as the application of a procedure, or program of action, that causes a goal to be satisfied. Skill acquisition may therefore be expressed as program synthesis. Using Doug, we hope to describe a form of learning of skilled behaviour that follows a human-like pace of skill acquisition (i.e., substantially faster than brute force; Heathcote, 2000), exceeding the efficiency of all currently existing approaches (Kaplan, 2020; Jones, 2021; Chollet, 2024). Our approach brings us one step closer to modeling human mental representations, as they must actually exist in the brain, and those representations' acquisition, as they are actually learned.

logic & formal reasoning, machine learning, programming language

2510.16533

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report (0.50)

Industry:

Education (0.69)
Health & Medicine (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.50)

Peer, David, Stabinger, Sebastian

ATA: A Neuro-Symbolic Approach to Implement Autonomous and Trustworthy Agents

Large Language Models (LLMs) have demonstrated impressive capabilities, yet their deployment in high-stakes domains is hindered by inherent limitations in trustworthiness, including hallucinations, instability, and a lack of transparency. To address these challenges, we introduce a generic neuro-symbolic approach, which we call Autonomous Trustworthy Agents (ATA). The core of our approach lies in decoupling tasks into two distinct phases: offline knowledge ingestion and online task processing. During knowledge ingestion, an LLM translates an informal problem specification into a formal, symbolic knowledge base. This formal representation is crucial as it can be verified and refined by human experts, ensuring its correctness and alignment with domain requirements. In the subsequent task processing phase, each incoming input is encoded into the same formal language. A symbolic decision engine then utilizes this encoded input in conjunction with the formal knowledge base to derive a reliable result. Through an extensive evaluation on a complex reasoning task, we demonstrate that a concrete implementation of ATA is competitive with state-of-the-art end-to-end reasoning models in a fully automated setup while maintaining trustworthiness. Crucially, with a human-verified and corrected knowledge base, our approach significantly outperforms even larger models, while exhibiting perfect determinism, enhanced stability against input perturbations, and inherent immunity to prompt injection attacks. By generating decisions grounded in symbolic reasoning, ATA offers a practical and controllable architecture for building the next generation of transparent, auditable, and reliable autonomous agents.
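The two-phase split described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the ATA implementation: the rule format, the fact names, and the approve/reject task are invented for illustration, and the LLM-based translation steps are replaced by hand-written data so that only the symbolic decision step is shown.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (premises, conclusion): if all premises hold, the conclusion holds.
Rule = Tuple[FrozenSet[str], str]

# Offline phase output (hypothetical): a small knowledge base that a human
# expert could read, verify, and correct before deployment.
KNOWLEDGE_BASE: List[Rule] = [
    (frozenset({"amount_over_limit"}), "needs_review"),
    (frozenset({"needs_review", "customer_verified"}), "approve"),
    (frozenset({"needs_review", "customer_unverified"}), "reject"),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Apply rules until no new fact can be derived (naive forward chaining)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def decide(facts: Iterable[str]) -> str:
    """Online phase: derive a decision from already-encoded input facts."""
    derived = forward_chain(facts, KNOWLEDGE_BASE)
    for outcome in ("reject", "approve"):  # rejection takes priority
        if outcome in derived:
            return outcome
    return "no_decision"


if __name__ == "__main__":
    print(decide({"amount_over_limit", "customer_verified"}))
```

Because the decision depends only on the encoded facts and the fixed rules, the same input always yields the same output, which is the kind of determinism the abstract attributes to a symbolic decision engine.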

large language model, logic & formal reasoning, machine learning

2510.16381

Genre:

Overview (0.68)
Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

ProofFlow: A Dependency Graph Approach to Faithful Proof Autoformalization

Cabral, Rafael, Do, Tuan Manh, Yu, Xuejun, Tai, Wai Ming, Feng, Zijin, Shen, Xin

Proof autoformalization, the task of translating natural language theorems and proofs into machine-verifiable code, is a critical step for integrating large language models into rigorous mathematical workflows. Current approaches focus on producing executable code, but they frequently fail to preserve the semantic meaning and logical structure of the original human-written argument. To address this, we introduce ProofFlow, a novel pipeline that treats structural fidelity as a primary objective. ProofFlow first constructs a directed acyclic graph (DAG) to map the logical dependencies between proof steps. Then, it employs a novel lemma-based approach to systematically formalize each step as an intermediate lemma, preserving the logical structure of the original argument. To facilitate evaluation, we present a new benchmark of 184 undergraduate-level problems, manually annotated with step-by-step solutions and logical dependency graphs, and introduce ProofScore, a new composite metric to evaluate syntactic correctness, semantic faithfulness, and structural fidelity. Experimental results show our pipeline sets a new state-of-the-art for autoformalization, achieving a ProofScore of 0.545, substantially exceeding baselines like full-proof formalization (0.123), which processes the entire proof at once, and step-proof formalization (0.072), which handles each step independently. Our pipeline, benchmark, and score metric are open-sourced to encourage further progress at https://github.com/Huawei-AI4Math/ProofFlow.
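The dependency-graph idea can be illustrated with Python's standard `graphlib` module (Python 3.9+). The sketch below is an assumption-laden toy, not the ProofFlow pipeline: the step names, the example proof, and the `lemma_stub` text format are hypothetical, and no actual formalization into a proof assistant is performed.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical proof of "if n is even then n^2 is even", split into steps.
# Each step lists the earlier steps it logically depends on.
PROOF_STEPS: Dict[str, dict] = {
    "s1": {"claim": "n is even, so n = 2k", "deps": set()},
    "s2": {"claim": "n^2 = 4k^2", "deps": {"s1"}},
    "s3": {"claim": "4k^2 = 2(2k^2)", "deps": set()},
    "s4": {"claim": "n^2 is even", "deps": {"s2", "s3"}},
}


def formalization_order(steps: Dict[str, dict]) -> List[str]:
    """Order steps so every step comes after the steps it depends on.

    TopologicalSorter raises graphlib.CycleError if the graph is not a DAG.
    """
    graph = {name: info["deps"] for name, info in steps.items()}
    return list(TopologicalSorter(graph).static_order())


def lemma_stub(name: str, steps: Dict[str, dict]) -> str:
    """Render one step as a lemma whose hypotheses are its dependencies."""
    deps = sorted(steps[name]["deps"])
    hyps = ", ".join(deps) if deps else "no earlier steps"
    return f"lemma {name} (uses: {hyps}): {steps[name]['claim']}"


if __name__ == "__main__":
    for step in formalization_order(PROOF_STEPS):
        print(lemma_stub(step, PROOF_STEPS))
```

Keeping the dependencies explicit means each lemma can only use the steps the original argument relied on, which is one plausible reading of what preserving structural fidelity requires.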

artificial intelligence, large language model, logic & formal reasoning

2510.15981

Genre:

Workflow (1.00)
Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.61)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)