

A Beautiful Mind: Principles and Strategies for AI-Augmented Human Reasoning

Koon, Sean

arXiv.org Artificial Intelligence

The past century has witnessed incredible technological change. The many benefits and conveniences of technology are accompanied by new complexities and human challenges that affect work, home, social, and civic realms. There is a widening gap "between a growing complexity of our own making and a lagging development of our own capacities" (Botkin et al., 1998). Now, artificial intelligence promises to increase the rate of scientific discovery and innovation exponentially, creating new changes and potential complexities to which humans must adapt (Friedman, 2017). On the other hand, new AI tools, especially generative AI models, may help people to engage with the growing volume and complexity of information in their reasoning tasks, such as decision-making and problem solving.


RLSF: Reinforcement Learning via Symbolic Feedback

Jha, Piyush, Jana, Prithwish, Arora, Arnav, Ganesh, Vijay

arXiv.org Artificial Intelligence

In recent years, large language models (LLMs) have had a dramatic impact on various sub-fields of AI, most notably on natural language understanding tasks. However, there is widespread agreement that the logical reasoning capabilities of contemporary LLMs are, at best, fragmentary (i.e., they may work well on some problem instances but fail dramatically on others). While traditional LLM fine-tuning approaches (e.g., those that use human feedback) do address this problem to some degree, they suffer from many issues, including unsound black-box reward models, difficulties in collecting preference data, and sparse scalar reward values. To address these challenges, we propose a new training/fine-tuning paradigm we refer to as Reinforcement Learning via Symbolic Feedback (RLSF), which is aimed at enhancing the reasoning capabilities of LLMs. In the RLSF setting, the LLM that is being trained/fine-tuned is treated as the RL agent, while the environment is allowed access to reasoning or domain knowledge tools (e.g., solvers, algebra systems). Crucially, in RLSF, these reasoning tools can provide feedback to the LLMs via poly-sized certificates (e.g., proofs) that characterize errors in the LLM-generated object with respect to some correctness specification. The ability of RLSF-based training/fine-tuning to leverage certificate-generating symbolic tools enables sound, fine-grained (token-level) reward signals to LLMs, and thus addresses the limitations of the traditional reward models mentioned above. Via extensive evaluations, we show that our RLSF-based fine-tuning of LLMs outperforms traditional approaches on two different applications, namely, program synthesis from natural language pseudo-code to a programming language (C++) and solving the Game of 24.
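The certificate-to-reward idea in the abstract can be sketched in a few lines. The toy below is not the authors' implementation: it uses Python's own expression compiler as a stand-in for the paper's symbolic tools (compilers, solvers), and the error offset reported by the checker plays the role of a certificate that is mapped to per-token rewards rather than a single sparse scalar.

```python
# Toy illustration of the RLSF idea (illustrative, not the paper's code):
# a symbolic checker returns a "certificate" locating the first error in a
# generated sequence, and that certificate is converted into token-level
# rewards instead of one scalar for the whole sequence.

def symbolic_feedback(expr: str):
    """Return (ok, error_offset): 0-based offset of the first syntax error, if any."""
    try:
        compile(expr, "<generated>", "eval")
        return True, None
    except SyntaxError as err:
        return False, (err.offset or 1) - 1

def token_rewards(tokens):
    """Map the checker's certificate to a per-token reward vector."""
    expr = "".join(tokens)
    ok, offset = symbolic_feedback(expr)
    if ok:
        return [1.0] * len(tokens)  # every token contributes to a valid program
    rewards, pos = [], 0
    for tok in tokens:
        # tokens entirely before the error location get a small positive
        # reward; tokens at or after it are penalized
        rewards.append(0.1 if pos + len(tok) <= offset else -1.0)
        pos += len(tok)
    return rewards

print(token_rewards(["(", "1", "+", "2", ")"]))  # valid expression: all 1.0
print(token_rewards(["(", "1", "+", ")"]))       # invalid: penalties near the error
```

A real compiler or solver would supply much richer certificates (full diagnostics, proofs of inequivalence), but the shape of the signal is the same: localized, sound, and fine-grained.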


Solving with GeoGebra Discovery an Austrian Mathematics Olympiad problem: Lessons Learned

Ariño-Morera, Belén, Kovács, Zoltán, Recio, Tomás, Tolmos, Piedad

arXiv.org Artificial Intelligence

We address, through the automated reasoning tools in GeoGebra Discovery, a problem from a regional phase of the Austrian Mathematics Olympiad 2023. Trying to solve this problem gives rise to four different kinds of feedback: the almost instantaneous, automated solution of the proposed problem; the measure of its complexity, according to some recent proposals; the automated discovery of a generalization of the given assertion, showing that the same statement is true over more general polygons than those mentioned in the problem; and the difficulties associated with the analysis of the surprisingly high number of involved degenerate cases that appear when using the LocusEquation command in this problem. In our communication we will describe and reflect on these diverse issues, highlighting their exemplary role in showing some of the advantages, problems, and current fields of development of GeoGebra Discovery.


Showing Proofs, Assessing Difficulty with GeoGebra Discovery

Kovács, Zoltán, Recio, Tomás, Vélez, M. Pilar

arXiv.org Artificial Intelligence

See [1] for a general description and references. The goal of the current contribution is to present some ongoing work regarding two different, but related, important improvements of GeoGebra Discovery. One, to visualize the different steps that GG Discovery performs with a given geometric statement until it declares its truth (or failure). Two, to test, through different elementary examples, the suitability of an original proposal to evaluate the interest, complexity or difficulty of a given statement. Let us advance that our proposal involves the notion of syzygy of a set of polynomials. The relevance of showing details about each of the steps performed by our automated reasoning algorithms implemented in GG Discovery is quite evident. In fact, as a consequence of the result in [2], describing the formalization of the arithmetization of Euclidean plane geometry, proofs of geometric statements obtained using algebraic geometry algorithms are also valid in the realm of synthetic geometry.
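For readers unfamiliar with the term, the standard algebraic definition of a syzygy (not this paper's specific complexity measure) can be stated briefly:

```latex
Given polynomials $f_1,\dots,f_k \in R = K[x_1,\dots,x_n]$, a \emph{syzygy}
of $(f_1,\dots,f_k)$ is a $k$-tuple $(a_1,\dots,a_k) \in R^k$ of polynomial
coefficients satisfying
\[
  a_1 f_1 + a_2 f_2 + \cdots + a_k f_k = 0 .
\]
The set of all such tuples forms an $R$-module, the syzygy module of
$(f_1,\dots,f_k)$.
```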


Generating and Exploiting Automated Reasoning Proof Certificates

Communications of the ACM

Automated reasoning refers to a set of tools and techniques for automatically proving or disproving formulas in mathematical logic [35]. It has many applications in computer science--for example, questions about the existence of bugs or security vulnerabilities in hardware or software systems can often be phrased as logical formulas, or verification conditions, whose validity can then be proved or disproved using automated reasoning techniques, a process known as formal verification [15, 26]. When successful, formal verification can guarantee freedom from certain kinds of design errors, an outcome that is otherwise extremely difficult to achieve. Driven by such potential benefits, the past couple of decades have seen a dramatic improvement in the performance and capabilities of automated reasoning tools, with a corresponding explosion of use cases, including formal verification, automated test-case generation, program analysis, program synthesis, and many more [5, 37, 38]. These applications rely crucially on automated reasoning tools producing correct results.
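The key point of proof certificates is that a small, independent checker can replay a proof without trusting the (large, highly optimized) reasoning tool that produced it. A minimal sketch for propositional logic, not taken from the article: a resolution proof is a list of steps, each naming two earlier clauses and the literal they resolve on, and deriving the empty clause certifies unsatisfiability.

```python
# Minimal resolution-certificate checker (illustrative sketch).
# Clauses are sets of signed integers: 2 means x2, -2 means NOT x2.

def resolve(c1, c2, lit):
    """Resolve clauses c1 and c2 on literal lit (lit in c1, -lit in c2)."""
    assert lit in c1 and -lit in c2
    return (c1 - {lit}) | (c2 - {-lit})

def check_certificate(clauses, steps):
    """Replay each (i, j, lit) step; True iff the empty clause is derived."""
    derived = [frozenset(c) for c in clauses]
    for i, j, lit in steps:
        derived.append(frozenset(resolve(derived[i], derived[j], lit)))
    return frozenset() in derived

# {x1}, {-x1 or x2}, {-x2} is unsatisfiable; the certificate derives {}.
cnf = [{1}, {-1, 2}, {-2}]
proof = [(0, 1, 1),   # resolve {1} with {-1, 2} on x1 -> {2}  (index 3)
         (3, 2, 2)]   # resolve {2} with {-2}    on x2 -> {}
print(check_certificate(cnf, proof))  # True
```

Production formats such as DRAT certificates for SAT solvers follow the same pattern at scale: the solver emits a trace, and a simple trusted checker validates every step.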


Open Geometry Prover Community Project

Baeta, Nuno, Quaresma, Pedro

arXiv.org Artificial Intelligence

Mathematical proof is undoubtedly the cornerstone of mathematics. The emergence, in recent years, of computing and reasoning tools, in particular automated geometry theorem provers, has enriched our experience with mathematics immensely. To avoid disparate efforts, the Open Geometry Prover Community Project aims at the integration of the different efforts for the development of geometry automated theorem provers, under a common "umbrella". In this article the necessary steps towards such integration are specified and the current implementation of some of those steps is described.


Computer-supported Analysis of Positive Properties, Ultrafilters and Modal Collapse in Variants of Gödel's Ontological Argument

Benzmüller, Christoph, Fuenmayor, David

arXiv.org Artificial Intelligence

Three variants of Kurt Gödel's ontological argument, as proposed by Dana Scott, C. Anthony Anderson and Melvin Fitting, are encoded and rigorously assessed on the computer. In contrast to Scott's version of Gödel's argument, the two variants contributed by Anderson and Fitting avoid modal collapse. Although they appear quite different on a cursory reading, they are in fact closely related, as our computer-supported formal analysis (conducted in the proof assistant system Isabelle/HOL) reveals. Key to our formal analysis is the utilization of suitably adapted notions of (modal) ultrafilters, and a careful distinction between extensions and intensions of positive properties.
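As background (the standard set-theoretic definition, which the paper adapts to a modal setting):

```latex
A \emph{filter} $F$ on a set $S$ is a nonempty family of subsets of $S$
that is closed under supersets and finite intersections and does not
contain $\emptyset$.  It is an \emph{ultrafilter} if, in addition, for
every $A \subseteq S$ either $A \in F$ or $S \setminus A \in F$.
```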


Failure Handling In a Planning Framework

Karapinar, Sertac (Istanbul Technical University) | Sariel-Talay, Sanem (Istanbul Technical University)

AAAI Conferences

When an agent plans a sequence of actions, some unexpected events may occur during the execution of these actions. These unexpected events may prevent the agent from achieving its goal. In this work, our purpose is to recover from plan execution failures by reasoning about the causes of these failures. We combine the TLPlan forward-chaining temporal planner with the PROBCOG reasoning tool in order to handle failures. It is also quite important to decide whether the failure being dealt with is permanent. We propose that inferring some properties of the failure source helps us handle failures and determine the failure types.