Goto

Collaborating Authors

 Logic & Formal Reasoning


Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

arXiv.org Artificial Intelligence

We introduce and define a family of approaches to AI safety, collectively referred to as guaranteed safe (GS) AI. These Ensuring that AI systems reliably and robustly approaches aim to provide high-assurance quantitative guarantees avoid harmful or dangerous behaviours is a crucial about the safety of an AI system's behaviour through challenge, especially for AI systems with a the use of three core components -- a formal safety specification, high degree of autonomy and general intelligence, a world model, and a verifier. We will argue that this or systems used in safety-critical contexts. In strategy is both promising and underexplored, and contrast it this position paper, we will introduce and define with other ongoing efforts in AI safety. We will also outline a family of approaches to AI safety, which we several ongoing avenues of research within the broader GS will refer to as guaranteed safe (GS) AI. The core research agenda, identify some of their core difficulties, and feature of these approaches is that they aim to produce discuss approaches for overcoming these difficulties. Central AI systems which are equipped with highassurance examples of agendas which fall under the GS AI family quantitative safety guarantees. This include Szegedy (2020); Wing (2021); Seshia et al. (2022); is achieved by the interplay of three core components: Russell (2022); Tegmark & Omohundro (2023); 'davidad' a world model (which provides a mathematical Dalrymple (2024); Bengio (2024).


Geospatial Trajectory Generation via Efficient Abduction: Deployment for Independent Testing

arXiv.org Artificial Intelligence

The ability to generate artificial human movement patterns while meeting location and time constraints is an important problem in the security community, particularly as it enables the study of the analog problem of detecting such patterns while maintaining privacy. We frame this problem as an instance of abduction guided by a novel parsimony function represented as an aggregate truth value over an annotated logic program. This approach has the added benefit of affording explainability to an analyst user. By showing that any subset of such a program can provide a lower bound on this parsimony requirement, we are able to abduce movement trajectories efficiently through an informed (i.e., A*) search. We describe how our implementation was enhanced with the application of multiple techniques in order to be scaled and integrated with a cloud-based software stack that included bottom-up rule learning, geolocated knowledge graph retrieval/management, and interfaces with government systems for independently conducted government-run tests for which we provide results. We also report on our own experiments showing that we not only provide exact results but also scale to very large scenarios and provide realistic agent trajectories that can go undetected by machine learning anomaly detectors.


Verifiably Following Complex Robot Instructions with Foundation Models

arXiv.org Artificial Intelligence

Enabling mobile robots to follow complex natural language instructions is an important yet challenging problem. People want to flexibly express constraints, refer to arbitrary landmarks and verify behavior when instructing robots. Conversely, robots must disambiguate human instructions into specifications and ground instruction referents in the real world. We propose Language Instruction grounding for Motion Planning (LIMP), an approach that enables robots to verifiably follow expressive and complex open-ended instructions in real-world environments without prebuilt semantic maps. LIMP constructs a symbolic instruction representation that reveals the robot's alignment with an instructor's intended motives and affords the synthesis of robot behaviors that are correct-by-construction. We perform a large scale evaluation and demonstrate our approach on 150 instructions in five real-world environments showing the generality of our approach and the ease of deployment in novel unstructured domains. In our experiments, LIMP performs comparably with state-of-the-art LLM task planners and LLM code-writing planners on standard open vocabulary tasks and additionally achieves 79\% success rate on complex spatiotemporal instructions while LLM and Code-writing planners both achieve 38\%. See supplementary materials and demo videos at https://robotlimp.github.io


On the Equivalence between Logic Programming and SETAF

arXiv.org Artificial Intelligence

A framework with sets of attacking arguments(SETAF) is an extension of the well-known Dung's Abstract Argumentation Frameworks (AAF s) that allows joint attacks on arguments. In this paper, we provide a translation from Normal Logic Programs (NLPs) to SETAFs and vice versa, from SETAFs to NLPs. We show that there is pairwise equivalence between their semantics, including the equivalence between L-stable and semi-stable semantics. Furthermore, for a class of NLPs called Redundancy-Free Atomic Logic Programs (RFALPs), there is also a structural equivalence as these back-and-forth translations are each other's inverse. Then, we show that RFALPs are as expressive as NLPs by transforming any NLP into an equivalent RFALP through a series of program transformations already known in the literature. We also show that these program transformations are confluent, meaning that every NLP will be transformed into a unique RFALP. The results presented in this paper enhance our understanding that NLPs and SETAFs are essentially the same formalism.


Towards Automated Functional Equation Proving: A Benchmark Dataset and A Domain-Specific In-Context Agent

arXiv.org Artificial Intelligence

Automated Theorem Proving (ATP) faces significant challenges due to the vast action space and the computational demands of proof generation. Recent advances have utilized Large Language Models (LLMs) for action selection in ATP, but these methods often require substantial computational resources. This study introduces the Functional Equation Automated Solver (FEAS), an agent that builds on the COPRA in-context learning framework within the Lean environment. FEAS innovates by refining prompt generation and response parsing mechanisms, integrating domain-specific heuristics for functional equations, and introducing the FunEq dataset--a rigorously curated collection of functional equation problems categorized into three difficulty levels. The agent's performance is evaluated against established baselines using this dataset, demonstrating improvements in theorem proving accuracy, particularly with the integration of functional equation-specific heuristics. Our results highlight the effectiveness of FEAS in generating and formalizing high-level proof strategies into Lean proofs, emphasizing the potential of tailored approaches in domain-specific ATP challenges.


LOGIC-LM++: Multi-Step Refinement for Symbolic Formulations

arXiv.org Artificial Intelligence

In this paper we examine the limitations of Large Language Models (LLMs) for complex reasoning tasks. Although recent works have started to employ formal languages as an intermediate representation for reasoning tasks, they often face challenges in accurately generating and refining these formal specifications to ensure correctness. To address these issues, this paper proposes Logic-LM++, an improvement on Logic-LM . It uses the ability of LLMs to do pairwise comparisons, allowing the evaluation of the refinements suggested by the LLM. The paper demonstrates that Logic-LM++ outperforms Logic-LM and other contemporary techniques across natural language reasoning tasks on three datasets, FOLIO, ProofWriter and AR-LSAT, with an average improvement of 18.5% on standard prompting, 12.3% on chain of thought prompting and 5% on Logic-LM.


Phase-Bounded Broadcast Networks over Topologies of Communication

arXiv.org Artificial Intelligence

We study networks of processes that all execute the same finite state protocol and that communicate through broadcasts. The processes are organized in a graph (a topology) and only the neighbors of a process in this graph can receive its broadcasts. The coverability problem asks, given a protocol and a state of the protocol, whether there is a topology for the processes such that one of them (at least) reaches the given state. This problem is undecidable. We study here an under-approximation of the problem where processes alternate a bounded number of times $k$ between phases of broadcasting and phases of receiving messages. We show that, if the problem remains undecidable when $k$ is greater than 6, it becomes decidable for $k=2$, and EXPSPACE-complete for $k=1$. Furthermore, we show that if we restrict ourselves to line topologies, the problem is in $P$ for $k=1$ and $k=2$.


IID Relaxation by Logical Expressivity: A Research Agenda for Fitting Logics to Neurosymbolic Requirements

arXiv.org Artificial Intelligence

Neurosymbolic background knowledge and the expressivity required of its logic can break Machine Learning assumptions about data Independence and Identical Distribution. In this position paper we propose to analyze IID relaxation in a hierarchy of logics that fit different use case requirements. We discuss the benefits of exploiting known data dependencies and distribution constraints for Neurosymbolic use cases and argue that the expressivity required for this knowledge has implications for the design of underlying ML routines. This opens a new research agenda with general questions about Neurosymbolic background knowledge and the expressivity required of its logic.


Learning Formal Mathematics From Intrinsic Motivation

arXiv.org Artificial Intelligence

How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model. Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof search. Our agent targets generating hard but provable conjectures - a moving target, since its own theorem proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs.


AIhub monthly digest: June 2024 – network resource allocation, protein structure prediction, and a Ge'ez-Amharic-English dataset

AIHub

Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we hear about a Ge'ez-Amharic-English dataset, meet AAAI Fellow Mausam, and learn about network resource allocation. Each year the AAAI recognizes a group of individuals who have made significant, sustained contributions to the field of artificial intelligence by appointing them as Fellows. Over the course of the next few months, we'll be talking to some of the 2024 AAAI Fellows. In the first interview in the series, we met Professor Mausam and found out about his research, career path, mentorship, and why it is important to add some creative pursuits to your life.