Logic & Formal Reasoning


AIhub blogpost highlights 2024

AIHub

Over the course of the year, we've had the pleasure of working with many talented researchers from across the globe. As 2024 draws to a close, we take a look back at some of the excellent blog posts from our contributors. José González-Abad reports on work on statistical downscaling for climate models, and introduces a framework which encodes physical constraints to improve consistency and robustness. Diogo Carvalho writes about research on hierarchical reinforcement learning, work that won him and his colleagues a best paper award at ECAI 2023. Yi Chen, Ramya Korlakai Vinayak and Babak Hassibi write about crowdsourced clustering: finding clusters in a dataset with unlabelled items by querying pairs of items for similarity.
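The crowdsourced clustering setup can be sketched in a few lines: if answers to pairwise "same cluster?" queries were noiseless, the clusters would fall out of a simple union-find over the answers. The oracle below is a hypothetical stand-in for crowd responses, which in the actual work are noisy and must be aggregated.

```python
# Toy sketch of crowdsourced clustering with a noiseless pairwise oracle.
# Real crowd answers are noisy; this naive merging is only an illustration.

def cluster_by_queries(items, same_cluster):
    parent = {x: x for x in items}

    def find(x):
        # Path-halving union-find lookup.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Query every pair once and merge items the oracle calls similar.
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if same_cluster(a, b):
                union(a, b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)
    return list(groups.values())

# Example: ground-truth clusters encoded by first letter.
items = ["apple", "avocado", "banana", "blueberry", "cherry"]
clusters = cluster_by_queries(items, lambda a, b: a[0] == b[0])
```

Handling noisy answers, the interesting part of the research, requires aggregating repeated queries rather than this one-shot merging.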


In Memoriam: E. Allen Emerson

Communications of the ACM

E. Allen Emerson was the first graduate student of Edmund M. Clarke at Harvard University. After discussing several ideas for Allen's dissertation, they identified a promising candidate: verifying a finite-state system against a formal specification. According to Martha Clarke, Edmund's widow, it was during a walk across Harvard Yard that they decided to call it "model checking." Emerson received his Ph.D. in applied mathematics for this work in 1981. Twenty-six years later, he and Clarke (along with Joseph Sifakis) shared the 2007 ACM A.M. Turing Award for this and related work.


Model Checking and Verification of Synchronisation Properties of Cobot Welding

arXiv.org Artificial Intelligence

This paper describes the use of model checking to verify synchronisation properties of an industrial welding system consisting of a cobot arm and an external turntable. The robots must move synchronously, but sometimes fall out of synchronisation, producing unsatisfactory weld quality in problem areas such as corners. These mistakes are costly, since time is lost both in the robotic welding and in the manual repairs needed to improve the weld. Verification showed that the synchronisation properties are fulfilled as long as the correctness assumptions made about components outside the scope of the model hold, pointing to a limitation in the hardware. These results indicated the source of the problem and motivated a re-calibration of the real-life system, which drastically improved the welding results and demonstrates how formal methods can be useful in an industrial setting.
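As a flavour of what such a verification involves, here is a minimal explicit-state reachability check of a synchronisation invariant on a hypothetical abstraction of the two robots (integer angles, fixed step sizes per cycle). It is not the paper's model, which was built in a full model checker, but it shows how a calibration mismatch surfaces as an invariant violation.

```python
# Minimal explicit-state check of a synchronisation invariant: the cobot and
# turntable positions (abstracted to integer degrees mod 360) must never
# drift apart by more than `tol`. Step sizes stand in for calibration.

def check_sync(cobot_step, table_step, tol, horizon=360):
    state = (0, 0)
    seen = set()
    for _ in range(horizon):
        if state in seen:
            return True  # state space looped with the invariant intact
        seen.add(state)
        a, b = state
        drift = min((a - b) % 360, (b - a) % 360)
        if drift > tol:
            return False  # counterexample: robots out of sync
        state = ((a + cobot_step) % 360, (b + table_step) % 360)
    return True

ok = check_sync(cobot_step=10, table_step=10, tol=5)    # calibrated
bad = check_sync(cobot_step=10, table_step=12, tol=5)   # miscalibrated
```

A real model checker explores all interleavings and nondeterministic inputs rather than this single deterministic trace, but the verdict structure (proof or counterexample) is the same.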


Formalizing Stateful Behavior Trees

arXiv.org Artificial Intelligence

Behavior Trees (BTs) are high-level controllers that are useful in a variety of planning tasks and are gaining traction in robotic mission planning. As they gain popularity in safety-critical domains, it is important to formalize their syntax and semantics, as well as verify properties for them. In this paper, we formalize a class of BTs we call Stateful Behavior Trees (SBTs) that have auxiliary variables and operate in an environment that can change over time. SBTs have access to persistent shared memory (often known as a blackboard) that keeps track of these auxiliary variables. We demonstrate that SBTs are equivalent in computational power to Turing Machines when the blackboard can store mathematical (i.e., unbounded) integers. We further identify syntactic assumptions where SBTs have computational power equivalent to finite state automata, specifically where the auxiliary variables are of finitary types. We present a domain specific language (DSL) for writing SBTs and adapt the tool BehaVerify for use with this DSL. This new DSL in BehaVerify supports interfacing with popular BT libraries in Python, and also provides generation of Haskell code and nuXmv models, the latter of which is used for model checking temporal logic specifications for the SBTs. We include examples and scalability results where BehaVerify outperforms another verification tool by a factor of 100.
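A minimal sketch of an SBT-style tree in Python, assuming a dictionary as the blackboard and the usual SUCCESS/FAILURE statuses; the node classes and names are illustrative, not BehaVerify's DSL. The point is that the blackboard variable persists across ticks, so the tree's behaviour depends on state accumulated over time.

```python
# Toy stateful behavior tree with a shared blackboard (a plain dict here).
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Condition:
    def __init__(self, pred):
        self.pred = pred
    def tick(self, bb):
        return SUCCESS if self.pred(bb) else FAILURE

class Action:
    def __init__(self, effect):
        self.effect = effect
    def tick(self, bb):
        self.effect(bb)
        return SUCCESS

class Sequence:
    def __init__(self, *children):
        self.children = children
    def tick(self, bb):
        # Tick children left to right; stop at the first non-SUCCESS.
        for child in self.children:
            status = child.tick(bb)
            if status != SUCCESS:
                return status
        return SUCCESS

def increment(bb):
    bb["count"] += 1

# The blackboard variable `count` persists across ticks: the tree increments
# it until it reaches 3, after which the guard fails.
bb = {"count": 0}
tree = Sequence(Condition(lambda bb: bb["count"] < 3), Action(increment))
statuses = [tree.tick(bb) for _ in range(4)]
```

With an unbounded integer on the blackboard this kind of tree can simulate a counter machine, which is the intuition behind the Turing-completeness result; restricting `count` to a finitary type collapses the power to a finite automaton.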


ROSMonitoring 2.0: Extending ROS Runtime Verification to Services and Ordered Topics

arXiv.org Artificial Intelligence

Formal verification of robotic applications presents challenges due to their hybrid nature and distributed architecture. This paper introduces ROSMonitoring 2.0, an extension of ROSMonitoring designed to facilitate the monitoring of both topics and services while considering the order in which messages are published and received. The framework has been enhanced to support these novel features for ROS1 environments -- and partially for ROS2 -- offering improved real-time support, security, scalability, and interoperability. We discuss the modifications made to accommodate these advancements and present results obtained from a case study involving the runtime monitoring of specific components of a fire-fighting Uncrewed Aerial Vehicle (UAV).
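To illustrate the ordering aspect, here is a toy runtime monitor that flags out-of-order messages per topic by sequence number. The event format and topic names are assumptions for the sketch, not ROSMonitoring's actual API, which attaches to live ROS communication.

```python
# Toy runtime monitor for message ordering: each observed message carries a
# sequence number, and the monitor records any non-increasing delivery.

class OrderMonitor:
    def __init__(self):
        self.last_seq = {}     # highest sequence number seen per topic
        self.violations = []   # (topic, previous_seq, offending_seq)

    def observe(self, topic, seq):
        prev = self.last_seq.get(topic)
        if prev is not None and seq <= prev:
            self.violations.append((topic, prev, seq))
        self.last_seq[topic] = seq if prev is None else max(seq, prev)

monitor = OrderMonitor()
for topic, seq in [("/telemetry", 1), ("/telemetry", 2),
                   ("/cmd", 1), ("/telemetry", 2), ("/cmd", 2)]:
    monitor.observe(topic, seq)
# The duplicate /telemetry message with seq 2 is flagged as a violation.
```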


LTLf+ and PPLTL+: Extending LTLf and PPLTL to Infinite Traces

arXiv.org Artificial Intelligence

We introduce LTLf+ and PPLTL+, two logics to express properties of infinite traces, that are based on the linear-time temporal logics LTLf and PPLTL on finite traces. LTLf+/PPLTL+ use levels of Manna and Pnueli's LTL safety-progress hierarchy, and thus have the same expressive power as LTL. However, they also retain a crucial characteristic of the reactive synthesis problem for the base logics: the game arena for strategy extraction can be derived from deterministic finite automata (DFA). Consequently, these logics circumvent the notorious difficulties associated with determinizing infinite trace automata, typical of LTL reactive synthesis. We present DFA-based synthesis techniques for LTLf+/PPLTL+, and show that synthesis is 2EXPTIME-complete for LTLf+ (matching LTLf) and EXPTIME-complete for PPLTL+ (matching PPLTL). Notably, while PPLTL+ retains the full expressive power of LTL, reactive synthesis is EXPTIME-complete instead of 2EXPTIME-complete. The techniques are also adapted to optimally solve satisfiability, validity, and model-checking, to get EXPSPACE-complete for LTLf+ (extending a recent result for the guarantee level using LTLf), and PSPACE-complete for PPLTL+.
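Schematically, an LTLf+ property over infinite traces combines hierarchy levels whose arguments are finite-trace formulas; the notation below is an illustrative sketch of that shape, not the paper's exact syntax.

```latex
% Illustrative shape of an LTLf+ property: a boolean combination of
% safety-progress hierarchy levels applied to LTLf formulas \varphi_i.
% (Sketched notation; see the paper for the precise operators.)
\[
  \underbrace{\text{``every finite prefix satisfies } \varphi_1\text{''}}_{\text{safety level}}
  \;\wedge\;
  \underbrace{\text{``some finite prefix satisfies } \varphi_2\text{''}}_{\text{guarantee level}}
\]
```

Because each level is evaluated via prefixes, the synthesis game can be built from DFAs for the $\varphi_i$, sidestepping determinization of infinite-trace automata.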


ION-C: Integration of Overlapping Networks via Constraints

arXiv.org Machine Learning

In many causal learning problems, the variables of interest are not all measured over the same observations, but are instead distributed across multiple datasets with overlapping variables. Tillman et al. (2008) presented ION, the first algorithm for enumerating the minimal equivalence class of ground-truth DAGs consistent with all input graphs by exploiting local independence relations. In this paper, this problem is formulated as a more computationally efficient answer set programming (ASP) problem, which we call ION-C, and solved with the ASP system clingo. The ION-C algorithm was run on random synthetic graphs with varying sizes, densities, and degrees of overlap between subgraphs; overlap had the largest impact on runtime, the number of solution graphs, and agreement within the output set. To validate ION-C on real-world data, we ran the algorithm on overlapping graphs learned from two successive iterations of the European Social Survey (ESS), using a procedure for conducting joint independence tests to prevent inconsistencies in the input.
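To make the integration problem concrete, the toy sketch below brute-forces DAGs over the union of variables whose induced subgraphs match two overlapping inputs. Plain induced-subgraph matching is a drastic simplification of ION's marginal-consistency criterion, and ION-C replaces this exhaustive enumeration with an ASP encoding, but the "enumerate all consistent ground-truth candidates" shape is the same.

```python
from itertools import combinations, product

# Toy integration of overlapping graphs: enumerate DAGs over the union of
# variables whose induced subgraphs equal each input graph. (Simplified
# stand-in for ION's marginal consistency over latent projections.)

def induced(edges, vars_):
    return {(u, v) for (u, v) in edges if u in vars_ and v in vars_}

def is_acyclic(nodes, edges):
    # Kahn's algorithm: a graph is acyclic iff all nodes can be peeled off.
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    queue = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while queue:
        n = queue.pop()
        seen += 1
        for (u, v) in edges:
            if u == n:
                indeg[v] -= 1
                if indeg[v] == 0:
                    queue.append(v)
    return seen == len(nodes)

def integrate(nodes, inputs):
    pairs = list(combinations(sorted(nodes), 2))
    solutions = []
    # Each unordered pair of variables is absent, forward, or backward.
    for choice in product([None, "fwd", "bwd"], repeat=len(pairs)):
        edges = set()
        for (a, b), c in zip(pairs, choice):
            if c == "fwd":
                edges.add((a, b))
            elif c == "bwd":
                edges.add((b, a))
        if not is_acyclic(nodes, edges):
            continue
        if all(induced(edges, vs) == es for vs, es in inputs):
            solutions.append(edges)
    return solutions

g1 = (frozenset({"A", "B"}), {("A", "B")})   # dataset 1 measures A, B
g2 = (frozenset({"B", "C"}), {("B", "C")})   # dataset 2 measures B, C
sols = integrate({"A", "B", "C"}, [g1, g2])
# Two candidates survive: A->B->C with and without an extra A->C edge
# (C->A is ruled out by acyclicity).
```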


Bridging the Gap: Representation Spaces in Neuro-Symbolic AI

arXiv.org Artificial Intelligence

Although the cooperation between these two paradigms seems natural, the difference in their representations is far from negligible. Prof. Henry Kautz proposed a taxonomy of neuro-symbolic systems at AAAI 2020, and many researchers have since reviewed recent neuro-symbolic AI from different perspectives. As Fig. 1 shows, Acharya et al. [1] proposed a new classification method, which classifies and discusses existing applications of neuro-symbolic AI by the role of the neural and symbolic parts: learning for reasoning, reasoning for learning, and learning-reasoning. Garcez et al. [73] proposed a taxonomy that includes sequential, nested, cooperative, and compiled neuro-symbolic AI, based on the six types introduced by Henry Kautz. In addition, some reviews focus on cross-field integration and applications. For example, Berlot-Attwell [27] reviewed neuro-symbolic VQA (visual question answering) from the perspective of AGI (artificial general intelligence) desiderata, and Marra [128] conducted a comprehensive review of the integration of neuro-symbolic and statistical relational artificial intelligence along seven dimensions.


Assured Automatic Programming via Large Language Models

arXiv.org Artificial Intelligence

With the advent of AI-based coding engines, it is possible to convert natural language requirements to executable code in standard programming languages. However, AI-generated code can be unreliable, and the natural language requirements driving this code may be ambiguous. In other words, the intent may not be accurately captured in the code generated from AI-coding engines like Copilot. The goal of our work is to discover the programmer intent, while generating code which conforms to the intent and a proof of this conformance. Our approach to intent discovery is powered by a novel repair engine called program-proof co-evolution, where the object of repair is a tuple (code, logical specification, test) generated by an LLM from the same natural language description. The program and the specification capture the initial operational and declarative description of intent, while the test represents a concrete, albeit partial, understanding of the intent. Our objective is to achieve consistency between the program, the specification, and the test by incrementally refining our understanding of the user intent. Reaching consistency through this repair process provides us with a formal, logical description of the intent, which is then translated back into natural language for the developer's inspection. The resultant intent description is now unambiguous, though expressed in natural language. We demonstrate how the unambiguous intent discovered through our approach increases the percentage of verifiable auto-generated programs on a recently proposed dataset in the Dafny programming language.


Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically

arXiv.org Artificial Intelligence

Mathematical theorem proving is an important testbed for large language models' deep and abstract reasoning capability. This paper focuses on improving LLMs' ability to write proofs in formal languages that permit automated proof verification/evaluation. Most previous results provide human-written lemmas to the theorem prover, an arguably oversimplified setting that does not sufficiently test the prover's planning and decomposition capabilities. Instead, we work in a more natural setup where the lemmas that are directly relevant to the theorem are not given to the theorem prover at test time. We design an RL-based training algorithm that encourages the model to decompose a theorem into lemmas, prove the lemmas, and then prove the theorem by using the lemmas. Our reward mechanism is inspired by how mathematicians train themselves: even if a theorem is too challenging for the current model to prove, a positive reward is still given for any correct and novel lemmas proposed and proved along the way. During training, our model proposes and proves lemmas that are not in the training dataset; in fact, these newly proposed correct lemmas constitute 37.7% of the training replay buffer when we train on the dataset extracted from the Archive of Formal Proofs (AFP). The model trained by our RL algorithm outperforms one trained by supervised finetuning, improving the pass rate from 40.8% to 45.5% on the AFP test set, and from 36.5% to 39.5% on an out-of-distribution test set.
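The reward idea can be sketched as a small function: the model earns full credit for proving the theorem, plus partial credit for each novel lemma it proposes and proves, even when the theorem itself remains open. The weights and the proof-checker stub are illustrative assumptions, not the paper's actual values.

```python
# Sketch of the reward shaping: partial credit for novel proved lemmas,
# so failed episodes still produce useful learning signal.

def episode_reward(theorem_proved, proposed_lemmas, proved, known_lemmas,
                   theorem_weight=1.0, lemma_weight=0.1):
    reward = theorem_weight if theorem_proved else 0.0
    for lemma in proposed_lemmas:
        # Only correct lemmas that are new to the corpus earn credit.
        if proved(lemma) and lemma not in known_lemmas:
            reward += lemma_weight
    return reward

known = {"add_comm"}                                  # already in the corpus
proved_set = {"add_comm", "add_assoc", "mul_comm"}    # stub proof checker
r = episode_reward(
    theorem_proved=False,
    proposed_lemmas=["add_comm", "add_assoc", "mul_comm", "bogus"],
    proved=lambda lemma: lemma in proved_set,
    known_lemmas=known,
)
# Two novel proved lemmas yield partial reward despite the failed theorem.
```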