Logic & Formal Reasoning
Proof Recommendation System for the HOL4 Theorem Prover
Dekhil, Nour, Rashid, Adnan, Tahar, Sofiene
We experimented with various transformer-based language models, such as BERT [9], RoBERTa [10], and T5 [11] for these datasets to identify the most effective model based on our evaluation. After splitting the restructured datasets into a 90-10 ratio for training and testing, we proceeded to train the selected models (block highlighted in yellow) using a grid search of hyperparameters optimization. Given the multitude of possible tactics available at each proof state, we chose to provide multiple recommendations for the next proof step. To assess the accuracy of these recommendations (block highlighted in green), we use the n-correctness rate, which measures the likelihood that a correct tactic from the testing dataset is among the top-n recommended tactics, where n signifies the number of recommended tactics evaluated against the correct tactic. We found out that RoBERTa demonstrated superior performance across most cases for n = 7.
A Hybrid Deep Learning and Model-Checking Framework for Accurate Brain Tumor Detection and Validation
Fatimi, Lahcen El, Elfatimi, Elhoucine, Bouchaneb, Hanifa
Model checking is an automatic technique for verifying the correctness properties of safety-critical reactive systems. This method has been successfully applied to find subtle errors in complex systems. Model checking techniques have a wide range of application domains, among which large-scale distributed systems [1-3], signal [4], and medical images analysis [5-8]. The research related to the last topic is still ongoing looking for the perfect (precise, complete, simple) approach for analyzing medical images. The use of model checking is relatively recent, in particular regarding the verification of the analysis of medical images. In this domain, model checking in medical images has shown to be a promising application that can significantly facilitate the work of professionals. What motivates us in this study, considering that model checking is increasingly used in testing to check whether a system model satisfies a property, is to take model checking in its usual role to take on more advanced roles in medical image analysis by applying model-checking logic to medical images and detection of tumors in addition to validation of properties through tests or case studies.
AIhub blogpost highlights 2024
Over the course of the year, we've had the pleasure of working with many talented researchers from across the globe. As 2024 draws to a close, we take a look back at some of the excellent blog posts from our contributors. Jose González-Abad reports on work on statistical downscaling for climate models, and introduces a framework which encodes physical constraints to improve consistency and robustness. Diogo Carvalho writes about research on hierarchical reinforcement learning, work that won him and his colleagues a best paper award at ECAI 2023. Yi Chen, Ramya Korlakai Vinayak and Babak Hassibi write about crowdsourced clustering: finding clusters in a dataset with unlabelled items by querying pairs of items for similarity.
Probabilistic Mission Design in Neuro-Symbolic Systems
Kohaut, Simon, Flade, Benedict, Ochs, Daniel, Dhami, Devendra Singh, Eggert, Julian, Kersting, Kristian
Advanced Air Mobility (AAM) is a growing field that demands accurate modeling of legal concepts and restrictions in navigating intelligent vehicles. In addition, any implementation of AAM needs to face the challenges posed by inherently dynamic and uncertain human-inhabited spaces robustly. Nevertheless, the employment of Unmanned Aircraft Systems (UAS) beyond visual line of sight (BVLOS) is an endearing task that promises to enhance significantly today's logistics and emergency response capabilities. To tackle these challenges, we present a probabilistic and neuro-symbolic architecture to encode legal frameworks and expert knowledge over uncertain spatial relations and noisy perception in an interpretable and adaptable fashion. More specifically, we demonstrate Probabilistic Mission Design (ProMis), a system architecture that links geospatial and sensory data with declarative, Hybrid Probabilistic Logic Programs (HPLP) to reason over the agent's state space and its legality. As a result, ProMis generates Probabilistic Mission Landscapes (PML), which quantify the agent's belief that a set of mission conditions is satisfied across its navigation space. Extending prior work on ProMis' reasoning capabilities and computational characteristics, we show its integration with potent machine learning models such as Large Language Models (LLM) and Transformer-based vision models. Hence, our experiments underpin the application of ProMis with multi-modal input data and how our method applies to many important AAM scenarios.
StaR Maps: Unveiling Uncertainty in Geospatial Relations
Kohaut, Simon, Flade, Benedict, Eggert, Julian, Dhami, Devendra Singh, Kersting, Kristian
The growing complexity of intelligent transportation systems and their applications in public spaces has increased the demand for expressive and versatile knowledge representation. While various mapping efforts have achieved widespread coverage, including detailed annotation of features with semantic labels, it is essential to understand their inherent uncertainties, which are commonly underrepresented by the respective geographic information systems. Hence, it is critical to develop a representation that combines a statistical, probabilistic perspective with the relational nature of geospatial data. Further, such a representation should facilitate an honest view of the data's accuracy and provide an environment for high-level reasoning to obtain novel insights from task-dependent queries. Our work addresses this gap in two ways. First, we present Statistical Relational Maps (StaR Maps) as a representation of uncertain, semantic map data. Second, we demonstrate efficient computation of StaR Maps to scale the approach to wide urban spaces. Through experiments on real-world, crowd-sourced data, we underpin the application and utility of StaR Maps in terms of representing uncertain knowledge and reasoning for complex geospatial information.
Mathematics and Machine Creativity: A Survey on Bridging Mathematics with AI
Liang, Shizhe, Zhang, Wei, Zhong, Tianyang, Liu, Tianming
This paper presents a comprehensive overview on the applications of artificial intelligence (AI) in mathematical research, highlighting the transformative role AI has begun to play in this domain. Traditionally, AI advancements have heavily relied on theoretical foundations provided by mathematics and statistics. However, recent developments in AI, particularly in reinforcement learning (RL) and large language models (LLMs), have demonstrated the potential for AI to contribute back to mathematics by offering flexible algorithmic frameworks and powerful inductive reasoning capabilities that support various aspects of mathematical research. This survey aims to establish a bridge between AI and mathematics, providing insights into the mutual benefits and fostering deeper interdisciplinary understanding. In particular, we argue that while current AI and LLMs may struggle with complex deductive reasoning, their "inherent creativity", the ability to generate outputs at high throughput based on recognition of shallow patterns, holds significant potential to support and inspire mathematical research. This creative capability, often overlooked, could be the key to unlocking new perspectives and methodologies in mathematics. Furthermore, we address the lack of cross-disciplinary communication: mathematicians may not fully comprehend the latest advances in AI, while AI researchers frequently prioritize benchmark performance over real-world applications in frontier mathematical research. This paper seeks to close that gap, offering a detailed exploration of AI fundamentals, its strengths, and its emerging applications in the mathematical sciences.
Hybrid Many-Objective Optimization in Probabilistic Mission Design for Compliant and Effective UAV Routing
Kohaut, Simon, Hohmann, Nikolas, Brulin, Sebastian, Flade, Benedict, Eggert, Julian, Olhofer, Markus, Adamy, Jürgen, Dhami, Devendra Singh, Kersting, Kristian
Advanced Aerial Mobility encompasses many outstanding applications that promise to revolutionize modern logistics and pave the way for various public services and industry uses. However, throughout its history, the development of such systems has been impeded by the complexity of legal restrictions and physical constraints. While airspaces are often tightly shaped by various legal requirements, Unmanned Aerial Vehicles (UAV) must simultaneously consider, among others, energy demands, signal quality, and noise pollution. In this work, we address this challenge by presenting a novel architecture that integrates methods of Probabilistic Mission Design (ProMis) and Many-Objective Optimization for UAV routing. Hereby, our framework is able to comply with legal requirements under uncertainty while producing effective paths that minimize various physical costs a UAV needs to consider when traversing human-inhabited spaces. To this end, we combine hybrid probabilistic first-order logic for spatial reasoning with mixed deterministic-stochastic route optimization, incorporating physical objectives such as energy consumption and radio interference with a logical, probabilistic model of legal requirements. We demonstrate the versatility and advantages of our system in a large-scale empirical evaluation over real-world, crowd-sourced data from a map extract from the city of Paris, France, showing how a network of effective and compliant paths can be formed.
Exploring Flexible Scenario Generation in Godot Simulator
Cyber-physical systems (CPS) combine cyber and physical components engineered to make decisions and interact within dynamic environments. Ensuring the safety of CPS is of great importance, requiring extensive testing across diverse and complex scenarios. To generate as many testing scenarios as possible, previous efforts have focused on describing scenarios using formal languages to generate scenes. In this paper, we introduce an alternative approach: reconstructing scenes inside the open-source game engine, Godot. We have developed a pipeline that enables the reconstruction of testing scenes directly from provided images of scenarios. These reconstructed scenes can then be deployed within simulated environments to assess a CPS. This approach offers a scalable and flexible solution for testing CPS in realistic environments.
ASP-based Multi-shot Reasoning via DLV2 with Incremental Grounding
Calimeri, Francesco, Ianni, Giovambattista, Pacenza, Francesco, Perri, Simona, Zangari, Jessica
DLV2 is an AI tool for Knowledge Representation and Reasoning which supports Answer Set Programming (ASP) - a logic-based declarative formalism, successfully used in both academic and industrial applications. Given a logic program modelling a computational problem, an execution of DLV2 produces the so-called answer sets that correspond one-to-one to the solutions to the problem at hand. The computational process of DLV2 relies on the typical Ground & Solve approach where the grounding step transforms the input program into a new, equivalent ground program, and the subsequent solving step applies propositional algorithms to search for the answer sets. Recently, emerging applications in contexts such as stream reasoning and event processing created a demand for multi-shot reasoning: here, the system is expected to be reactive while repeatedly executed over rapidly changing data. In this work, we present a new incremental reasoner obtained from the evolution of DLV2 towards iterated reasoning. Rather than restarting the computation from scratch, the system remains alive across repeated shots, and it incrementally handles the internal grounding process. At each shot, the system reuses previous computations for building and maintaining a large, more general ground program, from which a smaller yet equivalent portion is determined and used for computing answer sets. Notably, the incremental process is performed in a completely transparent fashion for the user. We describe the system, its usage, its applicability and performance in some practically relevant domains. Under consideration in Theory and Practice of Logic Programming (TPLP).
Understanding the Logic of Direct Preference Alignment through Logic
Richardson, Kyle, Srikumar, Vivek, Sabharwal, Ashish
Recent direct preference alignment algorithms (DPA), such as DPO, have shown great promise in aligning large language models to human preferences. While this has motivated the development of many new variants of the original DPO loss, understanding the differences between these recent proposals, as well as developing new DPA loss functions, remains difficult given the lack of a technical and conceptual framework for reasoning about the underlying semantics of these algorithms. In this paper, we attempt to remedy this by formalizing DPA losses in terms of discrete reasoning problems. Specifically, we ask: Given an existing DPA loss, can we systematically derive a symbolic expression that characterizes its semantics? How do the semantics of two losses relate to each other? We propose a novel formalism for characterizing preference losses for single model and reference model based approaches, and identify symbolic forms for a number of commonly used DPA variants. Further, we show how this formal view of preference learning sheds new light on both the size and structure of the DPA loss landscape, making it possible to not only rigorously characterize the relationships between recent loss proposals but also to systematically explore the landscape and derive new loss functions from first principles. We hope our framework and findings will help provide useful guidance to those working on human AI alignment.