Goto

Collaborating Authors

 Logic & Formal Reasoning


Genesis: Towards the Automation of Systems Biology Research

arXiv.org Artificial Intelligence

The cutting edge of applying AI to science is the closed-loop automation of scientific research: robot scientists. We have previously developed two robot scientists: `Adam' (for yeast functional biology), and `Eve' (for early-stage drug design)). We are now developing a next generation robot scientist Genesis. With Genesis we aim to demonstrate that an area of science can be investigated using robot scientists unambiguously faster, and at lower cost, than with human scientists. Here we report progress on the Genesis project. Genesis is designed to automatically improve system biology models with thousands of interacting causal components. When complete Genesis will be able to initiate and execute in parallel one thousand hypothesis-led closed-loop cycles of experiment per-day. Here we describe the core Genesis hardware: the one thousand computer-controlled $\mu$-bioreactors. For the integrated Mass Spectrometry platform we have developed AutonoMS, a system to automatically run, process, and analyse high-throughput experiments. We have also developed Genesis-DB, a database system designed to enable software agents access to large quantities of structured domain information. We have developed RIMBO (Revisions for Improvements of Models in Biology Ontology) to describe the planned hundreds of thousands of changes to the models. We have demonstrated the utility of this infrastructure by developed two relational learning bioinformatic projects. Finally, we describe LGEM+ a relational learning system for the automated abductive improvement of genome-scale metabolic models.


Intensional FOL: Many-Sorted Extension

arXiv.org Artificial Intelligence

The concepts used in IFOL have associated to them a list of sorted attributes, and the sorts are the intensional concepts as well. The requirement to extend the unsorted IFOL (Intensional FOL) to many-sorted IFOL is mainly based on the fact that a natural language is implicitly many-sorted and that we intend to use IFOL to support applications that use natural languages. Thus, the proposed version of many-sorted IFOL is just the completion of this conceptual feature of the IFOL.


Formal Verification and Control with Conformal Prediction

arXiv.org Artificial Intelligence

In this survey, we design formal verification and control algorithms for autonomous systems with practical safety guarantees using conformal prediction (CP), a statistical tool for uncertainty quantification. We focus on learning-enabled autonomous systems (LEASs) in which the complexity of learning-enabled components (LECs) is a major bottleneck that hampers the use of existing model-based verification and design techniques. Instead, we advocate for the use of CP, and we will demonstrate its use in formal verification, systems and control theory, and robotics. We argue that CP is specifically useful due to its simplicity (easy to understand, use, and modify), generality (requires no assumptions on learned models and data distributions, i.e., is distribution-free), and efficiency (real-time capable and accurate). We pursue the following goals with this survey. First, we provide an accessible introduction to CP for non-experts who are interested in using CP to solve problems in autonomy. Second, we show how to use CP for the verification of LECs, e.g., for verifying input-output properties of neural networks. Third and fourth, we review recent articles that use CP for safe control design as well as offline and online verification of LEASs. We summarize their ideas in a unifying framework that can deal with the complexity of LEASs in a computationally efficient manner. In our exposition, we consider simple system specifications, e.g., robot navigation tasks, as well as complex specifications formulated in temporal logic formalisms. Throughout our survey, we compare to other statistical techniques (e.g., scenario optimization, PAC-Bayes theory, etc.) and how these techniques have been used in verification and control. Lastly, we point the reader to open problems and future research directions.


Temporal Ensemble Logic

arXiv.org Artificial Intelligence

We introduce Temporal Ensemble Logic (TEL), a monadic, first-order modal logic for linear-time temporal reasoning. TEL includes primitive temporal constructs such as ``always up to $t$ time later'' ($\Box_t$), ``sometimes before $t$ time in the future'' ($\Diamond_t$), and ``$t$-time later'' $\varphi_t$. TEL has been motivated from the requirement for rigor and reproducibility for cohort specification and discovery in clinical and population health research, to fill a gap in formalizing temporal reasoning in biomedicine. Existing logical frameworks such as linear temporal logic are too restrictive to express temporal and sequential properties in biomedicine, or too permissive in semantic constructs, such as in Halpern-Shoham logic, to serve this purpose. In this paper, we first introduce TEL in a general set up, with discrete and dense time as special cases. We then focus on the theoretical development of discrete TEL on the temporal domain of positive integers $\mathbb{N}^+$, denoted as ${\rm TEL}_{\mathbb{N}^+}$. ${\rm TEL}_{\mathbb{N}^+}$ is strictly more expressive than the standard monadic second order logic, characterized by B\"{u}chi automata. We present its formal semantics, a proof system, and provide a proof for the undecidability of the satisfiability of ${\rm TEL}_{\mathbb{N}^+}$. We also include initial results on expressiveness and decidability fragments for ${\rm TEL}_{\mathbb{N}^+}$, followed by application outlook and discussions.


Machine Learning for Quantifier Selection in cvc5

arXiv.org Artificial Intelligence

In this work we considerably improve the state-of-the-art SMT solving on first-order quantified problems by efficient machine learning guidance of quantifier selection. Quantifiers represent a significant challenge for SMT and are technically a source of undecidability. In our approach, we train an efficient machine learning model that informs the solver which quantifiers should be instantiated and which not. Each quantifier may be instantiated multiple times and the set of the active quantifiers changes as the solving progresses. Therefore, we invoke the ML predictor many times, during the whole run of the solver. To make this efficient, we use fast ML models based on gradient boosting decision trees. We integrate our approach into the state-of-the-art cvc5 SMT solver and show a considerable increase of the system's holdout-set performance after training it on a large set of first-order problems collected from the Mizar Mathematical Library.


Differentiable Logic Programming for Distant Supervision

arXiv.org Artificial Intelligence

We introduce a new method for integrating neural networks with logic programming in Neural-Symbolic AI (NeSy), aimed at learning with distant supervision, in which direct labels are unavailable. Unlike prior methods, our approach does not depend on symbolic solvers for reasoning about missing labels. Instead, it evaluates logical implications and constraints in a differentiable manner by embedding both neural network outputs and logic programs into matrices. This method facilitates more efficient learning under distant supervision. We evaluated our approach against existing methods while maintaining a constant volume of training data. The findings indicate that our method not only matches or exceeds the accuracy of other methods across various tasks but also speeds up the learning process. These results highlight the potential of our approach to enhance both accuracy and learning efficiency in NeSy applications.


Boolean Matrix Logic Programming

arXiv.org Artificial Intelligence

We describe a datalog query evaluation approach based on efficient and composable boolean matrix manipulation modules. We first define an overarching problem, Boolean Matrix Logic Programming (BMLP), which uses boolean matrices as an alternative computation to evaluate datalog programs. We develop two novel BMLP modules for bottom-up inferences on linear dyadic recursive datalog programs, and show how additional modules can extend this capability to compute both linear and non-linear recursive datalog programs of arity two. Our empirical results demonstrate that these modules outperform general-purpose and specialised systems by factors of 30x and 9x, respectively, when evaluating large programs with millions of facts. This boolean matrix approach significantly enhances the efficiency of datalog querying to support logic programming techniques.


Relational decomposition for program synthesis

arXiv.org Artificial Intelligence

We introduce a novel approach to program synthesis that decomposes complex functional tasks into simpler relational synthesis sub-tasks. We demonstrate the effectiveness of our approach using an off-the-shelf inductive logic programming (ILP) system on three challenging datasets. Our results show that (i) a relational representation can outperform a functional one, and (ii) an off-the-shelf ILP system with a relational encoding can outperform domain-specific approaches.


Solving Decision Theory Problems with Probabilistic Answer Set Programming

arXiv.org Artificial Intelligence

Solving a decision theory problem usually involves finding the actions, among a set of possible ones, which optimize the expected reward, possibly accounting for the uncertainty of the environment. In this paper, we introduce the possibility to encode decision theory problems with Probabilistic Answer Set Programming under the credal semantics via decision atoms and utility attributes. To solve the task we propose an algorithm based on three layers of Algebraic Model Counting, that we test on several synthetic datasets against an algorithm that adopts answer set enumeration. Empirical results show that our algorithm can manage non trivial instances of programs in a reasonable amount of time.


Towards Probabilistic Inductive Logic Programming with Neurosymbolic Inference and Relaxation

arXiv.org Artificial Intelligence

Many inductive logic programming (ILP) methods are incapable of learning programs from probabilistic background knowledge, e.g. coming from sensory data or neural networks with probabilities. We propose Propper, which handles flawed and probabilistic background knowledge by extending ILP with a combination of neurosymbolic inference, a continuous criterion for hypothesis selection (BCE) and a relaxation of the hypothesis constrainer (NoisyCombo). For relational patterns in noisy images, Propper can learn programs from as few as 8 examples. It outperforms binary ILP and statistical models such as a Graph Neural Network.