Lobo, Jorge
The Role of Foundation Models in Neuro-Symbolic Learning and Reasoning
Cunnington, Daniel, Law, Mark, Lobo, Jorge, Russo, Alessandra
Neuro-Symbolic AI (NeSy) holds promise for ensuring the safe deployment of AI systems, as interpretable symbolic techniques provide formal behaviour guarantees. The challenge is how to effectively integrate neural and symbolic computation to enable learning and reasoning from raw data. Existing pipelines that train the neural and symbolic components sequentially require extensive labelling, whereas end-to-end approaches are limited in scalability due to the combinatorial explosion of the symbol grounding problem. In this paper, we leverage the implicit knowledge within foundation models to improve performance on NeSy tasks, whilst reducing the amount of data labelling and manual engineering. We introduce a new architecture, called NeSyGPT, which fine-tunes a vision-language foundation model to extract symbolic features from raw data, before learning a highly expressive answer set program to solve a downstream task. Our comprehensive evaluation demonstrates that NeSyGPT achieves higher accuracy than various baselines and can scale to complex NeSy tasks. Finally, we highlight the effective use of a large language model to generate the programmatic interface between the neural and symbolic components, significantly reducing the amount of manual engineering required.
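As a rough illustration of the two-stage pipeline the abstract describes, the Python sketch below has a vision-language model answer a question about each image to extract a symbolic feature, then passes those features as facts to an answer set program that computes the downstream answer. The digit-sum task, the question prompt and the stubbed model are illustrative assumptions, and the sketch assumes the clingo Python bindings are installed; in NeSyGPT the vision-language model is fine-tuned and the answer set program is learned rather than hand-written.

    import clingo

    def vqa_extract(image_id, question):
        # Stub standing in for the fine-tuned vision-language model;
        # it would normally answer `question` about the given image.
        return {"img1": "3", "img2": "5"}[image_id]

    # Illustrative (hand-written) ASP program for a digit-sum task.
    ASP_PROGRAM = """
    result(S) :- digit(1, D1), digit(2, D2), S = D1 + D2.
    #show result/1.
    """

    # Neural stage: extract one symbolic feature per image.
    digits = [vqa_extract("img1", "What digit is shown?"),
              vqa_extract("img2", "What digit is shown?")]

    # Symbolic stage: turn the features into facts and solve.
    facts = "".join(f"digit({i},{d}).\n" for i, d in enumerate(digits, start=1))
    ctl = clingo.Control()
    ctl.add("base", [], ASP_PROGRAM + facts)
    ctl.ground([("base", [])])
    ctl.solve(on_model=lambda m: print("Answer set:", m))   # e.g. result(8)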
Neuro-Symbolic Learning of Answer Set Programs from Raw Data
Cunnington, Daniel, Law, Mark, Lobo, Jorge, Russo, Alessandra
One of the ultimate goals of Artificial Intelligence is to assist humans in complex decision making. A promising direction for achieving this goal is Neuro-Symbolic AI, which aims to combine the interpretability of symbolic techniques with the ability of deep learning to learn from raw data. However, most current approaches require manually engineered symbolic knowledge, and where end-to-end training is considered, such approaches are either restricted to learning definite programs, or are restricted to training binary neural networks. In this paper, we introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data, whilst learning symbolic knowledge that maps latent concepts to target labels. The novelty of our approach is a method for biasing the learning of symbolic knowledge, based on the in-training performance of both neural and symbolic components. We evaluate NSIL on three problem domains of different complexity, including an NP-complete problem. Our results demonstrate that NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency. Code and technical appendix: https://github.com/DanCunnington/NSIL
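The loop below is a stubbed sketch of the kind of interplay the abstract describes: neural predictions feed the symbolic learner, the learned hypothesis feeds pseudo-labels back to the neural component, and example weights reflect how well the current pair explains the task labels. All function names and the weighting scheme are hypothetical placeholders, not the exact NSIL procedure.

    import random

    def neural_predict(x):                # latent concept extraction (stub)
        return random.choice([0, 1])

    def symbolic_learn(weighted_examples):  # stands in for a LAS-style learner
        return lambda concepts: sum(concepts) % 2   # placeholder hypothesis

    def train_neural(pseudo_labelled):    # would perform gradient updates
        pass

    data = [([0.1, 0.9], 1), ([0.7, 0.2], 0)]   # (raw inputs, task label)

    hypothesis = None
    for iteration in range(3):
        # 1. Neural component maps raw data to latent concepts.
        concepts = [[neural_predict(x) for x in xs] for xs, _ in data]

        # 2. Weight each symbolic example by how well the current
        #    neural + symbolic pair explains its label (the learning bias).
        examples = []
        for (xs, y), cs in zip(data, concepts):
            weight = 1.0 if hypothesis and hypothesis(cs) == y else 0.5
            examples.append((cs, y, weight))

        # 3. Symbolic component re-learns a hypothesis from weighted examples.
        hypothesis = symbolic_learn(examples)

        # 4. The hypothesis supplies pseudo-labels that supervise the next
        #    round of neural training.
        train_neural([(xs, hypothesis([neural_predict(x) for x in xs]))
                      for xs, _ in data])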
FF-NSL: Feed-Forward Neural-Symbolic Learner
Cunnington, Daniel, Law, Mark, Russo, Alessandra, Lobo, Jorge
Logic-based machine learning [1, 2] learns interpretable knowledge expressed in the form of a logic program, called a hypothesis, that explains labelled examples in the context of (optional) background knowledge. Recent logic-based machine learning systems have demonstrated the ability to learn highly complex and noise-tolerant hypotheses in a data-efficient manner (e.g., Learning from Answer Sets (LAS) [2]). However, they require labelled examples to be specified in a structured logical form, which limits their applicability to many real-world problems. On the other hand, differentiable learning systems, such as (deep) neural networks, are able to learn directly from unstructured data, but they require large amounts of training data and their learned models are difficult to interpret [3]. Within neural-symbolic artificial intelligence, many approaches aim to integrate neural and symbolic systems so as to preserve the benefits of both paradigms [4, 5]. Most neural-symbolic integrations assume the existence of pre-defined knowledge expressed symbolically or logically, and focus on training a neural network to extract symbolic features from raw unstructured data [6-10]. In this paper, we introduce Feed-Forward Neural-Symbolic Learner (FFNSL), a neural-symbolic learning framework that assumes the opposite. Given a pre-trained neural network, FFNSL uses a noise-tolerant logic-based machine learning system to learn a logic-based hypothesis whose symbolic features are constructed from the neural network's predictions.
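A minimal sketch of the feed-forward idea, with every component stubbed: a frozen pre-trained network produces symbolic features, the features become the context of labelled examples, and a LAS-style learner returns a hypothesis. The digit-comparison task, the example format and the returned rule are illustrative assumptions rather than the system's actual interface.

    def pretrained_network(raw_input):
        """Frozen, pre-trained classifier mapping raw data to a symbolic feature."""
        return {"img_a": "3", "img_b": "7"}[raw_input]

    def to_las_example(features, label, example_id):
        """Wrap neural predictions as the context of a labelled LAS example."""
        ctx = " ".join(f"digit({i},{f})." for i, f in enumerate(features, 1))
        return f"#pos({example_id}, {{ {label} }}, {{ }}, {{ {ctx} }})."

    def las_learner(examples, background):
        """Stub for a noise-tolerant LAS system (e.g. FastLAS); returns a rule."""
        return "greater_than :- digit(1,X), digit(2,Y), Y > X."

    features = [pretrained_network("img_a"), pretrained_network("img_b")]
    examples = [to_las_example(features, "greater_than", "eg1")]
    hypothesis = las_learner(examples, background="digit/2")
    print(hypothesis)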
NSL: Hybrid Interpretable Learning From Noisy Raw Data
Cunnington, Daniel, Russo, Alessandra, Law, Mark, Lobo, Jorge, Kaplan, Lance
Inductive Logic Programming (ILP) systems learn generalised, interpretable rules in a data-efficient manner utilising existing background knowledge. However, current ILP systems require training examples to be specified in a structured logical format. Neural networks learn from unstructured data, although their learned models may be difficult to interpret and are vulnerable to data perturbations at run-time. This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data. NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics. Features extracted by the neural components define the structured context of labelled examples and the confidence of the neural predictions determines the level of noise of the examples. Using the scoring function of FastLAS, NSL searches for short, interpretable rules that generalise over such noisy examples. We evaluate our framework on propositional and first-order classification tasks using the MNIST dataset as raw data. Specifically, we demonstrate that NSL is able to learn robust rules from perturbed MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines whilst being more general and interpretable.
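To illustrate how prediction confidence can set the noise level of an example, the snippet below scales the product of the classifiers' softmax confidences into an integer penalty attached to a FastLAS/ILASP-style weighted example; low-penalty examples are cheaper for the scoring function to treat as noise and leave uncovered. The task, the confidence-to-penalty mapping and the exact example syntax are illustrative assumptions.

    # Hypothetical softmax outputs from pre-trained MNIST classifiers:
    # (image id, predicted digit, confidence of that prediction).
    predictions = [("img_1", 3, 0.92), ("img_2", 7, 0.61)]

    confidence = 1.0
    context = []
    for pos, (_, digit, conf) in enumerate(predictions, start=1):
        confidence *= conf
        context.append(f"digit({pos},{digit}).")

    # Scale the joint confidence to an integer penalty for the example.
    penalty = max(1, round(100 * confidence))

    example = (f"#pos(eg1@{penalty}, {{ even_sum }}, {{ }}, "
               f"{{ {' '.join(context)} }}).")
    print(example)
    # => #pos(eg1@56, { even_sum }, { }, { digit(1,3). digit(2,7). }).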