Rule-Based Reasoning
Learning Rule-Induced Subgraph Representations for Inductive Relation Prediction
Liu, Tianyu, Lv, Qitan, Wang, Jie, Yang, Shuling, Chen, Hanzhu
Inductive relation prediction (IRP)--where entities can be different during training and inference--has shown great power for completing evolving knowledge graphs. Existing works mainly focus on using graph neural networks (GNNs) to learn the representation of the subgraph induced from the target link, which can be seen as an implicit rule-mining process to measure the plausibility of the target link. However, these methods cannot differentiate the target link and other links during message passing, hence the final subgraph representation will contain irrelevant rule information to the target link, which reduces the reasoning performance and severely hinders the applications for real-world scenarios. To tackle this problem, we propose a novel single-source edge-wise GNN model to learn the Rule-inducEd Subgraph represenTations (REST), which encodes relevant rules and eliminates irrelevant rules within the subgraph. Specifically, we propose a single-source initialization approach to initialize edge features only for the target link, which guarantees the relevance of mined rules and target link. Then we propose several RNN-based functions for edge-wise message passing to model the sequential property of mined rules. REST is a simple and effective approach with theoretical support to learn the rule-induced subgraph representation. Moreover, REST does not need node labeling, which significantly accelerates the subgraph preprocessing time by up to 11.66 .
Semantic Prototypes: Enhancing Transparency Without Black Boxes
Menis-Mastromichalakis, Orfeas, Filandrianos, Giorgos, Liartis, Jason, Dervakos, Edmund, Stamou, Giorgos
As machine learning (ML) models and datasets increase in complexity, the demand for methods that enhance explainability and interpretability becomes paramount. Prototypes, by encapsulating essential characteristics within data, offer insights that enable tactical decision-making and enhance transparency. Traditional prototype methods often rely on sub-symbolic raw data and opaque latent spaces, reducing explainability and increasing the risk of misinterpretations. This paper presents a novel framework that utilizes semantic descriptions to define prototypes and provide clear explanations, effectively addressing the shortcomings of conventional methods. Our approach leverages concept-based descriptions to cluster data on the semantic level, ensuring that prototypes not only represent underlying properties intuitively but are also straightforward to interpret. Our method simplifies the interpretative process and effectively bridges the gap between complex data structures and human cognitive processes, thereby enhancing transparency and fostering trust. Our approach outperforms existing widely-used prototype methods in facilitating human understanding and informativeness, as validated through a user survey.
Improving VTE Identification through Language Models from Radiology Reports: A Comparative Study of Mamba, Phi-3 Mini, and BERT
Deng, Jamie, Wu, Yusen, Yesha, Yelena, Nguyen, Phuong
Venous thromboembolism (VTE) is a critical cardiovascular condition, encompassing deep vein thrombosis (DVT) and pulmonary embolism (PE). Accurate and timely identification of VTE is essential for effective medical care. This study builds upon our previous work, which addressed VTE detection using deep learning methods for DVT and a hybrid approach combining deep learning and rule-based classification for PE. Our earlier approaches, while effective, had two major limitations: they were complex and required expert involvement for feature engineering of the rule set. To overcome these challenges, we utilize the Mamba architecture-based classifier. This model achieves remarkable results, with a 97\% accuracy and F1 score on the DVT dataset and a 98\% accuracy and F1 score on the PE dataset. In contrast to the previous hybrid method on PE identification, the Mamba classifier eliminates the need for hand-engineered rules, significantly reducing model complexity while maintaining comparable performance. Additionally, we evaluated a lightweight Large Language Model (LLM), Phi-3 Mini, in detecting VTE. While this model delivers competitive results, outperforming the baseline BERT models, it proves to be computationally intensive due to its larger parameter set. Our evaluation shows that the Mamba-based model demonstrates superior performance and efficiency in VTE identification, offering an effective solution to the limitations of previous approaches.
Relational Graph Convolutional Networks Do Not Learn Sound Rules
Morris, Matthew, Cucala, David J. Tena, Grau, Bernardo Cuenca, Horrocks, Ian
Graph neural networks (GNNs) are frequently used to predict missing facts in knowledge graphs (KGs). Motivated by the lack of explainability for the outputs of these models, recent work has aimed to explain their predictions using Datalog, a widely used logic-based formalism. However, such work has been restricted to certain subclasses of GNNs. In this paper, we consider one of the most popular GNN architectures for KGs, R-GCN, and we provide two methods to extract rules that explain its predictions and are sound, in the sense that each fact derived by the rules is also predicted by the GNN, for any input dataset. Furthermore, we provide a method that can verify that certain classes of Datalog rules are not sound for the R-GCN. In our experiments, we train R-GCNs on KG completion benchmarks, and we are able to verify that no Datalog rule is sound for these models, even though the models often obtain high to near-perfect accuracy. This raises some concerns about the ability of R-GCN models to generalise and about the explainability of their predictions. We further provide two variations to the training paradigm of R-GCN that encourage it to learn sound rules and find a trade-off between model accuracy and the number of learned sound rules.
ONSEP: A Novel Online Neural-Symbolic Framework for Event Prediction Based on Large Language Model
Yu, Xuanqing, Sun, Wangtao, Li, Jingwei, Liu, Kang, Liu, Chengbao, Tan, Jie
In the realm of event prediction, temporal knowledge graph forecasting (TKGF) stands as a pivotal technique. Previous approaches face the challenges of not utilizing experience during testing and relying on a single short-term history, which limits adaptation to evolving data. In this paper, we introduce the Online Neural-Symbolic Event Prediction (ONSEP) framework, which innovates by integrating dynamic causal rule mining (DCRM) and dual history augmented generation (DHAG). DCRM dynamically constructs causal rules from real-time data, allowing for swift adaptation to new causal relationships. In parallel, DHAG merges short-term and long-term historical contexts, leveraging a bi-branch approach to enrich event prediction. Our framework demonstrates notable performance enhancements across diverse datasets, with significant Hit@k (k=1,3,10) improvements, showcasing its ability to augment large language models (LLMs) for event prediction without necessitating extensive retraining. The ONSEP framework not only advances the field of TKGF but also underscores the potential of neural-symbolic approaches in adapting to dynamic data environments.
Markov Senior -- Learning Markov Junior Grammars to Generate User-specified Content
Oฤuz, Mehmet Kayra, Dockhorn, Alexander
Markov Junior is a probabilistic programming language used for procedural content generation across various domains. However, its reliance on manually crafted and tuned probabilistic rule sets, also called grammars, presents a significant bottleneck, diverging from approaches that allow rule learning from examples. In this paper, we propose a novel solution to this challenge by introducing a genetic programming-based optimization framework for learning hierarchical rule sets automatically. Our proposed method ``Markov Senior'' focuses on extracting positional and distance relations from single input samples to construct probabilistic rules to be used by Markov Junior. Using a Kullback-Leibler divergence-based fitness measure, we search for grammars to generate content that is coherent with the given sample. To enhance scalability, we introduce a divide-and-conquer strategy that enables the efficient generation of large-scale content. We validate our approach through experiments in generating image-based content and Super Mario levels, demonstrating its flexibility and effectiveness. In this way, ``Markov Senior'' allows for the wider application of Markov Junior for tasks in which an example may be available, but the design of a generative rule set is infeasible.
Neurosymbolic Methods for Rule Mining
Lawrynowicz, Agnieszka, Galarraga, Luis, Alam, Mehwish, Jaulmes, Berenice, Zeman, Vaclav, Kliegr, Tomas
In this chapter, we address the problem of rule mining, beginning with essential background information, including measures of rule quality. We then explore various rule mining methodologies, categorized into three groups: inductive logic programming, path sampling and generalization, and linear programming.
Overview of the NLPCC 2024 Shared Task on Chinese Metaphor Generation
Qu, Xingwei, Zhang, Ge, Wu, Siwei, Li, Yizhi, Lin, Chenghua
This paper presents the results of the shared task on Chinese metaphor generation, hosted at the 13th CCF Conference on Natural Language Processing and Chinese Computing (NLPCC 2024). The goal of this shared task is to generate Chinese metaphors using machine learning techniques and effectively identifying basic components of metaphorical sentences. It is divided into two subtasks: 1) Metaphor Generation, which involves creating a metaphor from a provided tuple consisting of TENOR, GROUND, and VEHICLE. The goal here is to synthesize a metaphor that connects the subject (i.e. TENOR) with the object (i.e. VEHICLE), guided by the concept of the GROUND. 2) Metaphor Components Identification, which extracts the most fitting TENORs, GROUNDs, and VEHICLEs from a metaphorical sentence. This component requires the identification of the most fitting metaphor elements that correspond to the specified grounds. In addition to overall results, we report on the setup and insights from the metaphor generation shared task, which attracted a total of 4 participating teams across both subtasks.
Can Rule-Based Insights Enhance LLMs for Radiology Report Classification? Introducing the RadPrompt Methodology
Fytas, Panagiotis, Breger, Anna, Selby, Ian, Baker, Simon, Shahipasand, Shahab, Korhonen, Anna
Developing imaging models capable of detecting pathologies from chest X-rays can be cost and time-prohibitive for large datasets as it requires supervision to attain state-of-the-art performance. Instead, labels extracted from radiology reports may serve as distant supervision since these are routinely generated as part of clinical practice. Despite their widespread use, current rule-based methods for label extraction rely on extensive rule sets that are limited in their robustness to syntactic variability. To alleviate these limitations, we introduce RadPert, a rule-based system that integrates an uncertainty-aware information schema with a streamlined set of rules, enhancing performance. Additionally, we have developed RadPrompt, a multi-turn prompting strategy that leverages RadPert to bolster the zero-shot predictive capabilities of large language models, achieving a statistically significant improvement in weighted average F1 score over GPT-4 Turbo. Most notably, RadPrompt surpasses both its underlying models, showcasing the synergistic potential of LLMs with rule-based models. We have evaluated our methods on two English Corpora: the MIMIC-CXR gold-standard test set and a gold-standard dataset collected from the Cambridge University Hospitals.
Enhancing Medical Learning and Reasoning Systems: A Boxology-Based Comparative Analysis of Design Patterns
This study analyzes hybrid AI systems' design patterns and their effectiveness in clinical decision-making using the boxology framework. It categorizes and copares various architectures combining machine learning and rule-based reasoning to provide insights into their structural foundations and healthcare applications. Addressing two main questions, how to categorize these systems againts established design patterns and how to extract insights through comparative analysis, the study uses design patterns from software engineering to understand and optimize healthcare AI systems. Boxology helps identify commonalities and create reusable solutions, enhancing these systems' scalability, reliability, and performance. Five primary architectures are examined: REML, MLRB, RBML, RMLT, and PERML. Each has unique strengths and weaknesses, highlighting the need for tailored approaches in clinical tasks. REML excels in high-accuracy prediction for datasets with limited data; MLRB in handling large datasets and complex data integration; RBML in explainability and trustworthiness; RMLT in managing high-dimensional data; and PERML, though limited in analysis, shows promise in urgent care scenarios. The study introduces four new patterns, creates five abstract categorization patterns, and refines those five further to specific systems. These contributions enhance Boxlogy's taxonomical organization and offer novel approaches to integrating expert knowledge with machine learning. Boxology's structured, modular apporach offers significant advantages in developing and analyzing hybrid AI systems, revealing commonalities, and promoting reusable solutions. In conclusion, this study underscores hybrid AI systems' crucial role in advancing healthcare and Boxology's potential to drive further innovation in AI integration, ultimately improving clinical decision support and patient outcomes.