Bain, Michael
A Comprehensive Survey on Integrating Large Language Models with Knowledge-Based Methods
Some, Lilian, Yang, Wenli, Bain, Michael, Kang, Byeong
The rapid development of artificial intelligence has brought about substantial advancements in the field. One promising direction is the integration of Large Language Models (LLMs) with structured knowledge-based systems. This approach aims to enhance AI capabilities by combining the generative language understanding of LLMs with the precise knowledge representation of structured systems. This survey explores the synergy between LLMs and knowledge bases, focusing on real-world applications and addressing associated technical, operational, and ethical challenges. Through a comprehensive literature review, the study identifies critical issues and evaluates existing solutions. The paper highlights the benefits of integrating generative AI with knowledge bases, including improved data contextualization, enhanced model accuracy, and better utilization of knowledge resources. The findings provide a detailed overview of the current state of research, identify key gaps, and offer actionable recommendations. These insights contribute to advancing AI technologies and support their practical deployment across various sectors.
Defining Reference Sequences for Nocardia Species by Similarity and Clustering Analyses of 16S rRNA Gene Sequence Data
Helal, Manal, Kong, Fanrong, Chen, Sharon C. A., Bain, Michael, Christen, Richard, Sintchenko, Vitali
The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. This study aimed to determine the utility and compare the performance of several clustering and classification algorithms for identifying the species of 364 16S rRNA gene sequences with a defined species in GenBank and 110 16S rRNA gene sequences with no defined species, all within the genus Nocardia. A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm, the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for dimensionality reduction and visualization. The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species were identified as 'centroids' of their respective clusters, that is, the sequences from which the distances to all other sequences in the cluster were minimized. The 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578.
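The two generic steps named in this abstract, choosing a cluster 'centroid' as the member with the smallest total distance to the other members and classifying an unlabelled sequence by kNN, can be sketched as follows. This is a minimal illustration on a made-up distance matrix, not the paper's LM algorithm or its data; the function names `cluster_medoids` and `knn_predict`, the toy distances, and the choice of k are all assumptions introduced here.

```python
from collections import Counter

import numpy as np


def cluster_medoids(dist, labels):
    """Map each cluster label to the index of its most representative member.

    The representative is the member whose summed distance to the other
    members of the same cluster is smallest (ties go to the lowest index).
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    medoids = {}
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        # Restrict the distance matrix to rows and columns of this cluster.
        within = dist[np.ix_(members, members)]
        medoids[label.item()] = int(members[within.sum(axis=1).argmin()])
    return medoids


def knn_predict(dist_to_train, train_labels, k=3):
    """Return the majority label among the k nearest training items."""
    nearest = np.argsort(dist_to_train, kind="stable")[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical pairwise distances between five sequences (assumed values).
toy_dist = [
    [0, 1, 2, 8, 9],
    [1, 0, 1, 8, 9],
    [2, 1, 0, 8, 9],
    [8, 8, 8, 0, 1],
    [9, 9, 9, 1, 0],
]
toy_clusters = [0, 0, 0, 1, 1]
toy_medoids = cluster_medoids(toy_dist, toy_clusters)

# Hypothetical distances from one unlabelled sequence to the five above.
toy_species = ["a", "a", "a", "b", "b"]
toy_prediction = knn_predict([7, 8, 8, 1, 2], toy_species, k=3)
```

In practice the distance matrix would come from pairwise comparison of aligned 16S rRNA sequences; how those distances are computed is outside the scope of this sketch.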
A Protocol for Intelligible Interaction Between Agents That Learn and Explain
Srinivasan, Ashwin, Bain, Michael, Baskar, A., Coiera, Enrico
Recent engineering developments have seen the emergence of Machine Learning (ML) as a powerful form of data analysis with widespread applicability beyond its historical roots in the design of autonomous agents. However, relatively little attention has been paid to the interaction between people and ML systems. Recent developments on Explainable ML address this by providing visual and textual information on how the ML system arrived at a conclusion. In this paper we view the interaction between humans and ML systems within the broader context of interaction between agents capable of learning and explanation. Within this setting, we argue that it is more helpful to view the interaction as characterised by two-way intelligibility of information rather than once-off explanation of a prediction. We formulate two-way intelligibility as a property of a communication protocol. Development of the protocol is motivated by a set of 'Intelligibility Axioms' for decision-support systems that use ML with a human-in-the-loop. The axioms are intended as sufficient criteria to claim that: (a) information provided by a human is intelligible to an ML system; and (b) information provided by an ML system is intelligible to a human. The axioms inform the design of a general synchronous interaction model between agents capable of learning and explanation. We identify conditions of compatibility between agents that result in bounded communication, and define Weak and Strong Two-Way Intelligibility between agents as properties of the communication protocol.
Logical Explanations for Deep Relational Machines Using Relevance Information
Srinivasan, Ashwin, Vig, Lovekesh, Bain, Michael
Our interest in this paper is in the construction of symbolic explanations for predictions made by a deep neural network. We will focus attention on deep relational machines (DRMs, first proposed by H. Lodhi). A DRM is a deep network in which the input layer consists of Boolean-valued functions (features) that are defined in terms of relations provided as domain, or background, knowledge. Our DRMs differ from those proposed by Lodhi, which use an Inductive Logic Programming (ILP) engine to first select features (we use random selections from a space of features that satisfies some approximate constraints on logical relevance and non-redundancy). But why do the DRMs predict what they do? One way of answering this is the LIME setting, in which readable proxies are constructed for a black-box predictor. The proxies are intended only to model the predictions of the black-box in local regions of the instance-space. But readability alone may not be enough: to be understandable, the local models must use relevant concepts in a meaningful manner. We investigate the use of a Bayes-like approach to identify logical proxies for local predictions of a DRM. We show: (a) DRMs with our randomised propositionalization method achieve state-of-the-art predictive performance; (b) Models in first-order logic can approximate the DRM's prediction closely in a small local region; and (c) Expert-provided relevance information can play the role of a prior to distinguish between logical explanations that perform equivalently on prediction alone.
A Deployed People-to-People Recommender System in Online Dating
Wobcke, Wayne (University of New South Wales) | Krzywicki, Alfred (University of New South Wales) | Kim, Yang Sok (Keimyung University) | Cai, Xiongcai (University of New South Wales) | Bain, Michael (University of New South Wales) | Compton, Paul (University of New South Wales) | Mahidadia, Ashesh (smartAcademic)
Online dating is a prime application area for recommender systems, as users face an abundance of choice, must act on limited information, and are participating in a competitive matching market. This article reports on the successful deployment of a people-to-people recommender system on a large commercial online dating site. The deployment was the result of thorough evaluation and an online trial of a number of methods, including profile-based, collaborative filtering and hybrid algorithms. Results taken a few months after deployment show that the recommender system delivered its projected benefits.
Evaluation and Deployment of a People-to-People Recommender in Online Dating
Krzywicki, Alfred (University of New South Wales) | Wobcke, Wayne (University of New South Wales) | Kim, Yang Sok (University of New South Wales) | Cai, Xiongcai (University of New South Wales) | Bain, Michael (University of New South Wales) | Compton, Paul (University of New South Wales) | Mahidadia, Ashesh (University of New South Wales)
This paper reports on the successful deployment of a people-to-people recommender system in a large commercial online dating site. The deployment was the result of thorough evaluation and an online trial of a number of methods, including profile-based, collaborative filtering and hybrid algorithms. Results taken a few months after deployment show that key metrics generally hold their value or show an increase compared to the trial results, and that the recommender system delivered its projected benefits.