Ontologies
Developing and Deploying Industry Standards for Artificial Intelligence in Education (AIED): Challenges, Strategies, and Future Directions
Tong, Richard, Li, Haoyang, Liang, Joleen, Wen, Qingsong
The adoption of Artificial Intelligence in Education (AIED) holds the promise of revolutionizing educational practices by offering personalized learning experiences, automating administrative and pedagogical tasks, and reducing the cost of content creation. However, the lack of standardized practices in the development and deployment of AIED solutions has led to fragmented ecosystems, which presents challenges in interoperability, scalability, and ethical governance. This article aims to address the critical need to develop and implement industry standards in AIED, offering a comprehensive analysis of the current landscape, challenges, and strategic approaches to overcome these obstacles. We begin by examining the various applications of AIED in various educational settings and identify key areas lacking in standardization, including system interoperability, ontology mapping, data integration, evaluation, and ethical governance. Then, we propose a multi-tiered framework for establishing robust industry standards for AIED. In addition, we discuss methodologies for the iterative development and deployment of standards, incorporating feedback loops from real-world applications to refine and adapt standards over time. The paper also highlights the role of emerging technologies and pedagogical theories in shaping future standards for AIED. Finally, we outline a strategic roadmap for stakeholders to implement these standards, fostering a cohesive and ethical AIED ecosystem. By establishing comprehensive industry standards, such as those by IEEE Artificial Intelligence Standards Committee (AISC) and International Organization for Standardization (ISO), we can accelerate and scale AIED solutions to improve educational outcomes, ensuring that technological advances align with the principles of inclusivity, fairness, and educational excellence.
Using Large Language Models for OntoClean-based Ontology Refinement
Zhao, Yihang, Vetter, Neil, Aryan, Kaveh
This paper explores the integration of Large Language Models (LLMs) such as GPT-3.5 and GPT-4 into the ontology refinement process, specifically focusing on the OntoClean methodology. OntoClean, critical for assessing the metaphysical quality of ontologies, involves a two-step process of assigning meta-properties to classes and verifying a set of constraints. Manually conducting the first step proves difficult in practice, due to the need for philosophical expertise and lack of consensus among ontologists. By employing LLMs with two prompting strategies, the study demonstrates that high accuracy in the labelling process can be achieved. The findings suggest the potential for LLMs to enhance ontology refinement, proposing the development of plugin software for ontology tools to facilitate this integration.
KIF: A Framework for Virtual Integration of Heterogeneous Knowledge Bases using Wikidata
Lima, Guilherme, Machado, Marcelo, Soares, Elton, Fiorini, Sandro R., Thiago, Raphael, Azevedo, Leonardo G., da Silva, Viviane T., Cerqueira, Renato
We present a knowledge integration framework (called KIF) that uses Wikidata as a lingua franca to integrate heterogeneous knowledge bases. These can be triplestores, relational databases, CSV files, etc., which may or may not use the Wikidata dialect of RDF. KIF leverages Wikidata's data model and vocabulary plus user-defined mappings to expose a unified view of the integrated bases while keeping track of the context and provenance of their statements. The result is a virtual knowledge base which behaves like an "extended Wikidata" and which can be queried either through an efficient filter interface or using SPARQL. We present the design and implementation of KIF, discuss how we have used it to solve a real integration problem in the domain of chemistry (involving Wikidata, PubChem, and IBM CIRCA), and present experimental results on the performance and overhead of KIF.
Mapping cognitive ontologies to and from the brain
Due to the nature of the individual experiments, based on eliciting neural response from a small number of stimuli, this link is incomplete, and unidirectional from the causal point of view. To come to conclusions on the function implied by the activation of brain regions, it is necessary to combine a wide exploration of the various brain functions and some inversion of the statistical inference. Here we introduce a methodology for accumulating knowledge towards a bidirectional link between observed brain activity and the corresponding function. We rely on a large corpus of imaging studies and a predictive engine. Technically, the challenges are to find commonality between the studies without denaturing the richness of the corpus. The key elements that we contribute are labeling the tasks performed with a cognitive ontology, and modeling the long tail of rare paradigms in the corpus. To our knowledge, our approach is the first demonstration of predicting the cognitive content of completely new brain images. To that end, we propose a method that predicts the experimental paradigms across different studies.
GPT, Ontology, and CAABAC: A Tripartite Personalized Access Control Model Anchored by Compliance, Context and Attribute
Nowrozy, Raza, Ahmed, Khandakar, Wang, Hua
As digital healthcare evolves, the security of electronic health records (EHR) becomes increasingly crucial. This study presents the GPT-Onto-CAABAC framework, integrating Generative Pretrained Transformer (GPT), medical-legal ontologies and Context-Aware Attribute-Based Access Control (CAABAC) to enhance EHR access security. Unlike traditional models, GPT-Onto-CAABAC dynamically interprets policies and adapts to changing healthcare and legal environments, offering customized access control solutions. Through empirical evaluation, this framework is shown to be effective in improving EHR security by accurately aligning access decisions with complex regulatory and situational requirements. The findings suggest its broader applicability in sectors where access control must meet stringent compliance and adaptability standards.
HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers
Dibitonto, Francesco, Garcea, Fabio, Panisson, Andrรฉ, Perotti, Alan, Morra, Lia
Convolutional Neural Networks (CNNs) are nowadays the model of choice in Computer Vision, thanks to their ability to automatize the feature extraction process in visual tasks. However, the knowledge acquired during training is fully subsymbolic, and hence difficult to understand and explain to end users. In this paper, we propose a new technique called HOLMES (HOLonym-MEronym based Semantic inspection) that decomposes a label into a set of related concepts, and provides component-level explanations for an image classification model. Specifically, HOLMES leverages ontologies, web scraping and transfer learning to automatically construct meronym (parts)-based detectors for a given holonym (class). Then, it produces heatmaps at the meronym level and finally, by probing the holonym CNN with occluded images, it highlights the importance of each part on the classification output. Compared to state-of-the-art saliency methods, HOLMES takes a step further and provides information about both where and what the holonym CNN is looking at, without relying on densely annotated datasets and without forcing concepts to be associated to single computational units. Extensive experimental evaluation on different categories of objects (animals, tools and vehicles) shows the feasibility of our approach. On average, HOLMES explanations include at least two meronyms, and the ablation of a single meronym roughly halves the holonym model confidence. The resulting heatmaps were quantitatively evaluated using the deletion/insertion/preservation curves. All metrics were comparable to those achieved by GradCAM, while offering the advantage of further decomposing the heatmap in human-understandable concepts, thus highlighting both the relevance of meronyms to object classification, as well as HOLMES ability to capture it. The code is available at https://github.com/FrancesC0de/HOLMES.
From human experts to machines: An LLM supported approach to ontology and knowledge graph construction
Kommineni, Vamsi Krishna, Kรถnig-Ries, Birgitta, Samuel, Sheeba
The conventional process of building Ontologies and Knowledge Graphs (KGs) heavily relies on human domain experts to define entities and relationship types, establish hierarchies, maintain relevance to the domain, fill the ABox (or populate with instances), and ensure data quality (including amongst others accuracy and completeness). On the other hand, Large Language Models (LLMs) have recently gained popularity for their ability to understand and generate human-like natural language, offering promising ways to automate aspects of this process. This work explores the (semi-)automatic construction of KGs facilitated by open-source LLMs. Our pipeline involves formulating competency questions (CQs), developing an ontology (TBox) based on these CQs, constructing KGs using the developed ontology, and evaluating the resultant KG with minimal to no involvement of human experts. We showcase the feasibility of our semi-automated pipeline by creating a KG on deep learning methodologies by exploiting scholarly publications. To evaluate the answers generated via Retrieval-Augmented-Generation (RAG) as well as the KG concepts automatically extracted using LLMs, we design a judge LLM, which rates the generated content based on ground truth. Our findings suggest that employing LLMs could potentially reduce the human effort involved in the construction of KGs, although a human-in-the-loop approach is recommended to evaluate automatically generated KGs.
Couler: Unified Machine Learning Workflow Optimization in Cloud
Wang, Xiaoda, Tang, Yuan, Guo, Tengda, Sang, Bo, Wu, Jingji, Sha, Jian, Zhang, Ke, Qian, Jiang, Tang, Mingjie
Machine Learning (ML) has become ubiquitous, fueling data-driven applications across various organizations. Contrary to the traditional perception of ML in research, ML workflows can be complex, resource-intensive, and time-consuming. Expanding an ML workflow to encompass a wider range of data infrastructure and data types may lead to larger workloads and increased deployment costs. Currently, numerous workflow engines are available (with over ten being widely recognized). This variety poses a challenge for end-users in terms of mastering different engine APIs. While efforts have primarily focused on optimizing ML Operations (MLOps) for a specific workflow engine, current methods largely overlook workflow optimization across different engines. In this work, we design and implement Couler, a system designed for unified ML workflow optimization in the cloud. Our main insight lies in the ability to generate an ML workflow using natural language (NL) descriptions. We integrate Large Language Models (LLMs) into workflow generation, and provide a unified programming interface for various workflow engines. This approach alleviates the need to understand various workflow engines' APIs. Moreover, Couler enhances workflow computation efficiency by introducing automated caching at multiple stages, enabling large workflow auto-parallelization and automatic hyperparameters tuning. These enhancements minimize redundant computational costs and improve fault tolerance during deep learning workflow training. Couler is extensively deployed in real-world production scenarios at Ant Group, handling approximately 22k workflows daily, and has successfully improved the CPU/Memory utilization by more than 15% and the workflow completion rate by around 17%.
RoboCertProb: Property Specification for Probabilistic RoboChart Models
RoboChart is a core notation in the RoboStar framework which brings modern modelling and formal verification technologies into software engineering for robotics. It is a timed and probabilistic domain-specific language for robotics and provides a UML-like architectural and state machine modelling. This work presents RoboCertProb for specifying quantitative properties of probabilistic robotic systems modelled in RoboChart. RoboCertProb's semantics is based on PCTL*. To interpret RoboCertProb over RoboChart models, we give a Markov semantics (DTMCs and MDPs) to RoboChart, derived from its existing transformation semantics to the PRISM language. In addition to property specification, RoboCertProb also entitles us to configure loose constants and unspecified functions and operations in RoboChart models. It allows us to set up environmental inputs to verify reactive probabilistic systems not directly supported in probabilistic model checkers like PRISM because they employ a closed-world assumption. We implement RoboCertProb in an accompanying tool of RoboChart, RoboTool, for specifying properties and automatically generating PRISM properties from them to formally verify RoboChart models using PRISM. We have used it to analyse the behaviour of software controllers for two real robots: an industrial painting robot and an agricultural robot for treating plants with UV lights.