data property
A Metadata-Driven Approach to Understand Graph Neural Networks
Graph Neural Networks (GNNs) have achieved remarkable success in various applications, but their performance can be sensitive to specific data properties of the graph datasets they operate on. The current literature on understanding the limitations of GNNs has primarily employed a \emph{model-driven} approach that leverages heuristics and domain knowledge from network science or graph theory to model GNN behaviors, which is time-consuming and highly subjective. In this work, we propose a \emph{metadata-driven} approach to analyze the sensitivity of GNNs to graph data properties, motivated by the increasing availability of graph learning benchmarks. We perform a multivariate sparse regression analysis on the metadata derived from benchmarking GNN performance across diverse datasets, yielding a set of salient data properties. To validate the effectiveness of our data-driven approach, we focus on one identified data property, the degree distribution, and investigate how this property influences GNN performance through theoretical analysis and controlled experiments. Our theoretical findings reveal that datasets with a more balanced degree distribution exhibit better linear separability of node representations, and thus better GNN performance. We also conduct controlled experiments using synthetic datasets with varying degree distributions, and the results align well with our theoretical findings. Collectively, both the theoretical analysis and the controlled experiments verify that the proposed metadata-driven approach is effective in identifying critical data properties for GNNs.
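As a rough illustration of the kind of analysis described above, here is a minimal sketch of a multivariate sparse regression over benchmark metadata, assuming scikit-learn; the dataset properties, GNN models, and scores are hypothetical placeholders, not the paper's actual benchmark data.

```python
# Minimal sketch: multivariate sparse regression on benchmark metadata.
# All feature names and values below are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import MultiTaskLasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Rows: graph datasets; columns: candidate data properties (metadata).
properties = ["gini_degree", "avg_clustering", "homophily", "density", "diameter"]
X = rng.normal(size=(40, len(properties)))   # 40 hypothetical datasets

# Targets: performance of several GNN architectures on each dataset.
models = ["GCN", "GAT", "GraphSAGE"]
Y = rng.normal(size=(40, len(models)))       # placeholder accuracy scores

# Sparse regression: the shared L1 penalty zeroes out uninformative
# properties, leaving a small set of salient ones across all GNN targets.
reg = MultiTaskLasso(alpha=0.1).fit(StandardScaler().fit_transform(X), Y)

for name, coefs in zip(properties, reg.coef_.T):
    if np.any(np.abs(coefs) > 1e-8):         # property survived selection
        print(f"{name}: {dict(zip(models, coefs.round(3)))}")
```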
- North America > United States > Michigan (0.04)
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas (0.04)
- (2 more...)
A Domain Ontology for Modeling the Book of Purification in Islam
This paper aims to address a gap in major Islamic topics by developing an ontology for the Book of Purification in Islam. Many authoritative Islamic texts begin with the Book of Purification, as it is essential for performing prayer (the second pillar of Islam after Shahadah, the profession of faith) and other religious duties such as Umrah and Hajj. The ontology development strategy followed six key steps: (1) domain identification, (2) knowledge acquisition, (3) conceptualization, (4) classification, (5) integration and implementation, and (6) ontology generation. This paper includes examples of the constructed tables and classifications. The focus is on the design and analysis phases, as technical implementation is beyond the scope of this study. However, an initial implementation is provided to illustrate the steps of the proposed strategy. The developed ontology ensures reusability by formally defining and encoding the key concepts, attributes, and relationships related to the Book of Purification. This structured representation is intended to support knowledge sharing and reuse.
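To give a flavour of the conceptualization and encoding steps, below is a minimal sketch using rdflib (an assumed toolkit; the paper does not prescribe one) of how a few purification concepts and one relationship could be encoded; the namespace, class names, and property are illustrative only, not the paper's actual ontology.

```python
# Minimal sketch: a few Book of Purification concepts as OWL classes.
# Namespace, class names, and the property are illustrative assumptions.
from rdflib import Graph, Namespace, RDF, RDFS, OWL, Literal

TAH = Namespace("http://example.org/purification#")  # hypothetical namespace
g = Graph()
g.bind("tah", TAH)

# Key concepts as a small class hierarchy.
for cls in ("Purification", "Wudu", "Ghusl", "Tayammum"):
    g.add((TAH[cls], RDF.type, OWL.Class))
for sub in ("Wudu", "Ghusl", "Tayammum"):
    g.add((TAH[sub], RDFS.subClassOf, TAH.Purification))

# One relationship: purification is a prerequisite for prayer.
g.add((TAH.isPrerequisiteFor, RDF.type, OWL.ObjectProperty))
g.add((TAH.Purification, TAH.isPrerequisiteFor, TAH.Salah))
g.add((TAH.Salah, RDFS.label, Literal("Prayer, the second pillar of Islam")))

print(g.serialize(format="turtle"))
```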
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
Efficient OWL2QL Meta-reasoning Using ASP-based Hybrid Knowledge Bases
Qureshi, Haya Majid, Faber, Wolfgang
Metamodeling helps in specifying conceptual modelling requirements with the notion of meta-classes (for instance, classes that are instances of other classes) and meta-properties (relations between meta-concepts). These notions can be expressed in OWL Full. However, OWL Full is so expressive for metamodeling that it leads to undecidability [13]. OWL 2 DL and its sub-profiles guarantee decidability, but they provide only a very restricted form of metamodeling [7] and give it no semantic support under the prevalent Direct Semantics (DS). Consider an example adapted from [6], concerning the modeling of biological species: all golden eagles are eagles, all eagles are birds, and Harry is an instance of GoldenEagle, from which it can be inferred that Harry is also an instance of Eagle and Bird. However, in the species domain one wants to express not only properties of and relationships among individual animals, but also properties of the species themselves. For example, "GoldenEagle is listed in the IUCN Red List of endangered species" states that GoldenEagle as a whole class is an endangered species. Note that this is not a subclass relation either, as Harry is not an endangered species. To formally model this statement, we can declare GoldenEagle to be an instance of a new class EndangeredSpecies.
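The golden-eagle example can be written down directly as RDF triples. The following minimal sketch, using rdflib as an assumed toolkit, shows GoldenEagle playing both roles: a class with instance Harry, and an individual that is itself an instance of EndangeredSpecies.

```python
# Minimal sketch of the metamodeling pattern described above, using rdflib.
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/species#")   # hypothetical namespace
g = Graph()

# Class-level taxonomy: all golden eagles are eagles, all eagles are birds.
g.add((EX.GoldenEagle, RDFS.subClassOf, EX.Eagle))
g.add((EX.Eagle, RDFS.subClassOf, EX.Bird))

# Instance level: Harry is a golden eagle (hence an eagle and a bird).
g.add((EX.Harry, RDF.type, EX.GoldenEagle))

# Meta level: the class GoldenEagle itself is an endangered species.
# Harry does NOT become endangered: rdf:type is not inherited along rdf:type.
g.add((EX.GoldenEagle, RDF.type, EX.EndangeredSpecies))

# An RDFS reasoner would derive (EX.Harry, RDF.type, EX.Bird),
# but not (EX.Harry, RDF.type, EX.EndangeredSpecies).
```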
A Generative AI-driven Metadata Modelling Approach
For decades, the modelling of metadata has been core to the functioning of any academic library. Its importance has only grown with the increasing pervasiveness of Generative Artificial Intelligence (AI)-driven information activities and services which constitute a library's outreach. However, with the rising importance of metadata, several outstanding problems have arisen in the process of designing a library metadata model, impacting its reusability, crosswalks, and interoperability with other metadata models. This paper posits that the above problems stem from an underlying thesis that only a few core metadata models should be necessary and sufficient for any information service using them, irrespective of the heterogeneity of intra-domain or inter-domain settings. This paper advances a contrary view and substantiates its argument in three key steps. First, it introduces a novel way of thinking about a library metadata model as an ontology-driven composition of five functionally interlinked representation levels, from perception to intensional definition via properties. Second, it introduces the representational manifoldness implicit in each of the five levels, which cumulatively contributes to a conceptually entangled library metadata model. Finally, and most importantly, it proposes a Generative AI-driven, Human-Large Language Model (LLM) collaboration based metadata modelling approach to disentangle each representation level, leading to the generation of a conceptually disentangled metadata model. Throughout the paper, the arguments are exemplified by motivating scenarios and examples from representative libraries handling cancer information.
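As a very rough sketch of what such a Human-LLM collaboration loop over representation levels could look like (the level labels and the ask_llm helper are hypothetical placeholders, not the paper's actual protocol):

```python
# Highly schematic sketch of a Human-LLM collaboration loop over
# representation levels; level names and `ask_llm` are hypothetical.
from typing import Callable

LEVELS = ["perception", "observation", "conception",
          "property", "intensional definition"]   # illustrative labels only

def disentangle(concept: str, ask_llm: Callable[[str], str]) -> dict[str, str]:
    """For each representation level, draft with the LLM, then let a human
    curator accept or revise the draft before moving to the next level."""
    model: dict[str, str] = {}
    for level in LEVELS:
        draft = ask_llm(f"Propose a disentangled '{level}'-level "
                        f"representation of '{concept}'.")
        model[level] = input(f"[{level}] accept/revise: {draft}\n> ") or draft
    return model
```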
- Asia > India > Karnataka > Bengaluru (0.04)
- South America > Brazil (0.04)
- Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province > Trento (0.04)
- (11 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
Knowledge-based Drug Samples' Comparison
Guillemin, Sébastien, Roxin, Ana, Dujourdy, Laurence, Journaux, Ludovic
Drug sample comparison is a process used by the French National Police to identify drug distribution networks. The current approach is based on a manual comparison done by forensic experts. In this article, we present our approach to acquiring, formalising, and specifying expert knowledge to improve the current process. We use an ontology coupled with logical rules to model the underlying knowledge. The different steps of our approach are designed to be reusable in other application domains. The results obtained are explainable, making them usable by experts in different fields. The fight against drug trafficking has been one of the French government's priorities since the end of 2019 and has led to the creation of the National Stup plan. This plan comprises 55 measures, including the use of new indicators to understand consumer habits and dealers' methods. The work described in this article is part of this plan and aims to support scientific experts in the decision-making process for narcotic profiling. As part of the fight against drug trafficking, several arrests may be made, often accompanied by seizures. Forensic experts perform several analyses on samples from a seizure, aiming to correlate samples from different seizures so as to best identify trafficking networks. To do so, experts use sample matching to pair samples according to their characteristics; paired samples constitute an ensemble called a batch. The sample characteristics used are represented by different data, namely: macroscopic data (e.g., sample dimensions, drug logos), qualitative data (e.g., list of active substances), quantitative data (e.g., dosage of substances), or non-confidential seizure data (e.g., date and place of seizure). In France, such data is stored in the national STUPS database.
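A minimal sketch of the pairing-and-batching step described above, assuming networkx and a toy matching rule; the real system relies on an ontology with logical rules rather than this hard-coded heuristic, and the sample data are invented.

```python
# Minimal sketch: pair similar samples, then group pairs into batches
# via connected components. Features and matching rule are hypothetical.
import networkx as nx

samples = {                                  # toy seizure data
    "S1": {"logo": "star", "substances": {"cocaine", "levamisole"}},
    "S2": {"logo": "star", "substances": {"cocaine", "levamisole"}},
    "S3": {"logo": "moon", "substances": {"heroin"}},
}

def match(a: dict, b: dict) -> bool:
    """Hypothetical pairing rule: same logo and overlapping substances."""
    return a["logo"] == b["logo"] and bool(a["substances"] & b["substances"])

g = nx.Graph()
g.add_nodes_from(samples)
ids = list(samples)
for i, x in enumerate(ids):
    for y in ids[i + 1:]:
        if match(samples[x], samples[y]):
            g.add_edge(x, y)                 # the two samples are paired

# Each connected component of paired samples forms a batch.
batches = [sorted(c) for c in nx.connected_components(g)]
print(batches)   # e.g. [['S1', 'S2'], ['S3']]
```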
- Europe > France > Bourgogne-Franche-Comté > Côte-d'Or > Dijon (0.05)
- Europe > Ireland (0.04)
- Asia > Singapore (0.04)
- (3 more...)
- Workflow (0.68)
- Research Report (0.64)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law > Criminal Law (0.69)
Deep Causal Generative Models with Property Control
Zhao, Qilong, Wang, Shiyu, Bai, Guangji, Pan, Bo, Qin, Zhaohui, Zhao, Liang
Generating data with properties of interest specified by external users, while respecting the true causation among the data's intrinsic factors, is important yet has not been well addressed jointly. This is due to the long-standing challenge of jointly identifying the key latent variables, their causal relations, and their correlation with the properties of interest, as well as of leveraging these discoveries for causally controlled data generation. To address these challenges, we propose a novel deep generative framework called the Correlation-aware Causal Variational Auto-encoder (C2VAE). This framework simultaneously recovers the correlation and causal relationships between properties using disentangled latent vectors. Specifically, causality is captured by learning a causal graph on the latent variables through a structural causal model, while correlation is learned via a novel correlation pooling algorithm. Extensive experiments demonstrate C2VAE's ability to accurately recover true causality and correlation, as well as its superiority in controllable data generation over baseline models.
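For intuition, here is a minimal PyTorch sketch of one standard way to impose a linear structural causal model on latent variables, in the spirit of the causal layer described above; the actual C2VAE architecture, its training objective, and the correlation pooling algorithm are more involved than this illustration.

```python
# Minimal sketch: a linear SCM layer over latent variables.
# The adjacency matrix A is learned; acyclicity would additionally
# require a constraint (e.g., a NOTEARS-style penalty), omitted here.
import torch
import torch.nn as nn

class LatentSCM(nn.Module):
    """Exogenous noise eps -> endogenous latents z via z = A^T z + eps,
    i.e. z = (I - A^T)^{-1} eps, with A a learnable adjacency matrix."""
    def __init__(self, dim: int):
        super().__init__()
        self.A = nn.Parameter(torch.zeros(dim, dim))  # learned causal graph

    def forward(self, eps: torch.Tensor) -> torch.Tensor:
        eye = torch.eye(self.A.shape[0], device=eps.device)
        # Row-vector form of the solve above for a batch of eps vectors.
        return eps @ torch.linalg.inv(eye - self.A)

scm = LatentSCM(dim=4)
eps = torch.randn(8, 4)   # e.g., reparameterized encoder output, batch of 8
z = scm(eps)              # causally structured latents fed to the decoder
```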
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Particle physics DL-simulation with control over generated data properties
Rogoziński, Karol, Dubiński, Jan, Rokita, Przemysław, Deja, Kamil
The development of collision simulations at the Large Hadron Collider at CERN has sparked research into innovative methods that reduce the cost and time of simulation, going beyond conventional approaches based on Monte Carlo methods. Deep learning generative methods, including VAEs, GANs, and diffusion models, have been used for this purpose. Although they are much faster and simpler than standard approaches, they do not always maintain high fidelity of the simulated data. This work aims to mitigate this issue by providing an alternative to currently employed algorithms, introducing a mechanism of control over the properties of the generated data. To achieve this, we extend the recently introduced CorrVAE, which enables user-defined parameter manipulation of the generated output, and adapt the model to the problem of particle physics simulation. The proposed solution achieves promising results, demonstrating control over the parameters of the generated output and constituting an alternative for simulating the ZDC calorimeter in the ALICE experiment at CERN.
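A schematic sketch of property-controlled generation in this setting: latent coordinates tied to the user-specified properties are pinned, the remaining latents are sampled, and the decoder produces the simulated response. The decoder and property_to_latent modules are hypothetical stand-ins, not the actual CorrVAE/ZDC pipeline.

```python
# Schematic sketch of property-controlled sampling; `decoder` and
# `property_to_latent` are hypothetical stand-in modules.
import torch

def generate(decoder: torch.nn.Module,
             property_to_latent: torch.nn.Module,
             target_props: torch.Tensor,
             free_dim: int) -> torch.Tensor:
    # Latent coordinates encoding the user-requested properties.
    z_prop = property_to_latent(target_props)
    # Remaining (nuisance) latents are sampled from the prior.
    z_free = torch.randn(target_props.shape[0], free_dim)
    # Decode the combined latent into a simulated detector response.
    return decoder(torch.cat([z_prop, z_free], dim=-1))
```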
- Research Report > New Finding (0.49)
- Research Report > Promising Solution (0.34)