rdf
MXtalTools: A Toolkit for Machine Learning on Molecular Crystals
Kilgour, Michael, Tuckerman, Mark E., Rogal, Jutta
We present MXtalTools, a flexible Python package for the data-driven modelling of molecular crystals, facilitating machine learning studies of the molecular solid state. MXtalTools comprises several classes of utilities: (1) synthesis, collation, and curation of molecule and crystal datasets, (2) integrated workflows for model training and inference, (3) crystal parameterization and representation, (4) crystal structure sampling and optimization, and (5) end-to-end differentiable crystal sampling, construction, and analysis. Our modular functions can be integrated into existing workflows or combined to build novel modelling pipelines. MXtalTools leverages CUDA acceleration to enable high-throughput crystal modelling. The Python code is available open-source on our GitHub page, with detailed documentation on ReadTheDocs.
- North America > United States (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- Workflow (0.58)
- Research Report (0.50)
KGpipe: Generation and Evaluation of Pipelines for Data Integration into Knowledge Graphs
Building high-quality knowledge graphs (KGs) from diverse sources requires combining methods for information extraction, data transformation, ontology mapping, entity matching, and data fusion. Numerous methods and tools exist for each of these tasks, but support for combining them into reproducible and effective end-to-end pipelines is still lacking. We present a new framework, KGpipe, for defining and executing integration pipelines that can combine existing tools or LLM (Large Language Model) functionality. To evaluate different pipelines and the resulting KGs, we propose a benchmark for integrating heterogeneous data of different formats (RDF, JSON, text) into a seed KG. We demonstrate the flexibility of KGpipe by running and comparatively evaluating several pipelines that integrate sources of the same or different formats, using selected performance and quality metrics.
- Europe > Germany > Saxony > Leipzig (1.00)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Media > Film (0.93)
- Leisure & Entertainment (0.93)
An Epidemiological Knowledge Graph extracted from the World Health Organization's Disease Outbreak News
Consoli, Sergio, Coletti, Pietro, Markov, Peter V., Orfei, Lia, Biazzo, Indaco, Schuh, Lea, Stefanovitch, Nicolas, Bertolini, Lorenzo, Ceresa, Mario, Stilianakis, Nikolaos I.
The rapid evolution of artificial intelligence (AI), together with the increased availability of social media and news for epidemiological surveillance, is marking a pivotal moment in epidemiology and public health research. Leveraging the power of generative AI, we use an ensemble approach that incorporates multiple Large Language Models (LLMs) to extract valuable, actionable epidemiological information from the World Health Organization (WHO) Disease Outbreak News (DONs). The DONs are a collection of regular reports, curated by the WHO, on global outbreaks and on the decision-making processes adopted to respond to them. The extracted information is made available in a daily-updated dataset and a knowledge graph, referred to as eKG, derived to provide a nuanced representation of public health domain knowledge. We provide an overview of this new dataset and describe the structure of eKG, along with the services and tools for accessing and using the data that we are building on top of it. These data resources open new opportunities for epidemiological research and for the analysis and surveillance of disease outbreaks.
- North America > Trinidad and Tobago (0.14)
- Europe > Italy (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
Exploring a Large Language Model for Transforming Taxonomic Data into OWL: Lessons Learned and Implications for Ontology Development
Soares, Filipi Miranda, Saraiva, Antonio Mauro, Pires, Luís Ferreira, Santos, Luiz Olavo Bonino da Silva, Moreira, Dilvan de Abreu, Corrêa, Fernando Elias, Braghetto, Kelly Rosa, Drucker, Debora Pignatari, Delbem, Alexandre Cláudio Botazzo
Managing scientific names in ontologies that represent species taxonomies is challenging due to the ever-evolving nature of these taxonomies. Manually maintaining these names becomes increasingly difficult when dealing with thousands of scientific names. To address this issue, this paper investigates the use of ChatGPT-4 to automate the development of the :Organism module in the Agricultural Product Types Ontology (APTO) for species classification. Our methodology involved leveraging ChatGPT-4 to extract data from the GBIF Backbone API and generate OWL files for further integration in APTO. Two alternative approaches were explored: (1) issuing a series of prompts for ChatGPT-4 to execute tasks via the BrowserOP plugin and (2) directing ChatGPT-4 to design a Python algorithm to perform analogous tasks. Both approaches rely on a prompting method in which we provide instructions, context, input data, and an output indicator. The first approach showed scalability limitations, while the second approach used the Python algorithm to overcome these challenges, but it struggled with typographical errors in data handling. This study highlights the potential of large language models like ChatGPT-4 to streamline the management of species names in ontologies. Despite certain limitations, these tools offer promising advancements in automating taxonomy-related tasks and improving the efficiency of ontology development.
- South America > Brazil > São Paulo (0.05)
- North America > United States (0.04)
- Europe > Netherlands > South Holland > Leiden (0.04)
- Health & Medicine (1.00)
- Information Technology (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Managing FAIR Knowledge Graphs as Polyglot Data End Points: A Benchmark based on the rdf2pg Framework and Plant Biology Data
Brandizi, Marco, Bobed, Carlos, Garulli, Luca, de Klerk, Arné, Hassani-Pak, Keywan
Linked data and labelled property graphs (LPG) are two data management approaches with complementary strengths and weaknesses, making their integration beneficial for sharing datasets and supporting software ecosystems. In this paper, we introduce rdf2pg, an extensible framework for mapping RDF data to semantically equivalent LPG formats and databases. Utilising this framework, we perform a comparative analysis of three popular graph databases (Virtuoso, Neo4j, and ArcadeDB) and the well-known graph query languages SPARQL, Cypher, and Gremlin. Our qualitative and quantitative assessments underline the strengths and limitations of these graph database technologies. Additionally, we highlight the potential of rdf2pg as a versatile tool for enabling polyglot access to knowledge graphs, aligning with established standards of linked data and the semantic web.
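To make the idea of mapping RDF to a labelled property graph concrete, here is a minimal Python sketch. It is not rdf2pg's actual mapping, which the abstract does not describe; it assumes a naive convention in which triples with literal objects become node properties and all other triples become edges. The `Literal` wrapper and `triples_to_lpg` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical marker for an RDF literal value (not an rdf2pg type)."""
    value: object


def triples_to_lpg(triples):
    """Naively turn (subject, predicate, object) triples into an LPG.

    Assumption: literal objects become properties on the subject node;
    any other object is treated as a node and the triple becomes an edge.
    """
    nodes = {}   # node id -> dict of properties
    edges = []   # (source id, edge label, target id)
    for s, p, o in triples:
        nodes.setdefault(s, {})
        if isinstance(o, Literal):
            nodes[s][p] = o.value
        else:
            nodes.setdefault(o, {})
            edges.append((s, p, o))
    return nodes, edges


# Toy data with made-up identifiers.
nodes, edges = triples_to_lpg([
    ("ex:gene1", "ex:name", Literal("ABC1")),
    ("ex:gene1", "ex:encodes", "ex:protein1"),
])
```

A real mapping also has to deal with node labels, multi-valued properties, and reified relations, which this sketch deliberately ignores.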
- Asia > Singapore (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs
Markowitz, Elan, Galiya, Krupa, Steeg, Greg Ver, Galstyan, Aram
Knowledge graphs have emerged as a popular method for injecting up-to-date, factual knowledge into large language models (LLMs). This is typically achieved by converting the knowledge graph into text that the LLM can process in context. While multiple methods of encoding knowledge graphs have been proposed, the impact of this textualization process on LLM performance remains under-explored. We introduce KG-LLM-Bench, a comprehensive and extensible benchmark spanning five knowledge graph understanding tasks, and evaluate how different encoding strategies affect performance across various base models. Our extensive experiments with seven language models and five textualization strategies provide insights for optimizing LLM performance on KG reasoning tasks.
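As a rough illustration of what "textualizing" a knowledge graph can mean, the Python sketch below renders the same toy triples in two different ways. These two encodings are assumptions chosen for illustration and are not necessarily among the five strategies evaluated in KG-LLM-Bench; the function names and data are hypothetical.

```python
def as_triple_list(triples):
    """Render each triple on its own line as '(subject, predicate, object)'."""
    return "\n".join(f"({s}, {p}, {o})" for s, p, o in triples)


def as_grouped_text(triples):
    """Group facts by subject: 'subject: predicate object; predicate object'."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, []).append(f"{p} {o}")
    return "\n".join(f"{s}: " + "; ".join(facts) for s, facts in grouped.items())


# Toy graph with made-up facts.
toy = [
    ("Alice", "works_at", "Acme"),
    ("Alice", "lives_in", "Paris"),
    ("Acme", "based_in", "Berlin"),
]
prompt_a = as_triple_list(toy)
prompt_b = as_grouped_text(toy)
```

Both strings carry the same facts, but they differ in length and structure, which is the kind of variation whose effect on model performance such a benchmark measures.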
- Asia > India > Andhra Pradesh (0.05)
- North America > Guatemala (0.05)
- Asia > South Korea (0.04)
An Explainable AI Model for Binary LJ Fluids
Hashmi, Israrul H, Karmakar, Rahul, Maniteja, Marripelli, Ayush, Kumar, Patra, Tarak K.
Lennard-Jones (LJ) fluids serve as an important theoretical framework for understanding molecular interactions. Binary LJ fluids, where two distinct species of particles interact based on the LJ potential, exhibit rich phase behavior and provide valuable insights into complex fluid mixtures. Here we report the construction and utility of an artificial intelligence (AI) model for binary LJ fluids, focusing on its effectiveness in predicting radial distribution functions (RDFs) across a range of conditions. The RDFs of a binary mixture with varying compositions and temperatures are collected from molecular dynamics (MD) simulations to establish and validate the AI model. In this AI pipeline, RDFs are discretized in order to reduce the output dimension of the model. This, in turn, improves the efficacy and reduces the complexity of the AI RDF model. The model is shown to predict RDFs for many unknown mixtures very accurately, especially outside the training temperature range. Our analysis suggests that the particle size ratio has a higher-order impact on the microstructure of a binary mixture. We also highlight the areas where the fidelity of the AI model is low when encountering new regimes with different underlying physics.
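The abstract does not say how the RDFs are discretized. One simple possibility, shown here purely as an assumption, is to average a finely sampled g(r) over a smaller number of contiguous bins so that the model predicts fewer output values. The function name `discretize_rdf` and the sample values are hypothetical.

```python
def discretize_rdf(g, n_bins):
    """Average consecutive samples of g(r) into n_bins coarse values.

    Assumption: bin averaging is one illustrative way to shrink the
    output dimension; it is not taken from the paper.
    """
    if n_bins <= 0 or n_bins > len(g):
        raise ValueError("n_bins must be between 1 and len(g)")
    coarse = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin edges exact and every bin non-empty.
        start = i * len(g) // n_bins
        end = (i + 1) * len(g) // n_bins
        chunk = g[start:end]
        coarse.append(sum(chunk) / len(chunk))
    return coarse


# Six made-up g(r) samples reduced to three model outputs.
coarse = discretize_rdf([0.0, 0.0, 1.0, 3.0, 1.0, 1.0], 3)
```

Coarser bins reduce the number of values to predict but also smooth out sharp features such as the first peak, so the bin count is a trade-off.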
- Asia > India (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Related Knowledge Perturbation Matters: Rethinking Multiple Pieces of Knowledge Editing in Same-Subject
Duan, Zenghao, Duan, Wenbin, Yin, Zhiyi, Shen, Yinghan, Jing, Shaoling, Zhang, Jie, Shen, Huawei, Cheng, Xueqi
Knowledge editing has become a promising approach for efficiently and precisely updating knowledge embedded in large language models (LLMs). In this work, we focus on Same-Subject Editing, which involves modifying multiple attributes of a single entity to ensure comprehensive and consistent updates to entity-centric knowledge. Through preliminary observation, we identify a significant challenge: current state-of-the-art editing methods struggle when tasked with editing multiple related knowledge pieces for the same subject. To address the lack of relevant editing data for identical subjects in traditional benchmarks, we introduce the $\text{S}^2\text{RKE}$ (Same-Subject Related Knowledge Editing) benchmark. Our extensive experiments reveal that only mainstream locate-then-edit methods, such as ROME and MEMIT, exhibit "related knowledge perturbation," where subsequent edits interfere with earlier ones. Further analysis reveals that these methods over-rely on subject information, neglecting other critical factors, resulting in reduced editing effectiveness.
- North America > United States (0.05)
- Asia > China > Beijing > Beijing (0.05)
- South America > Argentina (0.04)
Development of a Web Application for Generating Skolemized RDF Data for Supply Chain Management
For early detection of supply bottlenecks, supply chains must be available in a suitable digital form so that they can be processed. However, the amount of work required for data modeling cannot be expected of people who are not familiar with IT topics. Therefore, a web application was developed in the context of this thesis that is intended to hide the underlying complexity from the user. Specifically, it is a graphical user interface on which templates can be instantiated and linked to each other. Suitable concepts for the definition of these templates were developed and extended in this thesis. Finally, a user study with several participants was conducted to assess the usability of the web application. This revealed a large number of useful suggestions for improvement.
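For readers unfamiliar with skolemization, the Python sketch below shows the general idea: each blank node is replaced by a freshly minted, globally unique IRI, here under the `/.well-known/genid/` path that RDF 1.1 suggests for Skolem IRIs. This is not the thesis's implementation; the triple representation (plain strings, with blank nodes written as `_:label`), the function name, and the `https://example.org` base are assumptions made for illustration.

```python
import uuid


def skolemize(triples, base="https://example.org"):
    """Replace blank nodes ('_:label') with unique Skolem IRIs.

    Assumption: terms are plain strings and anything starting with '_:'
    in subject or object position is a blank node.
    """
    mapping = {}  # blank node label -> Skolem IRI, so repeats stay consistent

    def convert(term):
        if term.startswith("_:"):
            if term not in mapping:
                mapping[term] = f"{base}/.well-known/genid/{uuid.uuid4().hex}"
            return mapping[term]
        return term

    return [(convert(s), p, convert(o)) for s, p, o in triples]


# Toy supply-chain style data with made-up identifiers.
result = skolemize([
    ("_:b0", "ex:suppliedBy", "_:b1"),
    ("_:b1", "ex:locatedIn", "ex:Hamburg"),
])
```

Because the same blank node always maps to the same IRI within one call, links between the original nodes are preserved while every node becomes addressable by name.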
- Europe > Netherlands > Drenthe > Assen (0.24)
- Europe > Germany > Hesse > Darmstadt Region > Wiesbaden (0.04)
Handling irresolvable conflicts in the Semantic Web: an RDF-based conflict-tolerant version of the Deontic Traditional Scheme
Robaldo, Livio, Pozzato, Gianluca
This paper presents a new ontology that implements the well-known Deontic Traditional Scheme in RDFs and SPARQL, fit to handle irresolvable conflicts, i.e., situations in which two or more statements prescribe conflicting obligations, prohibitions, or permissions, with none of them being "stronger" than the other one(s). In our view, this paper marks a significant advancement in standard theoretical research in formal Deontic Logic. Most contemporary approaches in this field are confined to the propositional level, mainly focus on the notion of obligation, and lack implementations. The proposed framework is encoded in RDF, which is not only a first-order language but also the most widely used knowledge representation language, as it forms the foundation of the Semantic Web. Moreover, the proposed computational ontology formalizes all deontic modalities defined in the Deontic Traditional Scheme, without specifically focusing on obligations, and offers constructs to model and reason with various types of irresolvable conflicts, violations, and the interaction between deontic modalities and contextual constraints in a given state of affairs. To the best of our knowledge, no existing approach in the literature addresses all these aspects within a unified integrated framework. All examples presented and discussed in this paper, together with Java code and clear instructions to re-execute them locally, are available at https://github.com/liviorobaldo/conflict-tolerantDeonticTraditionalScheme
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Netherlands > South Holland > Dordrecht (0.04)
- Law (1.00)
- Government (0.68)
- Transportation > Ground > Road (0.46)