Adaptive and Multi-Source Entity Matching for Name Standardization of Astronomical Observation Facilities
Fretel, Liza, Cecconi, Baptiste, Debisschop, Laura
This ongoing work focuses on the development of a methodology for generating a multi-source mapping of astronomical observation facilities. To compare two entities, we compute scores with adaptable criteria and Natural Language Processing (NLP) techniques (Bag-of-Words approaches, sequential approaches, and surface approaches) to map entities extracted from eight semantic artifacts, including Wikidata and astronomy-oriented resources. We utilize every property available, such as labels, definitions, descriptions, external identifiers, and more domain-specific properties, such as the observation wavebands, spacecraft launch dates, funding agencies, etc. Finally, we use a Large Language Model (LLM) to accept or reject a mapping suggestion and provide a justification, ensuring the plausibility and FAIRness of the validated synonym pairs. The resulting mapping is composed of multi-source synonym sets providing only one standardized label per entity. Those mappings will be used to feed our Name Resolver API and will be integrated into the International Virtual Observatory Alliance (IVOA) Vocabularies and the OntoPortal-Astro platform.
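The score computation described in the abstract can be sketched as follows: a Bag-of-Words criterion (token overlap) and a surface criterion (character-level similarity) combined with adaptable weights. This is a minimal illustration using only the standard library; the weights, labels, and combination scheme are assumptions, not the authors' implementation.

```python
from difflib import SequenceMatcher

def bow_score(a: str, b: str) -> float:
    """Bag-of-Words similarity: Jaccard overlap of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def surface_score(a: str, b: str) -> float:
    """Surface similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def combined_score(a: str, b: str, w_bow: float = 0.5, w_surf: float = 0.5) -> float:
    """Adaptable criterion: weighted mix of the two scores."""
    return w_bow * bow_score(a, b) + w_surf * surface_score(a, b)

print(combined_score("Hubble Space Telescope", "Hubble Space Telescope (HST)"))
print(combined_score("Hubble Space Telescope", "James Webb Space Telescope"))
```

In the paper's pipeline such scores would be computed over all available properties (labels, identifiers, wavebands, launch dates), not just labels, before a candidate pair is passed to the LLM for validation.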
A Diagnosis and Treatment of Liver Diseases: Integrating Batch Processing, Rule-Based Event Detection and Explainable Artificial Intelligence
Chandra, Ritesh, Tiwari, Sadhana, Rastogi, Satyam, Agarwal, Sonali
Liver diseases pose a significant global health burden, affecting many individuals and carrying substantial economic and social consequences. Liver disease is among the leading causes of death in several countries, such as Egypt and Moldova. This study aims to develop a diagnosis and treatment model for liver disease using Basic Formal Ontology (BFO), a Patient Clinical Data (PCD) ontology, and detection rules derived from a decision tree algorithm. The ontology was developed from the National Viral Hepatitis Control Program (NVHCP) guidelines, which made it more accurate and reliable. The Apache Jena framework uses batch processing to detect events based on these rules; once an event is detected, queries can be processed directly with SPARQL. We convert the Decision Tree (DT) and medical-guideline rules into the Semantic Web Rule Language (SWRL) to operationalize the ontology. These SWRL rules, applied with the Pellet and Drools inference engines in the Protégé tool, predict different types of liver disease over a total of 615 records collected for various liver diseases. After the rules are inferred, results are generated for each patient, and other patient-related details, along with precautionary suggestions, can be derived from these results. Explainable Artificial Intelligence (XAI) with open API-based suggestions makes these recommendations more accurate and transparent. When a patient is prescribed a medical test, the model ingests the result using optical character recognition (OCR), and the same process applies when a further test is prescribed according to the report. Together, these components form a comprehensive Decision Support System (DSS) for the diagnosis of liver disease.
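The rule-based event detection described above can be sketched as a table of threshold conditions, each derived from one decision-tree path, paired with a diagnosis label. The attribute names, thresholds, and labels below are purely illustrative, not the NVHCP-derived rules; a real deployment would express these as SWRL and run them through Pellet/Drools.

```python
# Each rule pairs attribute thresholds (as a decision-tree path would
# yield) with a diagnosis label. Values are invented for illustration.
RULES = [
    ({"bilirubin_min": 1.2, "alt_min": 55.0}, "SuspectedHepatitis"),
    ({"albumin_max": 3.4}, "SuspectedCirrhosis"),
]

def matches(record: dict, conds: dict) -> bool:
    for key, threshold in conds.items():
        attr, bound = key.rsplit("_", 1)
        value = record.get(attr)
        if value is None:
            return False  # closed-world per rule: missing data fails it
        if bound == "min" and not value > threshold:
            return False
        if bound == "max" and not value < threshold:
            return False
    return True

def detect_events(record: dict) -> list:
    """Return every diagnosis label whose rule fires for this record."""
    return [label for conds, label in RULES if matches(record, conds)]

print(detect_events({"bilirubin": 2.0, "alt": 80.0, "albumin": 4.1}))
```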
Open Digital Rights Enforcement Framework (ODRE): from descriptive to enforceable policies
Cimmino, Andrea, Cano-Benito, Juan, García-Castro, Raúl
From centralised platforms to decentralised ecosystems, like Data Spaces, sharing data has become a paramount challenge. For this reason, the definition of data usage policies has become crucial in these domains, highlighting the necessity of effective policy enforcement mechanisms. The Open Digital Rights Language (ODRL) is a W3C standard ontology designed to describe data usage policies; however, it lacks built-in enforcement capabilities, limiting its practical application. This paper introduces the Open Digital Rights Enforcement (ODRE) framework, whose goal is to provide ODRL with enforcement capabilities. The ODRE framework proposes a novel approach to express ODRL policies that integrates the descriptive ontology terms of ODRL with other languages that allow behaviour specification, such as dynamic data handling or function evaluation. The framework includes an enforcement algorithm for ODRL policies and two open-source implementations in Python and Java. The ODRE framework is also designed to support future extensions of ODRL to specific domain scenarios. In addition, current limitations of ODRE and ODRL, as well as open challenges, are reported. Finally, to demonstrate the enforcement capabilities of the implementations, their performance, and their extensibility features, several experiments have been carried out with positive results.
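The gap between describing and enforcing a policy can be illustrated with a toy evaluator. The field names below mirror ODRL terms (action, constraint, leftOperand, operator, rightOperand), but this sketch is an assumption of mine, not the ODRE algorithm or its Python implementation.

```python
from datetime import datetime, timezone

# Illustrative ODRL-like policy: a permission to "read" that holds only
# while a dateTime constraint is satisfied.
policy = {
    "permission": [{
        "action": "read",
        "constraint": [{
            "leftOperand": "dateTime",
            "operator": "lt",
            "rightOperand": "2030-01-01T00:00:00+00:00",
        }],
    }],
}

def constraint_holds(c, now):
    if c["leftOperand"] == "dateTime" and c["operator"] == "lt":
        return now < datetime.fromisoformat(c["rightOperand"])
    raise ValueError("unsupported constraint in this sketch")

def is_permitted(policy, action, now=None):
    """Enforce: grant the action only if some permission covers it and
    all of that permission's constraints currently hold."""
    now = now or datetime.now(timezone.utc)
    for perm in policy.get("permission", []):
        if perm["action"] == action and all(
            constraint_holds(c, now) for c in perm.get("constraint", [])
        ):
            return True
    return False

print(is_permitted(policy, "read"))
```

The point of the example is the shift in reading: the same JSON-LD-like structure that merely *describes* a permission is handed to an evaluator that *decides* it, which is the descriptive-to-enforceable move the paper makes.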
Linked Papers With Code: The Latest in Machine Learning as an RDF Knowledge Graph
Färber, Michael, Lamprecht, David
In this paper, we introduce Linked Papers With Code (LPWC), an RDF knowledge graph that provides comprehensive, current information about almost 400,000 machine learning publications. This includes the tasks addressed, the datasets utilized, the methods implemented, and the evaluations conducted, along with their results. Compared to its non-RDF-based counterpart Papers With Code, LPWC not only translates the latest advancements in machine learning into RDF format, but also enables novel ways for scientific impact quantification and scholarly key content recommendation. LPWC is openly accessible at https://linkedpaperswithcode.com and is licensed under CC-BY-SA 4.0. As a knowledge graph in the Linked Open Data cloud, we offer LPWC in multiple formats, from RDF dump files to a SPARQL endpoint for direct web queries, as well as a data source with resolvable URIs and links to the data sources SemOpenAlex, Wikidata, and DBLP. Additionally, we supply knowledge graph embeddings, enabling LPWC to be readily applied in machine learning applications.
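Since LPWC exposes a SPARQL endpoint for direct web queries, a client only needs to build a GET request against it. The endpoint URL is the one stated in the abstract; the query itself uses only generic `rdfs:label`, as the LPWC-specific vocabulary is not given here.

```python
from urllib.parse import urlencode

# Endpoint from the paper; query uses only a generic rdfs:label pattern.
ENDPOINT = "https://linkedpaperswithcode.com/sparql"

def build_request_url(query: str) -> str:
    """Encode a SPARQL query as a GET request URL for the endpoint."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?paper ?title WHERE {
  ?paper rdfs:label ?title .
} LIMIT 10
"""

url = build_request_url(query)
print(url)
```

Fetching `url` with any HTTP client would return a SPARQL JSON result set; the RDF dumps and resolvable URIs mentioned in the abstract are alternative access paths for bulk and per-entity use.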
SemOpenAlex: The Scientific Landscape in 26 Billion RDF Triples
Färber, Michael, Lamprecht, David, Krause, Johan, Aung, Linn, Haase, Peter
We present SemOpenAlex, an extensive RDF knowledge graph that contains over 26 billion triples about scientific publications and their associated entities, such as authors, institutions, journals, and concepts. SemOpenAlex is licensed under CC0, providing free and open access to the data. We offer the data through multiple channels, including RDF dump files, a SPARQL endpoint, and as a data source in the Linked Open Data cloud, complete with resolvable URIs and links to other data sources. Moreover, we provide embeddings for knowledge graph entities using high-performance computing. SemOpenAlex enables a broad range of use-case scenarios, such as exploratory semantic search via our website, large-scale scientific impact quantification, and other forms of scholarly big data analytics within and across scientific disciplines. Additionally, it enables academic recommender systems, such as recommending collaborators, publications, and venues, including explainability capabilities. Finally, SemOpenAlex can serve for RDF query optimization benchmarks, creating scholarly knowledge-guided language models, and as a hub for semantic scientific publishing.
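The entity embeddings mentioned above are typically consumed via vector similarity, e.g. to suggest collaborators or venues. A minimal sketch with made-up three-dimensional vectors; in practice the vectors would be loaded from the SemOpenAlex embedding files and have hundreds of dimensions.

```python
import math

# Invented toy embeddings keyed by entity; real ones come from the
# SemOpenAlex distribution and are not three-dimensional.
embeddings = {
    "author:A": [0.1, 0.8, 0.3],
    "author:B": [0.2, 0.7, 0.4],
    "venue:V": [0.9, 0.1, 0.0],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def most_similar(entity, k=1):
    """Rank all other entities by cosine similarity to `entity`."""
    scores = [
        (other, cosine(embeddings[entity], vec))
        for other, vec in embeddings.items() if other != entity
    ]
    return sorted(scores, key=lambda s: -s[1])[:k]

print(most_similar("author:A"))
```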
The Pros and Cons of RDF-Star and SPARQL-Star
For regular readers of the (lately somewhat irregularly published) The Cagle Report, I've finally managed to get my feet underneath me at Data Science Central, and am gearing up with a number of new initiatives, including a video interview program that I'm getting underway as soon as I can get the last of the physical infrastructure (primarily some lighting and a decent green screen) in place. I recently purchased a new laptop, one with enough speed and space to let me take on any number of projects that my nearly four-year-old workhorse was just not equipped to handle. One of those projects was to start going through the dominant triple stores and exploring them in greater depth as part of a general evaluation I hope to complete later in the year. The latest Ontotext GraphDB (9.7.0) had been on my list for a while, and I was generally surprised and pleased by what I found there, especially as I'd worked with older versions of GraphDB and found them useful but not quite there. These four items have become what I consider essential technologies for a W3C-stack triple store to fully implement.
Schemaless Queries over Document Tables with Dependencies
Canim, Mustafa, Cornelio, Cristina, Iyengar, Arun, Musa, Ryan, Muro, Mariano Rodriguez
Unstructured enterprise data such as reports, manuals and guidelines often contain tables. The traditional way of integrating data from these tables is through a two-step process of table detection/extraction and mapping the table layouts to an appropriate schema. This can be an expensive process. In this paper we show that by using semantic technologies (RDF/SPARQL and database dependencies) paired with a simple but powerful way to transform tables with non-relational layouts, it is possible to offer query answering services over these tables with minimal manual work or domain-specific mappings. Our method enables users to exploit data in tables embedded in documents with little effort, not only for simple retrieval queries, but also for structured queries that require joining multiple interrelated tables.
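The core idea, querying tables without first mapping their layouts to a relational schema, can be sketched by encoding each cell as an RDF-style (row, column, value) triple and answering queries as basic graph patterns over them. The table contents and column names below are invented for illustration; the paper additionally handles non-relational layouts via transformation and dependencies.

```python
# Each cell becomes a (row-subject, column-predicate, value) triple, so
# tables of unknown layout can be queried without a fixed schema.
def table_to_triples(table_id, header, rows):
    triples = []
    for i, row in enumerate(rows):
        subject = f"{table_id}/row{i}"
        for col, value in zip(header, row):
            triples.append((subject, col, value))
    return triples

triples = table_to_triples(
    "doc1/table1",
    ["part", "voltage"],
    [["P-100", "5V"], ["P-200", "12V"]],
)

def query(triples, predicate, obj):
    """Find subjects with a given predicate/object, like a SPARQL BGP."""
    return [s for s, p, o in triples if p == predicate and o == obj]

print(query(triples, "voltage", "12V"))
```

Joining multiple interrelated tables, as the abstract mentions, amounts to matching shared values across the triple sets of each table.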
A Semantic Infrastructure for Personalisable Context-Aware Environments
Scerri, Simon (Fraunhofer IAIS and University of Bonn) | Debattista, Jeremy (University of Bonn) | Attard, Judie (University of Bonn) | Rivera, Ismael (Altocloud)
Although a number of initiatives provide personalized context-aware guidance for niche use-cases, a standard framework for context awareness remains lacking. This article explains how semantic technology has been exploited to generate a centralized repository of personal activity context. This data drives advanced features such as personal situation recognition and customizable rules for the context-sensitive management of personal devices and data sharing. As a proof of concept, we demonstrate how an innovative context-aware system has successfully adopted such an infrastructure.
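The "customizable rules" idea can be sketched as condition-action pairs evaluated against the recognized situation. The situation names and actions below are invented, not the article's vocabulary, and the real system matches against a semantic context repository rather than a flat dictionary.

```python
# Toy context-sensitive rules: when the recognized situation matches a
# rule's condition, its device-management actions are applied.
rules = [
    {"when": {"situation": "InMeeting"}, "then": {"phone": "silent"}},
    {"when": {"situation": "Driving"}, "then": {"phone": "do-not-disturb"}},
]

def apply_rules(context: dict) -> dict:
    """Collect the actions of every rule whose conditions all match."""
    actions = {}
    for rule in rules:
        if all(context.get(k) == v for k, v in rule["when"].items()):
            actions.update(rule["then"])
    return actions

print(apply_rules({"situation": "InMeeting", "location": "office"}))
```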
Using Description Logics for RDF Constraint Checking and Closed-World Recognition
Patel-Schneider, Peter F. (Nuance Communications)
RDF and Description Logics work in an open-world setting where absence of information is not information about absence. Nevertheless, Description Logic axioms can be interpreted in a closed-world setting, and in this setting they can be used for both constraint checking and closed-world recognition against information sources. When the information sources are expressed in well-behaved RDF or RDFS (i.e., RDF graphs interpreted in the RDF or RDFS semantics), this constraint checking and closed-world recognition is simple to describe. Further, this constraint checking can be implemented as SPARQL querying and thus effectively performed.
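The closed-world reading can be made concrete with a minimal example: the axiom "every Person has a name" is treated as a constraint, and violations are found by a query with negation-as-failure, just as the paper casts such checks as SPARQL queries (in SPARQL this would be a `FILTER NOT EXISTS` pattern). The data below is invented.

```python
# Toy triple store; in the open world, bob lacking a name proves
# nothing, but under the closed-world reading it is a violation.
triples = {
    ("alice", "rdf:type", "Person"),
    ("alice", "name", "Alice"),
    ("bob", "rdf:type", "Person"),  # bob has no name triple
}

def violations(triples):
    """Persons without any name triple: absence counts as absence."""
    persons = {s for s, p, o in triples if p == "rdf:type" and o == "Person"}
    named = {s for s, p, o in triples if p == "name"}
    return sorted(persons - named)

print(violations(triples))
```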
XML Matchers: approaches and challenges
Agreste, Santa, De Meo, Pasquale, Ferrara, Emilio, Ursino, Domenico
Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in recent years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.
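The core step shared by most XML Matchers, scoring candidate correspondences between element names declared in two DTDs/XSDs, can be sketched with a simple string-similarity baseline. The element names below are invented; a real matcher would also exploit the hierarchical structure of the schemas, as the survey emphasizes.

```python
from difflib import SequenceMatcher

# Invented element names standing in for two schemas to be matched.
schema_a = ["customerName", "shipAddress", "orderDate"]
schema_b = ["custName", "shippingAddress", "date"]

def best_matches(names_a, names_b, threshold=0.5):
    """For each element of schema A, keep its best-scoring candidate in
    schema B if the similarity clears the threshold."""
    pairs = []
    for a in names_a:
        scored = [(b, SequenceMatcher(None, a.lower(), b.lower()).ratio())
                  for b in names_b]
        b, score = max(scored, key=lambda s: s[1])
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs

print(best_matches(schema_a, schema_b))
```

Structural matchers refine exactly this kind of linguistic baseline by propagating similarity along parent-child edges of the DTD/XSD tree.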