 Ontologies


The Heritage Digital Twin: a bicycle made for two. The integration of digital methodologies into cultural heritage research

arXiv.org Artificial Intelligence

According to the authors, such integration is like riding a bicycle made for two, also known as a tandem. This kind of vehicle requires close collaboration between the two riders, who must pedal in sync, and the one in front must be able and willing to steer the tandem towards a common destination on which both riders agree. The structure of the bicycle should suit a diversity of users: tall and short; married couples and perfect strangers; sportspeople and lazy ones. It must be usable on any kind of road, dirt trails and well-paved urban streets alike. Cycling metaphors aside, the convergence and integration of two different disciplines places requirements on the methods and attitudes of all participants.


NFRsTDO v1.2's Terms, Properties, and Relationships -- A Top-Domain Non-Functional Requirements Ontology

arXiv.org Artificial Intelligence

This preprint specifies and defines all the Terms, Properties, and Relationships of NFRsTDO (Non-Functional Requirements Top-Domain Ontology). NFRsTDO v1.2, whose UML conceptualization is shown in Figure 1, is a slightly updated version of its predecessor, NFRsTDO v1.1. NFRsTDO is an ontology mainly devoted to quality (non-functional) requirements and quality/cost views, and it is placed at the top-domain level of a multilayer ontological architecture called FCD-OntoArch (Foundational, Core, Domain, and instance Ontological Architecture for sciences). Figure 2 depicts its five tiers: Foundational, Core, Top-Domain, Low-Domain, and Instance. Each level is populated with ontological components or, in other words, ontologies. Ontologies at the same level can be related to each other, except at the foundational level, where only ThingFO (Thing Foundational Ontology) is found. In addition, ontologies' terms and relationships at lower levels can be semantically enriched by ontologies' terms and relationships from the higher levels. NFRsTDO's terms and relationships are mainly extended/reused from ThingFO, SituationCO (Situation Core Ontology), ProcessCO (Process Core Ontology), and FRsTDO (Functional Requirements Top-Domain Ontology). Stereotypes are the mechanism used for enriching NFRsTDO terms. Note that annotations of updates from the previous version (NFRsTDO v1.1) to the current one (v1.2) can be found in Appendix A.


Tab2KG: Semantic Table Interpretation with Lightweight Semantic Profiles

arXiv.org Artificial Intelligence

Tabular data plays an essential role in many data analytics and machine learning tasks. Typically, tabular data does not possess any machine-readable semantics. In this context, semantic table interpretation is crucial for making data analytics workflows more robust and explainable. This article proposes Tab2KG, a novel method that targets the interpretation of tables with previously unseen data and automatically infers their semantics to transform them into semantic data graphs. We introduce original lightweight semantic profiles that enrich a domain ontology's concepts and relations and represent domain and table characteristics. We propose a one-shot learning approach that relies on these profiles to map a tabular dataset containing previously unseen instances to a domain ontology. In contrast to existing semantic table interpretation approaches, Tab2KG relies on the semantic profiles only and does not require any instance lookup. This property makes Tab2KG particularly suitable in the data analytics context, in which data tables typically contain new instances. Our experimental evaluation on several real-world datasets from different application domains demonstrates that Tab2KG outperforms state-of-the-art semantic table interpretation baselines.
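The idea of matching a column to an ontology relation through a profile, rather than by looking up its cell values, can be illustrated with a toy sketch. This is not the Tab2KG model: the three profile features, the nearest-profile matching rule, and the `ex:` relation names are all assumptions made for illustration, whereas the abstract describes a learned one-shot approach.

```python
import math

def is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def column_profile(values):
    """Summarise a column of strings with three hypothetical features:
    share of numeric cells, share of distinct cells, mean cell length."""
    n = len(values)
    return (
        sum(is_number(v) for v in values) / n,
        len(set(values)) / n,
        sum(len(v) for v in values) / n,
    )

def closest_relation(column, domain_profiles):
    """Pick the relation whose stored profile is nearest (Euclidean) to the column's."""
    profile = column_profile(column)
    return min(domain_profiles, key=lambda rel: math.dist(profile, domain_profiles[rel]))

# Hypothetical domain profiles, built once from example data for each relation.
domain_profiles = {
    "ex:temperature": column_profile(["21.5", "19.0", "22.1", "19.0"]),
    "ex:cityName": column_profile(["Berlin", "Paris", "Rome", "Paris"]),
}

# A new column with values that never appeared in the example data.
best = closest_relation(["18.2", "25.0", "20.4"], domain_profiles)
print(best)
```

The new column shares no values with the example data, yet it still lands on the numeric relation because only the profile is compared. In this sketch the features are unscaled, so mean cell length can dominate the distance on real data.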


KG-Hub -- Building and Exchanging Biological Knowledge Graphs

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of knowledge graphs is lacking. Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Features include a simple, modular extract-transform-load (ETL) pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate knowledge graphs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph machine learning, including node embeddings and training of models for link prediction and node classification.


Semantic rule Web-based Diagnosis and Treatment of Vector-Borne Diseases using SWRL rules

arXiv.org Artificial Intelligence

Vector-borne diseases (VBDs) are infections caused by parasites, bacteria, and viruses that are transmitted through the bites of infected vectors such as ticks, mosquitoes, triatomine bugs, blackflies, and sandflies. If these diseases are not properly treated within a reasonable time frame, the mortality rate may rise. In this work, we propose a set of ontologies that will help in the diagnosis and treatment of vector-borne diseases. To develop the VBD ontology, electronic health records taken from the Indian Health Records website, text data generated from Indian government medical mobile applications, and doctors' handwritten prescription notes for patients are used as input. After pre-processing, this data is converted into corrected text using Optical Character Recognition (OCR) and a spelling checker. Natural Language Processing (NLP) is applied to extract entities from the text data and produce Resource Description Framework (RDF) medical data with the help of the Patient Clinical Data (PCD) ontology. Afterwards, Basic Formal Ontology (BFO), National Vector Borne Disease Control Program (NVBDCP) guidelines, and the RDF medical data are used to develop ontologies for VBDs, and Semantic Web Rule Language (SWRL) rules are applied for diagnosis and treatment. The developed ontology helps in the construction of decision support systems (DSS) for the NVBDCP to control these diseases.


Forecasting COVID-19 cases using Statistical Models and Ontology-based Semantic Modelling: A real time data analytics approach

arXiv.org Artificial Intelligence

COVID-19 is the most prominent issue that many countries face today. The frequent changes in the numbers of infections, recoveries, and deaths reflect the dynamic nature of this pandemic. Predicting the spreading rate of the virus is crucial for accurate decision making in preventing infection and in tracking and controlling transmission in the community. We develop a prediction model using statistical time series models such as SARIMA and FBProphet to monitor the daily active, recovered, and death cases of COVID-19 accurately. Then, using various details of each individual patient (such as height, weight, and gender), we design a set of rules using the Semantic Web Rule Language and some mathematical models for dealing with COVID-19 infected cases on an individual basis. After combining all the models, we develop a COVID-19 ontology and run various SPARQL queries on it, which accumulate the risk factors and provide appropriate diagnoses, precautions, and preventive suggestions for COVID-19 patients. After comparing the performance of SARIMA and FBProphet, we observe that the SARIMA model performs better in forecasting COVID-19 cases.


Pinaki Laskar on LinkedIn: #aitechnology #dataontology #dataengineering #datascience…

#artificialintelligence

How can Real World Ontology help us in the Data Science World of AI Technology? The World Data Ontology could serve as the Single Source/Point of Truth (SSOT/SPOT), Knowledge and Intelligence, Human and Machine. It could be applied as the world model engine for intelligence, learning, inference, decision-making, complex problem-solving and interaction of man-machine superintelligence, innovated as Trans-AI or Meta-AI. Global Data Ontology (GDO) is the prime Single Source/Point of Truth (SSOT/SPOT). The single source of truth, knowledge and intelligence is the world and its data universe, with its causal entities, forces, relationships, principles, mechanisms, laws and regularities.


Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

arXiv.org Artificial Intelligence

Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. Objective: We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Results: Using OMOP2OBO, we produced mappings for 92,367 conditions, 8,611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. Conclusions: By aligning OMOP vocabularies to OBO ontologies, our algorithm presents new opportunities to advance EHR-based deep phenotyping.


Finite Materialisability of Datalog Programs with Metric Temporal Operators

Journal of Artificial Intelligence Research

DatalogMTL is an extension of Datalog with metric temporal operators that has recently found applications in stream reasoning and temporal ontology-based data access. In contrast to plain Datalog, where materialisation (a.k.a. forward chaining) naturally terminates in finitely many steps, reaching a fixpoint in DatalogMTL may require infinitely many rounds of rule applications. As a result, existing reasoning systems resort to other approaches, such as constructing large Büchi automata, whose implementations turn out to be highly inefficient in practice. In this paper, we propose and study finitely materialisable DatalogMTL programs, for which forward chaining reasoning is guaranteed to terminate. We consider a data-dependent notion of finite materialisability of a program, where termination is guaranteed for a given dataset, as well as a data-independent notion, where termination is guaranteed regardless of the dataset. We show that, for bounded programs (a natural DatalogMTL fragment for which reasoning is as hard as in the full language), checking data-dependent finite materialisability is ExpSpace-complete in combined complexity and PSpace-complete in data complexity; furthermore, we propose a practical materialisation-based decision procedure that works in doubly exponential time. We show that checking data-independent finite materialisability for bounded programs is computationally easier, namely ExpTime-complete; moreover, we propose sufficient conditions for data-independent finite materialisability that can be efficiently checked. We also provide the complexity landscape of fact entailment for different classes of finitely materialisable programs; surprisingly, we could identify a large class of finitely materialisable programs, called MTL-acyclic programs, for which fact entailment has exactly the same data and combined complexity as in plain Datalog, which makes this fragment especially well suited for large-scale applications.
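For readers unfamiliar with materialisation, the following minimal sketch shows naive forward chaining for plain Datalog only; it does not handle metric temporal operators and is not the paper's procedure. The fact and rule encoding (tuples, with variables written as capitalised strings) and the `edge`/`path` example are assumptions chosen for illustration. It terminates because a plain Datalog program over a finite dataset can only derive finitely many facts.

```python
def unify(atom, fact, binding):
    """Extend `binding` so that `atom` matches `fact`, or return None if impossible."""
    pred, args = atom
    fact_pred, fact_args = fact
    if pred != fact_pred or len(args) != len(fact_args):
        return None
    extended = dict(binding)
    for arg, value in zip(args, fact_args):
        if arg[0].isupper():  # convention in this sketch: capitalised = variable
            if extended.setdefault(arg, value) != value:
                return None
        elif arg != value:    # constants must match exactly
            return None
    return extended

def match_body(body, facts, binding):
    """Yield every variable binding that satisfies all atoms in `body`."""
    if not body:
        yield binding
        return
    for fact in facts:
        extended = unify(body[0], fact, binding)
        if extended is not None:
            yield from match_body(body[1:], facts, extended)

def materialise(facts, rules):
    """Apply all rules round by round until no new fact is derived (a fixpoint)."""
    facts = set(facts)
    rounds = 0
    while True:
        rounds += 1
        derived = set()
        for (head_pred, head_args), body in rules:
            for binding in match_body(body, facts, {}):
                derived.add((head_pred, tuple(binding.get(a, a) for a in head_args)))
        if derived <= facts:  # nothing new: fixpoint reached
            return facts, rounds
        facts |= derived

# path(X, Y) :- edge(X, Y).
# path(X, Z) :- path(X, Y), edge(Y, Z).
rules = [
    (("path", ("X", "Y")), [("edge", ("X", "Y"))]),
    (("path", ("X", "Z")), [("path", ("X", "Y")), ("edge", ("Y", "Z"))]),
]
edges = [("edge", ("a", "b")), ("edge", ("b", "c")), ("edge", ("c", "d"))]

closure, rounds = materialise(edges, rules)
print(len(closure), "facts after", rounds, "rounds")
```

In DatalogMTL the same loop may never reach the `derived <= facts` test with a positive answer, since temporal operators can keep producing facts at ever later time points; that is the situation the finitely materialisable fragments rule out.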


Using Knowledge Graphs for Performance Prediction of Modular Optimization Algorithms

arXiv.org Artificial Intelligence

Empirical data plays an important role in evolutionary computation research. To make better use of the available data, ontologies have been proposed in the literature to organize its storage in a structured way. However, the full potential of these formal methods to capture our domain knowledge has yet to be demonstrated. In this work, we evaluate a performance prediction model built on top of an extension of the recently proposed OPTION ontology. More specifically, we first extend the OPTION ontology with the vocabulary needed to represent modular black-box optimization algorithms. Then, we use the extended OPTION ontology to create knowledge graphs with fixed-budget performance data for two modular algorithm frameworks, modCMA and modDE, for the 24 noiseless BBOB benchmark functions. We build the performance prediction model using a knowledge graph embedding-based methodology. Using a number of different evaluation scenarios, we show that a triple classification approach, a fairly standard predictive modeling task in the context of knowledge graphs, can correctly predict whether a given algorithm instance will be able to achieve a certain target precision for a given problem instance. This approach requires feature representation of algorithms and problems. While the latter is already well developed, we hope that our work will motivate the community to collaborate on appropriate algorithm representations.
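Triple classification can be sketched as scoring a (head, relation, tail) triple with an embedding model and thresholding the score. The abstract does not name the embedding model, so TransE-style scoring is swapped in here as an assumption; the entity names, the `reaches_target_on` relation, the hand-set two-dimensional vectors, and the threshold are all hypothetical and not trained on any data.

```python
import math

def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 norm of (head + relation - tail)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def classify(triple, entity_emb, relation_emb, threshold):
    """Label a triple as true when its score reaches the threshold."""
    head, relation, tail = triple
    score = transe_score(entity_emb[head], relation_emb[relation], entity_emb[tail])
    return score >= threshold

# Hand-set toy embeddings (hypothetical; a real model would learn these).
entity_emb = {
    "alg_A": (0.0, 0.0),
    "alg_B": (1.0, 1.0),
    "problem_1": (1.0, 0.0),
}
relation_emb = {"reaches_target_on": (1.0, 0.0)}

threshold = -0.5  # assumed; in practice tuned on held-out triples
pred_a = classify(("alg_A", "reaches_target_on", "problem_1"), entity_emb, relation_emb, threshold)
pred_b = classify(("alg_B", "reaches_target_on", "problem_1"), entity_emb, relation_emb, threshold)
print(pred_a, pred_b)
```

With these vectors, `alg_A` translated by the relation lands exactly on `problem_1`, so the triple is accepted, while `alg_B` lands about 1.41 away and is rejected.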