Semantic Web

Development of Semantic Web-based Imaging Database for Biological Morphome Artificial Intelligence

We introduce the RIKEN Microstructural Imaging Metadatabase, a semantic web-based imaging database in which image metadata are described using the Resource Description Framework (RDF) and the detailed biological properties observed in the images can be represented as Linked Open Data. The metadata are used to develop a large-scale imaging viewer that provides a straightforward graphical user interface for visualising gigabyte-scale microstructural tiled images. We applied the database to accumulate comprehensive microstructural imaging data produced by automated scanning electron microscopy. As a result, we have successfully managed vast numbers of images and their metadata, including interpretations of the morphological phenotypes occurring in the sub-cellular components and biosamples captured in the images. We also discuss advanced uses of morphological imaging data that this database can promote.
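
The core idea of describing image metadata as RDF can be sketched with plain triples. The sketch below is illustrative only: the URIs, properties, and phenotype terms are placeholders, not the actual RIKEN metadatabase vocabulary, and the pattern matcher stands in for a real triple store's query engine.

```python
# Image metadata as RDF-style (subject, predicate, object) triples,
# queried with a simple pattern matcher. All URIs are illustrative.
EX = "http://example.org/"

triples = {
    (EX + "image42", EX + "capturedBy", EX + "automatedSEM"),
    (EX + "image42", EX + "tileWidth", "20000"),
    (EX + "image42", EX + "depicts", EX + "mitochondrion"),
    (EX + "mitochondrion", EX + "phenotype", EX + "swollenCristae"),
}

def match(pattern, store):
    """Return triples matching a pattern; None acts as a wildcard."""
    s, p, o = pattern
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# What does image42 depict?
hits = match((EX + "image42", EX + "depicts", None), triples)
print([o for _, _, o in hits])
```

Because each biological observation is just another triple, the same store can link an image to the morphological interpretation made on it, which is what makes the Linked Open Data representation useful here.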

An Exploratory Study on Utilising the Web of Linked Data for Product Data Mining Artificial Intelligence

The Linked Open Data practice has led to significant growth of structured data on the Web in the last decade. Such structured data describe real-world entities in a machine-readable way and have created an unprecedented opportunity for research in Natural Language Processing. However, there is a lack of studies on how such data can be used, for what kinds of tasks, and to what extent they are useful for those tasks. This work focuses on the e-commerce domain, exploring methods of using such structured data to create language resources for product classification and linking. We process billions of structured data points in the form of RDF n-quads to create multi-million-word product-related corpora, which are then used in three different ways to create language resources: training word-embedding models, continued pre-training of BERT-like language models, and training Machine Translation models that serve as a proxy for generating product-related keywords. Our evaluation on an extensive set of benchmarks shows word embeddings to be the most reliable and consistent method for improving accuracy on both tasks (by up to 6.9 percentage points in macro-average F1 on some datasets). The other two methods, however, are not as useful. Our analysis suggests that this may be due to several factors, including biased domain representation in the structured data and a lack of vocabulary coverage. We share our datasets and discuss how the lessons we learned could inform future research in this direction.
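
The corpus-building step starts from RDF n-quad lines (subject, predicate, object, graph). A minimal sketch of extracting product-name literals from such lines, assuming schema.org-style markup, might look like this; the example lines and the choice of property are illustrative, not the paper's actual pipeline.

```python
import re

# One n-quad per line: <s> <p> (<o> | "literal"[@lang]) <g> .
NQUAD = re.compile(
    r'<(?P<s>[^>]+)>\s+<(?P<p>[^>]+)>\s+'
    r'(?:<(?P<o_iri>[^>]+)>|"(?P<o_lit>[^"]*)"(?:@[\w-]+)?)\s+'
    r'<(?P<g>[^>]+)>\s+\.')

def product_names(lines):
    """Collect literal objects of schema.org/name statements."""
    names = []
    for line in lines:
        m = NQUAD.match(line.strip())
        if m and m.group("p") == "http://schema.org/name" and m.group("o_lit"):
            names.append(m.group("o_lit"))
    return names

sample = [
    '<http://ex.org/p1> <http://schema.org/name> "USB-C Cable 2m"@en <http://shop.example> .',
    '<http://ex.org/p1> <http://schema.org/brand> <http://ex.org/b1> <http://shop.example> .',
]
print(product_names(sample))  # ['USB-C Cable 2m']
```

At web scale the same filtering would run as a streaming job, but the per-line logic is essentially this.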

An Internet of Things Service Roadmap

Communications of the ACM

The Internet of Things (IoT) is taking the world by storm, thanks to the proliferation of sensors and actuators embedded in everyday things, coupled with the wide availability of high-speed Internet and the evolution of 5th-generation (5G) networks. IoT devices increasingly supply information about the physical environment (for example, infrastructure, assets, homes, and cars). The advent of IoT enables not only the connection and integration of devices that monitor physical-world phenomena (for example, temperature, pollution, energy consumption, human activities, and movement), but also data-driven and AI-augmented intelligence. At all levels, synergies from advances in IoT, data analytics, and artificial intelligence (AI) are firmly recognized as strategic priorities for digital transformation. IoT poses two key challenges: communication with things and management of things. The service paradigm is a key mechanism for overcoming these challenges by transforming IoT devices into IoT services, where they are treated as first-class objects through the prism of services. In a nutshell, services are at a higher level of abstraction than data. Service descriptions consist of two parts: a functional part and a non-functional part, such as Quality of Service (QoS) attributes. Services often transform data into actionable knowledge or achieve physical state changes in the operating context. As a result, the service paradigm is the perfect basis for understanding the transformation of data into actionable knowledge, that is, making it useful. Despite the increasing uptake of IoT services, most organizations have not yet mastered the requisite knowledge, skills, or understanding to craft a successful IoT strategy.
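
The two-part service description can be sketched as a small data model: a functional part (what the service does) and a non-functional part (QoS attributes). The field names and the latency-based selection rule below are assumptions for illustration, not a standard IoT service schema.

```python
from dataclasses import dataclass

@dataclass
class QoS:                      # non-functional part
    latency_ms: float
    availability: float         # fraction of time the service is reachable
    accuracy: float

@dataclass
class IoTService:
    name: str
    operation: str              # functional part, e.g. "readTemperature"
    qos: QoS

def best_by_latency(services):
    """Pick the candidate service with the lowest advertised latency."""
    return min(services, key=lambda s: s.qos.latency_ms)

candidates = [
    IoTService("sensorA", "readTemperature", QoS(120.0, 0.99, 0.95)),
    IoTService("sensorB", "readTemperature", QoS(40.0, 0.97, 0.93)),
]
print(best_by_latency(candidates).name)  # sensorB
```

Treating devices as services in this way is what lets a client select among functionally equivalent things on non-functional grounds.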

A Study of the Quality of Wikidata Artificial Intelligence

Wikidata has been increasingly adopted by many communities for a wide variety of applications, which demand high-quality knowledge to deliver successful results. In this paper, we develop a framework to detect and analyze low-quality statements in Wikidata by shedding light on the current practices exercised by the community. We explore three indicators of data quality in Wikidata, based on: 1) community consensus on the currently recorded knowledge, assuming that statements that have been removed and not added back are implicitly agreed to be of low quality; 2) statements that have been deprecated; and 3) constraint violations in the data. We combine these indicators to detect low-quality statements, revealing challenges with duplicate entities, missing triples, violated type rules, and taxonomic distinctions. Our findings complement ongoing efforts by the Wikidata community to improve data quality, aiming to make it easier for users and editors to find and correct mistakes.
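
The three indicators can be combined very simply: a statement is flagged as low quality if any indicator fires. The statement fields and the any-of-three rule below are an illustrative sketch, not the authors' actual detection model.

```python
def is_low_quality(stmt):
    """Flag a statement if any of the three quality indicators fires."""
    indicators = [
        stmt.get("removed_not_readded", False),    # 1) community consensus
        stmt.get("deprecated", False),             # 2) deprecated rank
        stmt.get("constraint_violations", 0) > 0,  # 3) violated constraints
    ]
    return any(indicators)

stmt = {"removed_not_readded": False, "deprecated": False,
        "constraint_violations": 2}
print(is_low_quality(stmt))  # True
```

In practice the indicators overlap only partially, which is why combining them catches more problems (duplicates, missing triples, type violations) than any single one.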

The I-ADOPT Interoperability Framework for FAIRer data descriptions of biodiversity Artificial Intelligence

Biodiversity, the variation within and between species and ecosystems, is essential for human well-being and the equilibrium of the planet. It is critical for the sustainable development of human society and is an important global challenge. Biodiversity research has become increasingly data-intensive and deals with heterogeneous and distributed data made available by global and regional initiatives, such as GBIF, ILTER, LifeWatch, BODC, PANGAEA, and TERN, that apply different data management practices. In particular, a variety of metadata and semantic resources have been produced by these initiatives to describe biodiversity observations, introducing interoperability issues across data management systems. To address these challenges, the InteroperAble Descriptions of Observable Property Terminology WG (I-ADOPT WG) was formed in 2019 by a group of international terminology providers and data-center managers with the aim of building a common approach to describe what is observed, measured, calculated, or derived. Based on an extensive analysis of existing semantic representations of variables, the WG has recently published the I-ADOPT framework ontology to facilitate interoperability between existing semantic resources and to support the provision of machine-readable variable descriptions whose components are mapped to FAIR vocabulary terms. The I-ADOPT framework ontology defines a set of high-level semantic components that can be used to describe a variety of patterns commonly found in scientific observations. This contribution focuses on how the I-ADOPT framework can be applied to represent variables commonly used in the biodiversity domain.
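
The framework's decomposition of a variable into reusable semantic components can be sketched as follows. The component names (Property, ObjectOfInterest, Matrix, Constraint) follow the general I-ADOPT pattern, but the example variable and the vocabulary IRIs are placeholders, not real FAIR vocabulary terms.

```python
# A variable decomposed into I-ADOPT-style components; IRIs are illustrative.
variable = {
    "label": "concentration of chlorophyll-a in sea water",
    "property": "http://vocab.example/concentration",
    "object_of_interest": "http://vocab.example/chlorophyll-a",
    "matrix": "http://vocab.example/sea-water",
    "constraints": ["http://vocab.example/surface-layer"],
}

def describe(v):
    """Render the component breakdown of a variable description."""
    parts = [f"Property={v['property']}",
             f"ObjectOfInterest={v['object_of_interest']}"]
    if v.get("matrix"):
        parts.append(f"Matrix={v['matrix']}")
    parts += [f"Constraint={c}" for c in v.get("constraints", [])]
    return "; ".join(parts)

print(describe(variable))
```

Because each component is mapped to a vocabulary term rather than buried in a free-text label, two data centers can recognise that differently labelled variables describe the same observation.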

Knowledge Graphs and Machine Learning in biased C4I applications Artificial Intelligence

This paper presents our position on the critical issue of bias that has recently appeared in AI applications. Specifically, we discuss the combination of two technologies currently used in AI applications, namely Machine Learning and Knowledge Graphs, and point to their involvement in (de)biased applications of the C4I domain. Although bias is a wider problem that emerges across different application domains, it is more critical in C4I than in others because of its security-related nature. While proposing certain actions to be taken towards debiasing C4I applications, we acknowledge that this topic is still immature within the Knowledge Graph and Semantic Web communities.

Finding Experts in Social Media Data using a Hybrid Approach Artificial Intelligence

Several approaches to the problem of expert finding have emerged in computer science research. In this work, three of these approaches are examined: content analysis, social graph analysis, and the use of Semantic Web technologies. An integrated set of system requirements is then developed that combines all three in one hybrid approach. To show the practicality of this hybrid approach, a usable prototype expert finding system called ExpertQuest is developed in a modern functional programming language (Clojure) to query social media data and Linked Data. This system is evaluated and discussed. Finally, conclusions are presented that describe the benefits and shortcomings of the hybrid approach and the technologies used in this work.
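
The hybrid idea reduces to combining three per-candidate scores into one ranking. The sketch below (in Python rather than the system's Clojure) uses equal weights and invented candidates; it illustrates the combination step, not ExpertQuest's actual scoring model.

```python
def hybrid_score(candidate, weights=(1/3, 1/3, 1/3)):
    """Weighted combination of content, social-graph, and semantic scores."""
    w_content, w_graph, w_semantic = weights
    return (w_content * candidate["content"]
            + w_graph * candidate["graph"]
            + w_semantic * candidate["semantic"])

candidates = [
    {"name": "alice", "content": 0.9, "graph": 0.4, "semantic": 0.7},
    {"name": "bob",   "content": 0.5, "graph": 0.8, "semantic": 0.6},
]
ranked = sorted(candidates, key=hybrid_score, reverse=True)
print([c["name"] for c in ranked])  # ['alice', 'bob']
```

The weights are where the integration happens: tuning them trades off textual evidence of expertise against social endorsement and structured (Linked Data) evidence.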

DeepCube H2020 - DeepCube Project - European H2020 framework program


Welcome to DeepCube – a Horizon 2020 Space project that will unlock the potential of big Copernicus data with Artificial Intelligence and Semantic Web technologies, with the objective to address problems of high environmental and societal impact.

Using a Personal Health Library-Enabled mHealth Recommender System for Self-Management of Diabetes Among Underserved Populations: Use Case for Knowledge Graphs and Linked Data Artificial Intelligence

Personal health libraries (PHLs) provide a single point of secure access to patients' digital health data and enable the integration of knowledge stored in their digital health profiles with other sources of global knowledge. PHLs can help empower caregivers and health care providers to make informed decisions about patients' health by understanding medical events in the context of their lives. This paper reports the implementation of a mobile health digital intervention that incorporates both digital health data stored in patients' PHLs and other sources of contextual knowledge to deliver tailored recommendations for improving self-care behaviors in adults with diabetes. We conducted a thematic assessment of patient functional and nonfunctional requirements that are missing from current electronic health records (EHRs), based on evidence from the literature, and used the results to identify the technologies needed to address those requirements. We describe the technological infrastructures used to construct, manage, and integrate the types of knowledge stored in the PHL. We leverage the Social Linked Data (Solid) platform to design a fully decentralized and privacy-aware platform that supports interoperability and care integration. We provide an initial prototype design of a PHL and draft a use case scenario involving four actors to demonstrate how the proposed prototype can address user requirements, including the construction and management of the PHL and its use in developing a mobile app that queries the knowledge stored and integrated in the PHL, in a private and fully decentralized manner, to provide better recommendations. The proposed PHL helps patients and their caregivers take a central role in making decisions about their health and equips their health care providers with informatics tools that support the collection and interpretation of the gathered knowledge.
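
The recommendation step can be pictured as rules that combine a reading from the patient's PHL with contextual knowledge. The thresholds and messages below are illustrative placeholders for how such a rule might look; they are not clinical guidance and not the paper's actual rules.

```python
def recommend(glucose_mg_dl, minutes_active_today):
    """Toy rule set combining a PHL reading with activity context."""
    tips = []
    if glucose_mg_dl > 180:
        tips.append("Blood glucose is high; consider checking again soon.")
    if minutes_active_today < 30:
        tips.append("Under 30 active minutes today; a short walk may help.")
    return tips or ["Readings look on track today."]

print(recommend(glucose_mg_dl=195, minutes_active_today=10))
```

In the decentralized design, the point is that this logic runs against data the patient controls in their Solid pod, rather than against a copy held by the app provider.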

Intelligent Software Web Agents: A Gap Analysis Artificial Intelligence

Semantic web technologies have shown their effectiveness, especially when it comes to knowledge representation, reasoning, and data integration. However, the original semantic web vision, whereby machine-readable web data could be automatically acted upon by intelligent software web agents, has yet to be realised. In order to better understand the existing technological challenges and opportunities, in this paper we examine the status quo of intelligent software web agents, guided by research on requirements and architectural components coming from the agents community. We start by collating and summarising requirements and core architectural components relating to intelligent software agents. Following on from this, we use the identified requirements both to further elaborate on the semantic web agent motivating use case scenario and to summarise different perspectives on the requirements in the semantic web agent literature. Finally, we propose a hybrid semantic web agent architecture, discuss the role played by existing semantic web standards, and point to existing work in the broader semantic web community and beyond that could help make the semantic web agent vision a reality.
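
At its core, the agent architecture the survey builds towards follows the classic perceive-reason-act cycle. The sketch below shows that cycle only; the percepts, the single matching rule, and the actions are illustrative placeholders, not the paper's proposed architecture.

```python
def perceive(environment):
    """Sense: read whatever new machine-readable data the web exposes."""
    return environment.get("new_data", [])

def reason(percepts, goals):
    """Deliberate: a trivial rule that acts on percepts matching a goal."""
    return [("notify", p) for p in percepts if p in goals]

def act(actions, log):
    """Act: here, just record the actions taken."""
    log.extend(actions)

environment = {"new_data": ["flight-price-drop", "weather-update"]}
goals = {"flight-price-drop"}
log = []
act(reason(perceive(environment), goals), log)
print(log)  # [('notify', 'flight-price-drop')]
```

The semantic web's contribution to this loop is the perceive and reason stages: RDF and ontologies make the percepts machine-interpretable, so the reasoning step can work with meaning rather than raw markup.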