Mosca, Alessandro
INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]
Amer-Yahia, Sihem, Koutrika, Georgia, Bastian, Frederic, Belmpas, Theofilos, Braschler, Martin, Brunner, Ursin, Calvanese, Diego, Fabricius, Maximilian, Gkini, Orest, Kosten, Catherine, Lanti, Davide, Litke, Antonis, Lücke-Tieke, Hendrik, Massucci, Francesco Alessandro, de Farias, Tarcisio Mendes, Mosca, Alessandro, Multari, Francesco, Papadakis, Nikolaos, Papadopoulos, Dimitris, Patil, Yogendra, Personnaz, Aurélien, Rull, Guillem, Sima, Ana, Smith, Ellery, Skoutas, Dimitrios, Subramanian, Srividya, Xiao, Guohui, Stockinger, Kurt
A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE -- an end-to-end data exploration system -- that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus facilitating the user in discovering new insights. We demonstrate that our system is uniquely accessible to a wide range of users from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.
Mapping Patterns for Virtual Knowledge Graphs
Calvanese, Diego, Gal, Avigdor, Lanti, Davide, Montali, Marco, Mosca, Alessandro, Shraga, Roee
Virtual Knowledge Graphs (VKG) constitute one of the most promising paradigms for integrating and accessing legacy data sources. A critical bottleneck in the integration process involves the definition, validation, and maintenance of mappings that link data sources to a domain ontology. To support the management of mappings throughout their entire lifecycle, we propose a comprehensive catalog of sophisticated mapping patterns that emerge when linking databases to ontologies. To do so, we build on well-established methodologies and patterns studied in data management, data analysis, and conceptual modeling. These are extended and refined through the analysis of concrete VKG benchmarks and real-world use cases, and considering the inherent impedance mismatch between data sources and ontologies. We validate our catalog on the considered VKG scenarios, showing that it covers the vast majority of patterns present therein.