

Animer une base de connaissance : des ontologies aux modèles d'I.A. générative

Stockinger, Peter

arXiv.org Artificial Intelligence

Animating a Knowledge Base: From Ontologies to Generative AI Models. From Expert Systems and the Semantic Web to Generative AI: Model-Driven and Data-Driven Approaches in Area Studies. In a context where the social sciences and humanities are experimenting with non-anthropocentric analytical frames, this article proposes a semiotic (structural) reading of the hybridization between symbolic AI and neural (or sub-symbolic) AI based on a field of application: the design and use of a knowledge base for area studies. We describe LaCAS, the Open Archives in Linguistic and Cultural Studies ecosystem (thesaurus; RDF/OWL ontology; LOD services; harvesting; expertise; publication), deployed at Inalco (National Institute for Oriental Languages and Civilizations) in Paris with the Okapi (Open Knowledge and Annotation Interface) software environment from Ina (National Audiovisual Institute). The ecosystem now holds around 160,000 documentary resources and ten knowledge macro-domains grouping together several thousand knowledge objects. We illustrate this approach using the knowledge domain "Languages of the world" (~540 languages) and the knowledge object "Quechua (language)". On this basis, we discuss the controlled integration of neural tools, more specifically generative tools, into the life cycle of a knowledge base: assistance with data localization/qualification, index extraction and aggregation, property suggestion and testing, dynamic file generation, and engineering of contextualized prompts (generic, contextual, explanatory, adjustment, procedural) aligned with a domain ontology. We outline an ecosystem of specialized agents capable of animating the database while respecting its symbolic constraints, by articulating model-driven and data-driven methods.
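
As a rough illustration of what a contextual prompt "aligned with a domain ontology" might look like, the sketch below assembles a prompt from a knowledge object and a list of permitted properties. This is not the LaCAS or Okapi implementation: the dictionary layout, the property names, and the function `build_contextual_prompt` are all hypothetical and chosen only for illustration.

```python
# Hypothetical knowledge object; the property names are illustrative assumptions,
# not the actual LaCAS ontology.
knowledge_object = {
    "label": "Quechua (language)",
    "domain": "Languages of the world",
    "allowed_properties": ["language family", "regions where spoken", "writing system"],
}


def build_contextual_prompt(obj, question):
    """Assemble a prompt that restricts the answer to properties listed for the object."""
    props = "\n".join(f"- {p}" for p in obj["allowed_properties"])
    return (
        f"Knowledge domain: {obj['domain']}\n"
        f"Knowledge object: {obj['label']}\n"
        "Only use the following properties when answering:\n"
        f"{props}\n"
        f"Question: {question}"
    )


prompt = build_contextual_prompt(knowledge_object, "Summarize this knowledge object.")
```

Restricting the prompt to properties declared for the object is one simple way to keep a generative tool within symbolic constraints; a real system would presumably read those properties from the ontology rather than from a hand-written dictionary.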


From Conceptual Data Models to Multimodal Representation

Stockinger, Peter

arXiv.org Artificial Intelligence

1) Introduction and Conceptual Framework: This document explores the concept of information design by dividing it into two major practices: defining the meaning of a corpus of textual data and its visual or multimodal representation. It draws on expertise in enriching textual corpora, particularly audiovisual ones, and transforming them into multiple narrative formats. The text highlights a crucial distinction between the semantic content of a domain and the modalities of its graphic expression, illustrating this approach with concepts rooted in structural semiotics and linguistics traditions. 2) Modeling and Conceptual Design: The article emphasizes the importance of semantic modeling, often achieved through conceptual networks or graphs. These tools enable the structuring of knowledge within a domain by accounting for relationships between concepts, contexts of use, and specific objectives. Stockinger also highlights the constraints and challenges involved in creating dynamic and adaptable models, integrating elements such as thesauri or interoperable ontologies to facilitate the analysis and publication of complex corpora. 3) Applications and Multimodal Visualization: The text concludes by examining the practical application of these models in work environments like OKAPI, developed to analyze, publish, and reuse audiovisual data. It also discusses innovative approaches such as visual storytelling and document reengineering, which involve transforming existing content into new resources tailored to various contexts. These methods emphasize interoperability, flexibility, and the intelligence of communication systems, paving the way for richer and more collaborative use of digital data. The content of this document was presented during the "Semiotics of Information Design" Day organized by Anne Beyaert-Geslin of the University of Bordeaux Montaigne (MICA laboratory) on June 21, 2018, in Bordeaux.


Implementing a hybrid approach in a knowledge engineering process to manage technical advice relating to feedback from the operation of complex sensitive equipment

Berger, Alain Claude Hervé, Boblet, Sébastien, Cartié, Thierry, Cotton, Jean-Pierre, Vexler, François

arXiv.org Artificial Intelligence

How can technical advice on operating experience feedback be managed efficiently in an organization that has never used knowledge engineering techniques and methods? This article explains how an industrial company in the nuclear and defense sectors adopted such an approach, adapted to its "TA KM" organizational context and falling within the ISO 30401 framework, to build a complete system with a "SARBACANES" application that supports its business processes and preserves its know-how and expertise in a knowledge base. Over and above the classic transfer of knowledge between experts and business specialists, SARBACANES also reveals the ability of this type of engineering to deliver multi-functional operation. Modeling was accelerated by the use of a tool adapted to this type of operation: the Ardans Knowledge Maker platform.


L'explicabilité au service de l'extraction de connaissances : application à des données médicales

Cugny, Robin, Doumard, Emmanuel, Escriva, Elodie, Wang, Haomiao

arXiv.org Artificial Intelligence

The use of machine learning has increased dramatically in the last decade. The lack of transparency is now a limiting factor, which the field of explainability aims to address. Furthermore, one of the challenges of data mining is to present the statistical relationships of a dataset when they can be highly non-linear. One of the strengths of supervised learning is its ability to find complex statistical relationships, which explainability makes it possible to represent in an intelligible way. This paper shows that explanations can be used to extract knowledge from data and shows how feature selection, data subgroup analysis, and selection of highly informative instances benefit from explanations. We then present a complete data processing pipeline using these methods on medical data.


Découvrir de nouvelles classes dans des données tabulaires

Troisemaine, Colin, Flocon-Cholet, Joachim, Gosselin, Stéphane, Vaton, Sandrine, Reiffers-Masson, Alexandre, Lemaire, Vincent

arXiv.org Artificial Intelligence

In Novel Class Discovery (NCD), the goal is to find new classes in an unlabeled set given a labeled set of known but different classes. While NCD has recently gained attention from the community, no framework has yet been proposed for heterogeneous tabular data, even though tabular data is a very common data representation. In this paper, we propose TabularNCD, a new method for discovering novel classes in tabular data. We show a way to extract knowledge from already known classes to guide the discovery process of novel classes in the context of tabular data that contains heterogeneous variables. A part of this process is done by a new method for defining pseudo labels, and we follow recent findings in Multi-Task Learning to optimize a joint objective function. Our method demonstrates that NCD is not only applicable to images but also to heterogeneous tabular data.


Acquisition and Representation of User Preferences Guided by an Ontology

Dandan, Rahma, Despres, Sylvie, Sedki, Karima

arXiv.org Artificial Intelligence

Our food preferences guide our food choices and in turn affect our personal health and our social life. In this paper, we adopt an approach using a domain ontology expressed in OWL 2 to support the acquisition and representation of preferences in the CP-net formalism. Specifically, we present the construction of the domain ontology and the questionnaire design used to acquire and represent the preferences. The acquisition and representation of preferences are implemented in the setting of a university canteen. Our main contribution in this preliminary work is to acquire preferences and enrich the preference model with domain knowledge represented in the ontology.
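
For readers unfamiliar with CP-nets (conditional preference networks), the following minimal sketch shows how such a structure can be stored and queried. The variables, values, and preference orders are invented for a canteen-like example and are not taken from the paper; the function name `preferred_outcome` is likewise an assumption. It relies on the standard property that, for an acyclic CP-net, picking each variable's best value given its parents' values yields the most preferred outcome.

```python
# Hypothetical canteen CP-net: the preferred drink depends on the chosen main dish.
# Each conditional preference table (cpt) maps parent values to an ordered list,
# most preferred value first.
cp_net = {
    "main": {"parents": (), "cpt": {(): ["fish", "meat"]}},
    "drink": {
        "parents": ("main",),
        "cpt": {("fish",): ["water", "juice"], ("meat",): ["juice", "water"]},
    },
}


def preferred_outcome(net):
    """Return the most preferred full assignment by assigning parents before children."""
    outcome = {}
    remaining = set(net)
    while remaining:
        # Variables whose parents have all been assigned can be decided now.
        ready = [v for v in remaining if all(p in outcome for p in net[v]["parents"])]
        if not ready:
            raise ValueError("the dependency graph contains a cycle")
        for var in sorted(ready):
            key = tuple(outcome[p] for p in net[var]["parents"])
            outcome[var] = net[var]["cpt"][key][0]
            remaining.discard(var)
    return outcome


best = preferred_outcome(cp_net)
```

In an ontology-guided setting, the variables and their admissible values would presumably come from the domain ontology rather than being hard-coded as here.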


A Novel Approach for Generating SPARQL Queries from RDF Graphs

Jabri, Emna

arXiv.org Artificial Intelligence

This work was carried out as part of a research master's thesis project. The goal is to generate SPARQL queries from user-supplied keywords in order to query RDF graphs. To do this, we first transformed the input ontology into an RDF graph that reflects the semantics represented in the ontology. We then stored this RDF graph in the Neo4j graph database to ensure efficient and persistent management of the RDF data. At query time, we studied the different possible and desired interpretations of the request originally made by the user. We also proposed a transformation between the two query languages SPARQL and Cypher, the latter being specific to Neo4j. This allows us to implement the architecture of our system over a wide variety of RDF databases that provide their own query languages, without changing any of the other components of the system. Finally, we tested and evaluated our tool using different test bases, and it turned out that our tool is comprehensive, effective, and powerful enough.
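
The keyword-to-query step described above can be sketched very roughly as follows. This is not the thesis's method, which studies multiple interpretations of the user's request; it is a deliberately naive illustration that matches each keyword against `rdfs:label` values. The function name `keywords_to_sparql` and the choice of `rdfs:label` as the matching property are assumptions.

```python
def keywords_to_sparql(keywords, limit=20):
    """Build a SPARQL SELECT query returning resources whose label contains any keyword."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    filters = []
    for kw in keywords:
        # Escape backslashes and double quotes so the keyword is a valid string literal.
        safe = kw.replace("\\", "\\\\").replace('"', '\\"')
        filters.append(f'CONTAINS(LCASE(STR(?label)), LCASE("{safe}"))')
    condition = " || ".join(filters)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f"  FILTER({condition})\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


query = keywords_to_sparql(["ontology", "graph"])
```

The resulting string uses only standard SPARQL 1.1 functions (`CONTAINS`, `LCASE`, `STR`), so it could be sent to any SPARQL endpoint; translating it to Cypher, as the work proposes, would be a separate step not shown here.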


A multi-agent ontologies-based clinical decision support system

Shen, Ying, Jacquet-Andrieu, Armelle, Colloc, Joël

arXiv.org Artificial Intelligence

Clinical decision support systems combine knowledge and data from a variety of sources, represented either by quantitative models based on stochastic methods or by qualitative models based on expert heuristics and deductive reasoning. At the same time, case-based reasoning (CBR) memorizes and returns the experience of solving similar problems. The cooperation of heterogeneous clinical knowledge bases (knowledge objects, semantic distances, evaluation functions, logical rules, databases...) is based on medical ontologies. A multi-agent decision support system (MADSS) enables the integration and cooperation of agents specialized in different fields of knowledge (semiology, pharmacology, clinical cases, etc.). Each specialist agent operates a knowledge base that defines the course of action to follow in conformity with the state of the art, associated with an ontological basis that expresses the semantic relationships between the terms of the domain in question. Our approach is based on the specialization of agents adapted to the knowledge models used during the clinical steps and to the ontologies. This modular approach is suitable for building MADSS in many areas.


SMILK, linking natural language and data from the web

Lopez, Cédric, Dhouib, Molka, Cabrio, Elena, Zucker, Catherine Faron, Gandon, Fabien, Segond, Frédérique

arXiv.org Artificial Intelligence

As part of the SMILK Joint Lab, we studied the use of Natural Language Processing to: (1) enrich knowledge bases and link data on the web, and conversely (2) use this linked data to contribute to the improvement of text analysis and the annotation of textual content, and to support knowledge extraction. The evaluation focused on brand-related information retrieval in the field of cosmetics. This article describes each step of our approach: the creation of ProVoc, an ontology to describe products and brands; the automatic population of a knowledge base mainly based on ProVoc from heterogeneous textual resources; and the evaluation of an application that takes the form of a browser plugin providing additional knowledge to users browsing the web.


Etude de Modèles à base de réseaux Bayésiens pour l'aide au diagnostic de tumeurs cérébrales

Lamine, Fradj Ben, Kalti, Karim, Mahjoub, Mohamed Ali

arXiv.org Artificial Intelligence

This article describes different models based on Bayesian networks (BNs) that model expertise in the diagnosis of brain tumors. Bayesian networks are well suited to representing the uncertainty in the process of diagnosing these tumors. In our work, we first tested several Bayesian network structures: on the one hand, structures derived from the reasoning performed by doctors and, on the other, structures generated automatically. This step aims to find the structure that yields the best diagnostic accuracy. The learning algorithms used for automatic generation are MWST-EM, SEM, and SEM+T. To estimate the parameters of the Bayesian network from an incomplete database, we propose an extension of the EM algorithm that adds a priori knowledge in the form of thresholds calculated by the first phase of the RBE algorithm. The very encouraging results obtained are discussed at the end of the paper.