Bertini, Flavio
Data2Concept2Text: An Explainable Multilingual Framework for Data Analysis Narration
Bertini, Flavio, Dal Palù, Alessandro, Zaglio, Federica, Fabiano, Francesco, Formisano, Andrea
This paper presents a complete explainable system that interprets a set of data, abstracts the underlying features, and describes them in a natural language of choice. The system relies on two crucial stages: (i) identifying emerging properties from data and transforming them into abstract concepts, and (ii) converting these concepts into natural language. Despite the impressive natural language generation capabilities demonstrated by Large Language Models, their statistical nature and the intricacy of their internal mechanisms still force us to employ these techniques as black boxes, forgoing trustworthiness. Developing an explainable pipeline for data interpretation would facilitate its use in safety-critical environments, such as processing medical information, and would allow non-experts and visually impaired people to access narrated information. To this end, we believe that the fields of knowledge representation and automated reasoning research could present a valid alternative. Expanding on prior research that tackled the first stage (i), we focus on the second stage, named Concept2Text. Being explainable, data translation is easily modeled through logic-based rules, once again emphasizing the role of declarative programming in achieving AI explainability. This paper explores a Prolog/CLP-based rewriting system to interpret concepts (articulated in terms of classes and relations, plus common knowledge) derived from a generic ontology, generating natural language text. Its main features include hierarchical tree rewritings, modular multilingual generation, support for equivalent variants across semantic, grammar, and lexical levels, and a transparent rule-based system. We outline the architecture and demonstrate its flexibility through some examples capable of generating numerous diverse and equivalent rewritings based on the input concept.
Survey on Abstractive Text Summarization: Dataset, Models, and Metrics
Nnadi, Gospel Ozioma, Bertini, Flavio
Readers and scholars often desire a concise summary (Too Long; Didn't Read, or TL;DR) of texts to effectively prioritize information. However, creating document summaries is mentally taxing and time-consuming, especially considering the overwhelming volume of documents produced annually, as depicted in Figure 1 by [2] and in Figure 2. For instance, [3] reported over 100,000 scientific articles on the coronavirus pandemic in 2020. Although these articles include brief abstracts, the sheer volume poses challenges for researchers and medical professionals who need to quickly extract relevant knowledge on a specific topic. An automatically generated multi-document summary could be valuable, providing readers with essential information and reducing the need to access original files unless refinement is necessary. Text summarization has garnered significant research attention, proving useful in search engines, news clustering, timeline generation, and various other applications. The objective of text summarization is to create a brief, coherent, factually consistent, and readable document that retains the essential information from the source, whether that source is a single document or multiple documents. In Single Document Summarization (SDS), only one input document is used, eliminating the need for additional processing to assess relationships between inputs. This method is suitable for summarizing standalone documents such as emails, legal contracts, and financial reports. The primary goal of Multi Document Summarization (MDS) is to gather information from several texts addressing the same topic, often composed at different times or representing diverse perspectives. The overarching objective is to produce reports that are both succinct and comprehensive, consolidating varied opinions from documents that explore a topic through multiple viewpoints.