Collaborating Authors

 Parimbelli, Enea


Igea: a Decoder-Only Language Model for Biomedical Text Generation in Italian

arXiv.org Artificial Intelligence

The advent of probabilistic language models has revolutionized various domains, with biomedical natural language processing (NLP) standing out due to its significant impact on healthcare provision and medical research. The ability of these models to understand, process, and generate text from vast biomedical corpora has led to improvements in tasks such as entity recognition, relation extraction, and question answering. However, the majority of this progress has been focused on English-language texts, creating a notable disparity for other languages with fewer resources, such as Italian. In the Italian context, the scarcity of large and diverse training datasets presents a substantial challenge. General language models like Minerva and Maestrale have made strides in Italian NLP, but they lack the specialization required to handle the nuances of biomedical terminology effectively. Addressing this gap is crucial, as the precision and clarity needed in medical communications are paramount for clinical and research applications in such a high-stakes domain. In this paper we introduce Igea, a biomedical language model (BLM) built from the ground up for the Italian language, which handles native Italian biomedical text effectively while remaining efficient in terms of computational resources. We built upon the foundation model Minerva, which we continually trained on native Italian biomedical text, employing appropriate provisions to avoid disrupting what was learned during pre-training.
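The abstract does not detail which anti-forgetting provisions were used; one common provision in continual pre-training is rehearsal, i.e. mixing a fraction of general-domain data back into each in-domain batch. The sketch below illustrates that idea only; the corpus names (`bio_*`, `gen_*`) and the replay ratio are hypothetical placeholders, not the actual Igea training setup.

```python
import random

random.seed(0)

# Placeholder corpora: stand-ins for an in-domain biomedical corpus
# and the general-domain pre-training data (names are hypothetical).
biomedical_docs = [f"bio_{i}" for i in range(1000)]
general_docs = [f"gen_{i}" for i in range(1000)]

def mixed_batches(domain, general, batch_size=8, replay_ratio=0.25):
    """Yield training batches in which roughly replay_ratio of the
    examples are rehearsed from the general corpus, limiting drift
    away from what the base model learned during pre-training."""
    n_replay = max(1, int(batch_size * replay_ratio))
    while True:
        batch = random.sample(general, n_replay) + \
                random.sample(domain, batch_size - n_replay)
        random.shuffle(batch)
        yield batch

batch = next(mixed_batches(biomedical_docs, general_docs))
```

With `batch_size=8` and `replay_ratio=0.25`, each batch carries two rehearsed general-domain documents alongside six in-domain ones.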


Evaluation of Predictive Reliability to Foster Trust in Artificial Intelligence. A case study in Multiple Sclerosis

arXiv.org Artificial Intelligence

Applying Artificial Intelligence (AI) and Machine Learning (ML) in critical contexts, such as medicine, requires the implementation of safety measures to reduce the risk of harm in case of prediction errors. Spotting ML failures is of paramount importance when ML predictions are used to drive clinical decisions. ML predictive reliability measures the degree of trust in an ML prediction on a new instance, allowing decision-makers to accept or reject it based on its reliability. To assess reliability, we propose a method that implements two principles. First, our approach evaluates whether an instance to be classified comes from the same distribution as the training set. To do this, we leverage the ability of Autoencoders (AEs) to reconstruct the training set with low error: an instance is considered Out-of-Distribution (OOD) if the AE reconstructs it with high error. Second, it evaluates whether the ML classifier performs well on samples similar to the newly classified instance, using a proxy model. We show that this approach can assess reliability both in a simulated scenario and on a model trained to predict disease progression of Multiple Sclerosis patients. We also developed a Python package, named relAI, to embed reliability measures into ML pipelines. We propose a simple approach that can be used in the deployment phase of any ML model to suggest whether or not to trust its predictions. Our method holds the promise of providing effective support to clinicians by spotting potential ML failures during deployment.
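The first principle (flag an instance as OOD when its reconstruction error exceeds a threshold calibrated on the training set) can be illustrated with a low-rank reconstruction. The sketch below substitutes a PCA projector for the autoencoder to stay self-contained; the data, the rank, and the 95th-percentile threshold are illustrative assumptions, not taken from relAI.

```python
import numpy as np

rng = np.random.default_rng(0)

# In-distribution training data: an isotropic Gaussian cloud in 5-D.
X_train = rng.normal(0.0, 1.0, size=(500, 5))

# Stand-in for the autoencoder: a rank-2 PCA projector whose
# reconstruction error plays the role of the AE loss.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
V = Vt[:2].T  # top-2 principal directions

def reconstruction_error(x):
    """Project onto the learned subspace and measure the residual."""
    x_hat = mean + ((x - mean) @ V) @ V.T
    return float(np.linalg.norm(x - x_hat))

# Calibrate the OOD threshold on the training set itself.
errs = np.array([reconstruction_error(x) for x in X_train])
threshold = np.percentile(errs, 95)

def is_out_of_distribution(x):
    return reconstruction_error(x) > threshold

# A point far from the training cloud reconstructs poorly and is
# therefore flagged, mirroring the AE-based reliability check.
flagged = is_out_of_distribution(np.full(5, 8.0))
```

By construction roughly 95% of the training instances fall below the threshold, while a distant point's residual is an order of magnitude larger.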


Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models

arXiv.org Artificial Intelligence

In the era of digital healthcare, the huge volumes of textual information generated every day in hospitals constitute an essential but underused asset that could be exploited with task-specific, fine-tuned biomedical language representation models, improving patient care and management. For such specialized domains, previous research has shown that models fine-tuned from broad-coverage checkpoints can benefit greatly from additional training rounds over large-scale in-domain resources. However, these resources are often unreachable for less-resourced languages like Italian, preventing local medical institutions from employing in-domain adaptation. To reduce this gap, our work investigates two accessible approaches to derive biomedical language models in languages other than English, taking Italian as a concrete use case: one based on neural machine translation of English resources, favoring quantity over quality; the other based on a high-grade, narrow-scoped corpus natively written in Italian, thus preferring quality over quantity. Our study shows that data quantity is a harder constraint than data quality for biomedical adaptation, but that the concatenation of high-quality data can improve model performance even when dealing with relatively size-limited corpora. The models published from our investigations have the potential to unlock important research opportunities for Italian hospitals and academia. Finally, the set of lessons learned from the study constitutes valuable insights towards building biomedical language models that generalize to other less-resourced languages and different domain settings.


Advancing Italian Biomedical Information Extraction with Large Language Models: Methodological Insights and Multicenter Practical Application

arXiv.org Artificial Intelligence

The introduction of computerized medical records in hospitals has reduced burdensome operations like manual writing and information fetching. However, the data contained in medical records remain largely underutilized, primarily because extracting them from unstructured textual medical records takes time and effort. Information Extraction, a subfield of Natural Language Processing, can help clinical practitioners overcome this limitation through automated text-mining pipelines. In this work, we created the first Italian neuropsychiatric Named Entity Recognition dataset, PsyNIT, and used it to develop a Large Language Model for this task. Moreover, we conducted several experiments with three external independent datasets to implement an effective multicenter model, with an overall F1-score of 84.77%, precision of 83.16%, and recall of 86.44%. The lessons learned are: (i) the crucial role of a consistent annotation process and (ii) a fine-tuning strategy that combines classical methods with a "few-shot" approach. This allowed us to establish methodological guidelines that pave the way for future implementations in this field and allow Italian hospitals to tap into important research opportunities.
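Since the reported scores are entity-level precision, recall, and F1, a minimal sketch of how such metrics are computed from BIO-tagged sequences may help; the tag names below are hypothetical examples, not drawn from PsyNIT.

```python
def extract_entities(tags):
    """Convert a BIO tag sequence into (type, start, end) spans."""
    entities = []
    start = etype = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not inside and start is not None:
            entities.append((etype, start, i))
            start = etype = None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return entities

def entity_prf(gold_tags, pred_tags):
    """Micro precision/recall/F1 over exact-match entity spans."""
    gold = set(extract_entities(gold_tags))
    pred = set(extract_entities(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

For example, a prediction that recovers one of two gold entities exactly scores precision 1.0, recall 0.5, and F1 2/3, since credit is only given for exact span and type matches.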


Tree-based local explanations of machine learning model predictions, AraucanaXAI

arXiv.org Artificial Intelligence

Increasingly complex learning methods such as boosting, bagging, and deep learning have made ML models more accurate, but harder to understand and interpret. A tradeoff between performance and intelligibility must often be faced, especially in high-stakes applications like medicine. In the present article we propose a novel methodological approach for generating explanations of the predictions of a generic ML model on a specific instance, one that can tackle both classification and regression tasks. Advantages of the proposed XAI approach include improved fidelity to the original model, the ability to deal with non-linear decision boundaries, and native support for both classification and regression problems.
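The general recipe behind tree-based local explanations (sample a neighbourhood around the instance, label it with the black box, fit an interpretable tree to the local labels) can be sketched as follows. This is a simplified illustration with a depth-1 tree and a made-up black-box function, not the AraucanaXAI implementation itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box classifier we want to explain locally.
def black_box(X):
    return (X[:, 0] ** 2 + X[:, 1] > 1.0).astype(int)

def fit_stump(X, y):
    """Depth-1 decision tree: pick the (feature, threshold) pair that
    minimises misclassification of the rule `x[feature] > threshold`."""
    best = (0, 0.0, 1.0)  # (feature, threshold, error)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            err = float(((X[:, j] > t).astype(int) != y).mean())
            if err < best[2]:
                best = (j, float(t), err)
    return best

def explain_locally(instance, radius=0.5, n=400):
    """Sample a neighbourhood around the instance, query the black box,
    and fit an interpretable surrogate to the local labels."""
    neighbourhood = instance + rng.uniform(-radius, radius, (n, instance.size))
    labels = black_box(neighbourhood)
    return fit_stump(neighbourhood, labels)

feature, threshold, err = explain_locally(np.array([0.0, 1.0]))
```

Near the point (0, 1) the curved boundary is approximately `x1 > 1`, so the surrogate recovers a simple, faithful local rule on feature 1 with low local error.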


Integrating Environmental Data, Citizen Science and Personalized Predictive Modeling to Support Public Health in Cities: The PULSE WebGIS

AAAI Conferences

The percentage of the world’s population living in urban areas is projected to increase significantly in the coming decades. This makes the urban environment the ideal test bed for research aiming to manage and respond to dramatic demographic and epidemiological transitions. In this context, the PULSE project has partnered with five global cities to transform public health from a reactive to a predictive system focused on both risk and resilience. PULSE aims at producing an integrated data ecosystem based on continuous large-scale collection of information available within the smart city environment. The integration of environmental data, citizen science, and location-specific predictive modeling of disease onset allows for richer analytics that promote informed, data-driven health policy decisions. In this paper we describe the PULSE ecosystem, with a special focus on its WebGIS component and its prototype version based on New York City data.


Use of Patient Generated Data from Social Media and Collaborative Filtering for Preferences Elicitation in Shared Decision Making

AAAI Conferences

With the increasing demand for personalization in clinical decision support systems, one of the most challenging tasks is effective patient preference elicitation. In the context of the MobiGuide project, within a medical application related to atrial fibrillation, a decision support system has been developed for both doctors and patients. In particular, we support shared decision-making by integrating decision tree models with a dedicated tool for eliciting utility coefficients. In this paper we focus on the decision problem regarding the choice of anticoagulant therapy for low-risk non-valvular atrial fibrillation patients. In addition to traditional methods such as time trade-off and standard gamble, an alternative way of eliciting preferences is proposed, exploiting patients’ self-reported data in health-related social media as the main source of information.
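The two traditional elicitation methods mentioned above reduce to simple formulas: in the time trade-off, a patient indifferent between t years in the health state and x years in full health implies utility U = x/t; in the standard gamble, indifference to a lottery with probability p of full health (and 1-p of death) implies U = p. A minimal sketch, with purely illustrative numbers that are not taken from the MobiGuide decision model:

```python
def utility_time_tradeoff(x_full_health, t_in_state):
    """Time trade-off: indifference between t_in_state years in the
    health state and x_full_health years in full health gives U = x/t."""
    return x_full_health / t_in_state

def utility_standard_gamble(p_full_health):
    """Standard gamble: indifference to a lottery with probability p of
    full health (else death) gives U = p."""
    return p_full_health

def expected_utility(branches):
    """Expected utility of one decision-tree option:
    sum of probability * utility over its outcome branches."""
    return sum(p * u for p, u in branches)

# Illustrative, made-up elicited utilities.
u_on_therapy = utility_time_tradeoff(8, 10)  # trades 10 years for 8 in full health
u_stroke = utility_standard_gamble(0.2)      # severe outcome

# Hypothetical outcome probabilities for one therapy option.
eu_therapy = expected_utility([(0.95, u_on_therapy), (0.05, u_stroke)])
```

Plugging elicited coefficients into the decision tree this way lets the therapy options be ranked by expected utility, which is how such models support the shared decision.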