Collaborating Authors

 Parimbelli, Enea


Igea: a Decoder-Only Language Model for Biomedical Text Generation in Italian

arXiv.org Artificial Intelligence

The advent of probabilistic language models has revolutionized various domains, with biomedical natural language processing (NLP) standing out due to its significant impact on healthcare provision and medical research. The ability of these models to understand, process, and generate text from vast biomedical corpora has led to improvements in tasks such as entity recognition, relation extraction, and question answering. However, the majority of this progress has been focused on English-language texts, creating a notable disparity for other languages with fewer resources, such as Italian. In the Italian context, the scarcity of large and diverse training datasets presents a substantial challenge. General language models like Minerva and Maestrale have made strides in Italian NLP, but they lack the specialization required to handle the nuances of biomedical terminology effectively. Addressing this gap is crucial, as the precision and clarity needed in medical communications are paramount for clinical and research applications in such a high-stakes domain. In this paper we introduce Igea, a biomedical language model (BLM) built from the ground up for the Italian language, which handles native Italian biomedical text effectively while remaining efficient in terms of computational resources. We built upon the foundation model Minerva, which we continually trained on native Italian biomedical text, employing appropriate provisions to avoid disrupting what was learned during pre-training.
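The abstract does not detail which anti-forgetting provisions were used; one common provision in continual pre-training is rehearsal, i.e. mixing a fraction of general-domain data back into each in-domain batch. The sketch below illustrates that idea only; the corpus names (`bio_*`, `gen_*`) and the replay ratio are hypothetical placeholders, not the actual Igea training setup.

```python
import random

random.seed(0)

# Placeholder corpora: stand-ins for an in-domain biomedical corpus
# and the general-domain pre-training data (names are hypothetical).
biomedical_docs = [f"bio_{i}" for i in range(1000)]
general_docs = [f"gen_{i}" for i in range(1000)]

def mixed_batches(domain, general, batch_size=8, replay_ratio=0.25):
    """Yield training batches in which roughly replay_ratio of the
    examples are rehearsed from the general corpus, limiting drift
    away from what the base model learned during pre-training."""
    n_replay = max(1, int(batch_size * replay_ratio))
    while True:
        batch = random.sample(general, n_replay) + \
                random.sample(domain, batch_size - n_replay)
        random.shuffle(batch)
        yield batch

batch = next(mixed_batches(biomedical_docs, general_docs))
```

With `batch_size=8` and `replay_ratio=0.25`, each batch carries two rehearsed general-domain documents alongside six in-domain ones.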


Evaluation of Predictive Reliability to Foster Trust in Artificial Intelligence. A case study in Multiple Sclerosis

arXiv.org Artificial Intelligence

Applying Artificial Intelligence (AI) and Machine Learning (ML) in critical contexts, such as medicine, requires the implementation of safety measures to reduce the risk of harm in case of prediction errors. Spotting ML failures is of paramount importance when ML predictions are used to drive clinical decisions. ML predictive reliability measures the degree of trust in an ML prediction on a new instance, allowing decision-makers to accept or reject it based on its reliability. To assess reliability, we propose a method that implements two principles. First, our approach evaluates whether an instance to be classified comes from the same distribution as the training set. To do this, we leverage the ability of Autoencoders (AEs) to reconstruct the training set with low error: an instance is considered Out-of-Distribution (OOD) if the AE reconstructs it with high error. Second, it evaluates whether the ML classifier performs well on samples similar to the newly classified instance, using a proxy model. We show that this approach can assess reliability both in a simulated scenario and on a model trained to predict disease progression of Multiple Sclerosis patients. We also developed a Python package, named relAI, to embed reliability measures into ML pipelines. We propose a simple approach that can be used in the deployment phase of any ML model to suggest whether or not to trust its predictions. Our method holds the promise of providing effective support to clinicians by spotting potential ML failures during deployment.
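The first principle (flag an instance as OOD when its reconstruction error exceeds a threshold calibrated on the training set) can be illustrated with a low-rank reconstruction. The sketch below substitutes a PCA projector for the autoencoder to stay self-contained; the data, the rank, and the 95th-percentile threshold are illustrative assumptions, not taken from relAI.

```python
import numpy as np

rng = np.random.default_rng(0)

# In-distribution training data: an isotropic Gaussian cloud in 5-D.
X_train = rng.normal(0.0, 1.0, size=(500, 5))

# Stand-in for the autoencoder: a rank-2 PCA projector whose
# reconstruction error plays the role of the AE loss.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
V = Vt[:2].T  # top-2 principal directions

def reconstruction_error(x):
    """Project onto the learned subspace and measure the residual."""
    x_hat = mean + ((x - mean) @ V) @ V.T
    return float(np.linalg.norm(x - x_hat))

# Calibrate the OOD threshold on the training set itself.
errs = np.array([reconstruction_error(x) for x in X_train])
threshold = np.percentile(errs, 95)

def is_out_of_distribution(x):
    return reconstruction_error(x) > threshold

# A point far from the training cloud reconstructs poorly and is
# therefore flagged, mirroring the AE-based reliability check.
flagged = is_out_of_distribution(np.full(5, 8.0))
```

By construction roughly 95% of the training instances fall below the threshold, while a distant point's residual is an order of magnitude larger.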


Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models

arXiv.org Artificial Intelligence

In the era of digital healthcare, the huge volumes of textual information generated every day in hospitals constitute an essential but underused asset that could be exploited with task-specific, fine-tuned biomedical language representation models, improving patient care and management. For such specialized domains, previous research has shown that models fine-tuned from broad-coverage checkpoints can benefit greatly from additional training rounds over large-scale in-domain resources. However, these resources are often unreachable for less-resourced languages like Italian, preventing local medical institutions from employing in-domain adaptation. To reduce this gap, our work investigates two accessible approaches to derive biomedical language models in languages other than English, taking Italian as a concrete use case: one based on neural machine translation of English resources, favoring quantity over quality; the other based on a high-grade, narrow-scoped corpus natively written in Italian, thus preferring quality over quantity. Our study shows that data quantity is a harder constraint than data quality for biomedical adaptation, but that the concatenation of high-quality data can improve model performance even when dealing with relatively size-limited corpora. The models published from our investigations have the potential to unlock important research opportunities for Italian hospitals and academia. Finally, the set of lessons learned from the study constitutes valuable insights towards building biomedical language models that generalize to other less-resourced languages and different domain settings.


Advancing Italian Biomedical Information Extraction with Large Language Models: Methodological Insights and Multicenter Practical Application

arXiv.org Artificial Intelligence

The introduction of computerized medical records in hospitals has reduced burdensome operations like manual writing and information fetching. However, the data contained in medical records remain largely underutilized, primarily because extracting them from unstructured textual medical records takes time and effort. Information Extraction, a subfield of Natural Language Processing, can help clinical practitioners overcome this limitation through automated text-mining pipelines. In this work, we created the first Italian neuropsychiatric Named Entity Recognition dataset, PsyNIT, and used it to develop a Large Language Model for this task. Moreover, we conducted several experiments with three external independent datasets to implement an effective multicenter model, with an overall F1-score of 84.77%, precision of 83.16%, and recall of 86.44%. The lessons learned are: (i) the crucial role of a consistent annotation process and (ii) a fine-tuning strategy that combines classical methods with a "few-shot" approach. This allowed us to establish methodological guidelines that pave the way for future implementations in this field and allow Italian hospitals to tap into important research opportunities.
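Since the reported scores are entity-level precision, recall, and F1, a minimal sketch of how such metrics are computed from BIO-tagged sequences may help; the tag names below are hypothetical examples, not drawn from PsyNIT.

```python
def extract_entities(tags):
    """Convert a BIO tag sequence into (type, start, end) spans."""
    entities = []
    start = etype = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel flushes the last span
        inside = tag.startswith("I-") and start is not None and tag[2:] == etype
        if not inside and start is not None:
            entities.append((etype, start, i))
            start = etype = None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return entities

def entity_prf(gold_tags, pred_tags):
    """Micro precision/recall/F1 over exact-match entity spans."""
    gold = set(extract_entities(gold_tags))
    pred = set(extract_entities(pred_tags))
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

For example, a prediction that recovers one of two gold entities exactly scores precision 1.0, recall 0.5, and F1 2/3, since credit is only given for exact span and type matches.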


Tree-based local explanations of machine learning model predictions, AraucanaXAI

arXiv.org Artificial Intelligence

Increasingly complex learning methods such as boosting, bagging, and deep learning have made ML models more accurate, but harder to understand and interpret. A tradeoff between performance and intelligibility must often be faced, especially in high-stakes applications like medicine. In the present article we propose a novel methodological approach for generating explanations of the predictions of a generic ML model on a specific instance, one that can tackle both classification and regression tasks. Advantages of the proposed XAI approach include improved fidelity to the original model, the ability to deal with non-linear decision boundaries, and native support for both classification and regression problems.
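The general recipe behind tree-based local explanations (sample a neighbourhood around the instance, label it with the black box, fit an interpretable tree to the local labels) can be sketched as follows. This is a simplified illustration with a depth-1 tree and a made-up black-box function, not the AraucanaXAI implementation itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical black-box classifier we want to explain locally.
def black_box(X):
    return (X[:, 0] ** 2 + X[:, 1] > 1.0).astype(int)

def fit_stump(X, y):
    """Depth-1 decision tree: pick the (feature, threshold) pair that
    minimises misclassification of the rule `x[feature] > threshold`."""
    best = (0, 0.0, 1.0)  # (feature, threshold, error)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            err = float(((X[:, j] > t).astype(int) != y).mean())
            if err < best[2]:
                best = (j, float(t), err)
    return best

def explain_locally(instance, radius=0.5, n=400):
    """Sample a neighbourhood around the instance, query the black box,
    and fit an interpretable surrogate to the local labels."""
    neighbourhood = instance + rng.uniform(-radius, radius, (n, instance.size))
    labels = black_box(neighbourhood)
    return fit_stump(neighbourhood, labels)

feature, threshold, err = explain_locally(np.array([0.0, 1.0]))
```

Near the point (0, 1) the curved boundary is approximately `x1 > 1`, so the surrogate recovers a simple, faithful local rule on feature 1 with low local error.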


Integrating Environmental Data, Citizen Science and Personalized Predictive Modeling to Support Public Health in Cities: The PULSE WebGIS

AAAI Conferences

The percentage of the world’s population living in urban areas is projected to increase significantly in the coming decades. This makes the urban environment the ideal test bed for research aiming to manage and respond to dramatic demographic and epidemiological transitions. In this context, the PULSE project has partnered with five global cities to transform public health from a reactive to a predictive system focused on both risk and resilience. PULSE aims at producing an integrated data ecosystem based on continuous large-scale collection of information available within the smart city environment. The integration of environmental data, citizen science, and location-specific predictive modeling of disease onset allows for richer analytics that promote informed, data-driven health policy decisions. In this paper we describe the PULSE ecosystem, with a special focus on its WebGIS component and its prototype version based on New York City data.


Use of Patient Generated Data from Social Media and Collaborative Filtering for Preferences Elicitation in Shared Decision Making

AAAI Conferences

With the increasing demand for personalization in clinical decision support systems, one of the most challenging tasks is effective patient preference elicitation. In the context of the MobiGuide project, within a medical application related to atrial fibrillation, a decision support system has been developed for both doctors and patients. In particular, we support shared decision-making by integrating decision tree models with a dedicated tool for eliciting utility coefficients. In this paper we focus on the decision problem regarding the choice of anticoagulant therapy for low-risk non-valvular atrial fibrillation patients. In addition to traditional methods such as time trade-off and standard gamble, an alternative way of eliciting preferences is proposed, exploiting patients’ self-reported data in health-related social media as the main source of information.
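The two traditional elicitation methods mentioned above reduce to simple formulas: in the time trade-off, a patient indifferent between t years in the health state and x years in full health implies utility U = x/t; in the standard gamble, indifference to a lottery with probability p of full health (and 1-p of death) implies U = p. A minimal sketch, with purely illustrative numbers that are not taken from the MobiGuide decision model:

```python
def utility_time_tradeoff(x_full_health, t_in_state):
    """Time trade-off: indifference between t_in_state years in the
    health state and x_full_health years in full health gives U = x/t."""
    return x_full_health / t_in_state

def utility_standard_gamble(p_full_health):
    """Standard gamble: indifference to a lottery with probability p of
    full health (else death) gives U = p."""
    return p_full_health

def expected_utility(branches):
    """Expected utility of one decision-tree option:
    sum of probability * utility over its outcome branches."""
    return sum(p * u for p, u in branches)

# Illustrative, made-up elicited utilities.
u_on_therapy = utility_time_tradeoff(8, 10)  # trades 10 years for 8 in full health
u_stroke = utility_standard_gamble(0.2)      # severe outcome

# Hypothetical outcome probabilities for one therapy option.
eu_therapy = expected_utility([(0.95, u_on_therapy), (0.05, u_stroke)])
```

Plugging elicited coefficients into the decision tree this way lets the therapy options be ranked by expected utility, which is how such models support the shared decision.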