Collaborating Authors

 Cremonesi, Francesco


A cautionary tale on the cost-effectiveness of collaborative AI in real-world medical applications

arXiv.org Artificial Intelligence

Background. Federated learning (FL) has gained wide popularity as a paradigm for collaborative AI in sensitive healthcare applications. Nevertheless, the practical implementation of FL presents technical and organizational challenges, as it generally requires complex communication infrastructures. In this context, consensus-based learning (CBL) may represent a promising alternative, thanks to its ability to combine local knowledge into a federated decision system while potentially reducing deployment overhead. Methods. In this work we propose an extensive benchmark of the accuracy and cost-effectiveness of a panel of FL and CBL methods in a wide range of collaborative medical data analysis scenarios. The benchmark includes 7 different medical datasets, encompassing 3 machine learning tasks, 8 different data modalities, and multi-centric settings involving 3 to 23 clients. Findings. Our results reveal that CBL is a cost-effective alternative to FL. When compared across the panel of medical datasets in the considered benchmark, CBL methods provide accuracy equivalent to that achieved by FL. At the same time, CBL significantly reduces training time and communication cost (15-fold and 60-fold decreases, respectively; p < 0.05). Interpretation. This study opens a novel perspective on the deployment of collaborative AI in real-world applications, where the adoption of cost-effective methods is instrumental to achieving sustainability and democratisation of AI by alleviating the need for extensive computational resources.


Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation

arXiv.org Artificial Intelligence

Healthcare data is often split into small or medium-sized collections across multiple hospitals, and access to it is encumbered by privacy regulations. This makes it difficult to use such data for the development of machine learning and deep learning models, which are known to be data-hungry. One way to overcome this limitation is to use collaborative learning (CL) methods, which allow hospitals to work together on a task without explicitly sharing local data. In this paper, we address a prostate segmentation problem from MRI in a collaborative scenario by comparing two different approaches: federated learning (FL) and consensus-based methods (CBM). To the best of our knowledge, this is the first work in which CBM, such as label fusion techniques, are used to solve a collaborative learning problem. In this setting, CBM combine predictions from locally trained models to obtain a federated strong learner with ideally improved robustness and predictive variance properties. Our experiments show that, in the considered practical scenario, CBM provide equal or better results than FL, while being highly cost-effective. Our results demonstrate that the consensus paradigm may represent a valid alternative to FL for typical training tasks in medical imaging.
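
As an illustration of how a consensus-based method can combine predictions from locally trained models, the following minimal sketch applies majority voting, one of the simplest label fusion techniques, to binary segmentation masks. It is not the benchmark's implementation: the function name `majority_vote`, the mask representation, and the example predictions are all hypothetical.

```python
from typing import List

Mask = List[List[int]]  # binary segmentation mask: rows of 0/1 pixel labels


def majority_vote(masks: List[Mask]) -> Mask:
    """Fuse binary masks predicted by independently trained local models."""
    if not masks:
        raise ValueError("at least one mask is required")
    n_models = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    fused = []
    for r in range(rows):
        fused_row = []
        for c in range(cols):
            votes = sum(mask[r][c] for mask in masks)
            # A pixel is foreground when strictly more than half the models agree.
            fused_row.append(1 if 2 * votes > n_models else 0)
        fused.append(fused_row)
    return fused


# Hypothetical predictions from three hospitals' models on one 2x3 image.
site_predictions = [
    [[1, 0, 1], [0, 0, 1]],
    [[1, 1, 0], [0, 0, 1]],
    [[0, 1, 1], [0, 1, 1]],
]
consensus = majority_vote(site_predictions)
```

In this sketch only the predictions on a given image are combined, and no model parameters are exchanged during training, which is the property the abstract associates with lower cost.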


Fed-BioMed: Open, Transparent and Trusted Federated Learning for Real-world Healthcare Applications

arXiv.org Artificial Intelligence

The need for large amounts of data to develop Artificial Intelligence (AI) in healthcare has motivated a number of national and international initiatives aimed at creating medical data lakes accessible to researchers, such as the French Health Data Hub [10], the UK BioBank [59], the US ADNI [26] and TCGA [60], among many others [58, 40, 7]. In spite of these initiatives, there are still major bottlenecks preventing the widespread availability of large centralized repositories of healthcare information [63]. To overcome these limitations, Federated Learning (FL) has been proposed as a working paradigm to enable the training of ML models on large datasets from diverse sources while guaranteeing the respect of data privacy and governance. The basic paradigm of FL consists of iterating the following steps: i) model training is performed locally in the hospitals starting from a common initialization, ii) the resulting model parameters (instead of the data) are shared and aggregated to define a global model, and iii) the global model is transmitted back to the hospitals to initiate a new local training step. Under certain conditions [39], this procedure is guaranteed to converge to a final global model representing an optimal consensus among the hospitals participating in the experiment. FL is particularly suited for applications in sensitive domains, such as healthcare and biomedical research [48, 9, 13].
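
The three steps of the basic FL paradigm can be sketched as follows. This is a toy illustration and not the Fed-BioMed API: it assumes a one-parameter model y = w * x, plain gradient descent for local training, and a sample-size-weighted average as the aggregation rule (in the style of federated averaging). All function names and data are hypothetical.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one hospital


def local_training(w: float, data: Dataset, lr: float = 0.05, epochs: int = 5) -> float:
    """Step i: gradient descent on the local squared error of the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def aggregate(weights: List[float], sizes: List[int]) -> float:
    """Step ii: average the shared parameters, weighted by local sample count."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def federated_training(sites: List[Dataset], rounds: int = 50) -> float:
    w_global = 0.0  # common initialization
    for _ in range(rounds):
        local_weights = [local_training(w_global, data) for data in sites]
        # Step iii: the aggregated model is the starting point of the next round.
        w_global = aggregate(local_weights, [len(d) for d in sites])
    return w_global


# Hypothetical data: two sites whose samples all follow y = 2 * x.
sites = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w_final = federated_training(sites)
```

Only the scalar `w` crosses site boundaries in this sketch; the `(x, y)` pairs never leave the function that trains on them.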


Fed-MIWAE: Federated Imputation of Incomplete Data via Deep Generative Models

arXiv.org Artificial Intelligence

Federated learning allows for the training of machine learning models on multiple decentralized local datasets without requiring explicit data exchange. However, data pre-processing, including strategies for handling missing data, remains a major bottleneck in real-world federated learning deployment, and is typically performed locally. This approach may be biased, since the subpopulations locally observed at each center may not be representative of the overall one. To address this issue, this paper first proposes a more consistent approach to data standardization through a federated model. Additionally, we propose Fed-MIWAE, a federated version of the state-of-the-art imputation method MIWAE, a deep latent variable model for missing data imputation based on variational autoencoders. MIWAE has the great advantage of being easily trainable with classical federated aggregators. Furthermore, it is able to deal with MAR (Missing At Random) data, in which the missingness of a variable can depend on the observed ones, a more challenging missing-data mechanism than MCAR (Missing Completely At Random). We evaluate our method on multi-modal medical imaging data and clinical scores from a simulated federated scenario with the ADNI dataset. We compare Fed-MIWAE with classical imputation methods, performed either locally or in a centralized fashion. Fed-MIWAE achieves imputation accuracy comparable with the best centralized method, even when local data distributions are highly heterogeneous. In addition, thanks to the variational nature of Fed-MIWAE, our method is designed to perform multiple imputation, allowing for the quantification of the imputation uncertainty in the federated scenario.
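
To illustrate why local standardization can be biased and how a federated alternative might look, the sketch below computes a global mean and standard deviation from per-site sufficient statistics. This is an assumption made for illustration, not the paper's federated standardization model: each site is assumed to share only the count, sum, and sum of squares of its observed values, with missing entries encoded as `None`. All names are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Stats = Tuple[int, float, float]  # (count, sum, sum of squares) of observed values


def local_statistics(values: List[Optional[float]]) -> Stats:
    """Computed at each site; missing entries (None) are skipped."""
    observed = [v for v in values if v is not None]
    return len(observed), sum(observed), sum(v * v for v in observed)


def global_mean_std(stats: List[Stats]) -> Tuple[float, float]:
    """Computed centrally from the shared statistics, never from raw data."""
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    # Population variance; clamped at zero to absorb rounding error.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


# Hypothetical feature values at two sites with different distributions.
site_a = [1.0, None, 3.0]
site_b = [5.0, 7.0]
mean, std = global_mean_std([local_statistics(site_a), local_statistics(site_b)])
```

Standardizing each site with its own statistics would center `site_a` at 2 and `site_b` at 6, whereas the pooled statistics place both on a common scale.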