Weicken, Eva
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Lekadir, Karim, Feragen, Aasa, Fofanah, Abdul Joseph, Frangi, Alejandro F, Buyx, Alena, Emelie, Anais, Lara, Andrea, Porras, Antonio R, Chan, An-Wen, Navarro, Arcadi, Glocker, Ben, Botwe, Benard O, Khanal, Bishesh, Beger, Brigit, Wu, Carol C, Cintas, Celia, Langlotz, Curtis P, Rueckert, Daniel, Mzurikwao, Deogratias, Fotiadis, Dimitrios I, Zhussupov, Doszhan, Ferrante, Enzo, Meijering, Erik, Weicken, Eva, González, Fabio A, Asselbergs, Folkert W, Prior, Fred, Krestin, Gabriel P, Collins, Gary, Tegenaw, Geletaw S, Kaissis, Georgios, Misuraca, Gianluca, Tsakou, Gianna, Dwivedi, Girish, Kondylakis, Haridimos, Jayakody, Harsha, Woodruf, Henry C, Aerts, Hugo JWL, Walsh, Ian, Chouvarda, Ioanna, Buvat, Irène, Rekik, Islem, Duncan, James, Kalpathy-Cramer, Jayashree, Zahir, Jihad, Park, Jinah, Mongan, John, Gichoya, Judy W, Schnabel, Julia A, Kushibar, Kaisar, Riklund, Katrine, Mori, Kensaku, Marias, Kostas, Amugongo, Lameck M, Fromont, Lauren A, Maier-Hein, Lena, Alberich, Leonor Cerdá, Rittner, Leticia, Phiri, Lighton, Marrakchi-Kacem, Linda, Donoso-Bach, Lluís, Martí-Bonmatí, Luis, Cardoso, M Jorge, Bobowicz, Maciej, Shabani, Mahsa, Tsiknakis, Manolis, Zuluaga, Maria A, Bielikova, Maria, Fritzsche, Marie-Christine, Linguraru, Marius George, Wenzel, Markus, De Bruijne, Marleen, Tolsgaard, Martin G, Ghassemi, Marzyeh, Ashrafuzzaman, Md, Goisauf, Melanie, Yaqub, Mohammad, Ammar, Mohammed, Abadía, Mónica Cano, Mahmoud, Mukhtar M E, Elattar, Mustafa, Rieke, Nicola, Papanikolaou, Nikolaos, Lazrak, Noussair, Díaz, Oliver, Salvado, Olivier, Pujol, Oriol, Sall, Ousmane, Guevara, Pamela, Gordebeke, Peter, Lambin, Philippe, Brown, Pieta, Abolmaesumi, Purang, Dou, Qi, Lu, Qinghua, Osuala, Richard, Nakasi, Rose, Zhou, S Kevin, Napel, Sandy, Colantonio, Sara, Albarqouni, Shadi, Joshi, Smriti, Carter, Stacy, Klein, Stefan, Petersen, Steffen E, Aussó, Susanna, Awate, Suyash, Raviv, Tammy Riklin, Cook, Tessa, Mutsvangwa, Tinashe E M, Rogers, Wendy A, Niessen, Wiro J, Puig-Bosch, Xènia, Zeng, Yi, Mohammed, Yunusa G, Aquino, Yves Saint James, Salahuddin, Zohaib, Starmans, Martijn P A
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real-world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 interdisciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on six guiding principles for trustworthy AI in healthcare: Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices was defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate the future translation of medical AI into clinical practice.
From Modern CNNs to Vision Transformers: Assessing the Performance, Robustness, and Classification Strategies of Deep Learning Models in Histopathology
Springenberg, Maximilian, Frommholz, Annika, Wenzel, Markus, Weicken, Eva, Ma, Jackie, Strodthoff, Nils
While machine learning is currently transforming the field of histopathology, the domain lacks a comprehensive evaluation of state-of-the-art models based on essential but complementary quality requirements beyond mere classification accuracy. To fill this gap, we developed a new methodology to extensively evaluate a wide range of classification models, including recent vision transformers and convolutional neural networks such as ConvNeXt, ResNet (BiT), Inception, ViT and Swin transformer, with and without supervised or self-supervised pretraining. We thoroughly tested the models on five widely used histopathology datasets containing whole slide images of breast, gastric, and colorectal cancer, and developed a novel approach using an image-to-image translation model to assess the robustness of a cancer classification model against stain variations. Further, we extended existing interpretability methods to previously unstudied models and systematically reveal insights into the models' classification strategies that can be transferred to future model architectures.