Nichyporuk, Brennan
Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free
Favero, Gian Mario, Saremi, Parham, Kaczmarek, Emily, Nichyporuk, Brennan, Arbel, Tal
Discriminative classifiers have become a foundational tool in deep learning for medical imaging, excelling at learning separable features of complex data distributions. However, these models often need careful design, augmentation, and training techniques to ensure safe and reliable deployment. Recently, diffusion models have become synonymous with generative modeling in 2D. These models showcase robustness across a range of tasks including natural image classification, where classification is performed by comparing reconstruction errors across images generated for each possible conditioning input. This work presents the first exploration of the potential of class conditional diffusion models for 2D medical image classification. First, we develop a novel majority voting scheme shown to improve the performance of medical diffusion classifiers. Next, extensive experiments on the CheXpert and ISIC Melanoma skin cancer datasets demonstrate that foundation and trained-from-scratch diffusion models achieve competitive performance against SOTA discriminative classifiers without the need for explicit supervision. In addition, we show that diffusion classifiers are intrinsically explainable, and can be used to quantify the uncertainty of their predictions, increasing their trustworthiness and reliability in safety-critical, clinical contexts. Further information is available on our project page: https://faverogian.github.io/med-diffusion-classifier.github.io/
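The classification rule described in this abstract (compare reconstruction errors across conditioning inputs, then combine repeated trials by vote) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `toy_denoiser` is a hypothetical stand-in for a trained class-conditional diffusion model, the class prototypes are invented, and the voting rule shown (one vote per noise draw for the lowest-error class) is only one plausible reading of "majority voting", not necessarily the authors' scheme.

```python
import random
from collections import Counter


def toy_denoiser(noisy, label, prototypes):
    """Hypothetical stand-in for a class-conditional denoiser.

    It pulls the noisy input halfway toward the prototype of the
    conditioning class. A real diffusion model would be used here.
    """
    proto = prototypes[label]
    return [0.5 * n + 0.5 * p for n, p in zip(noisy, proto)]


def reconstruction_error(x, x_hat):
    """Mean squared error between the clean input and a reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def classify_by_vote(x, prototypes, n_trials=25, noise_scale=0.3, seed=0):
    """Classify x by majority vote over repeated noisy reconstructions.

    In each trial the input is corrupted with Gaussian noise, reconstructed
    once per candidate class, and the class with the lowest reconstruction
    error receives one vote.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_trials):
        noisy = [a + rng.gauss(0.0, noise_scale) for a in x]
        errors = {
            label: reconstruction_error(x, toy_denoiser(noisy, label, prototypes))
            for label in prototypes
        }
        votes[min(errors, key=errors.get)] += 1
    winner, count = votes.most_common(1)[0]
    # Fraction of trials that agree with the winning class.
    return winner, count / n_trials


# Invented two-class example with four-dimensional "images".
prototypes = {"benign": [0.0, 0.0, 0.0, 0.0], "malignant": [1.0, 1.0, 1.0, 1.0]}
label, agreement = classify_by_vote([0.9, 1.0, 1.1, 0.95], prototypes)
```

The agreement fraction returned alongside the label is a crude indicator of how consistently the trials point to one class; it is not the uncertainty measure used in the paper.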
HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling
Duenias, Daniel, Nichyporuk, Brennan, Arbel, Tal, Raviv, Tammy Riklin
The integration of diverse clinical modalities such as medical imaging and the tabular data obtained from patients' Electronic Health Records (EHRs) is a crucial aspect of modern healthcare. The integrative analysis of multiple sources can provide a comprehensive understanding of a patient's condition and can enhance diagnoses and treatment decisions. Deep Neural Networks (DNNs) consistently showcase outstanding performance in a wide range of multimodal tasks in the medical domain. However, effectively merging medical imaging with clinical, demographic and genetic information represented as numerical tabular data remains a highly active research pursuit. We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements. This approach aims to leverage the complementary information present in these modalities to enhance the accuracy of various medical applications. We demonstrate the strength and the generality of our method on two different brain Magnetic Resonance Imaging (MRI) analysis tasks, namely, brain age prediction conditioned on the subject's sex, and multiclass Alzheimer's Disease (AD) classification conditioned on tabular data. We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods. The code accompanying this manuscript will be made publicly available.
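The core idea of conditioning image processing on tabular values through a hypernetwork can be sketched in a few lines. This is a toy illustration under stated assumptions, not the HyperFusion architecture: the hypernetwork here is a single linear map, the "image features" are a short list of numbers, and all names (`hypernetwork`, `conditioned_layer`, `hyper_w`, `hyper_b`) are hypothetical.

```python
def hypernetwork(tabular, hyper_w, hyper_b):
    """Map tabular features to a flat parameter vector for a target layer.

    A linear hypernetwork is assumed for simplicity: each generated
    parameter is a weighted sum of the tabular inputs plus a bias.
    """
    return [
        sum(w * t for w, t in zip(row, tabular)) + b
        for row, b in zip(hyper_w, hyper_b)
    ]


def conditioned_layer(image_features, tabular, hyper_w, hyper_b, n_out):
    """Apply a ReLU layer whose weights are generated from the tabular data."""
    n_in = len(image_features)
    params = hypernetwork(tabular, hyper_w, hyper_b)
    # First n_out * n_in generated values are the weights, the rest are biases.
    weights = [params[i * n_in:(i + 1) * n_in] for i in range(n_out)]
    biases = params[n_out * n_in:]
    return [
        max(0.0, sum(w * f for w, f in zip(row, image_features)) + b)
        for row, b in zip(weights, biases)
    ]


# Two image features, one output unit, one tabular input (e.g. a 0/1 indicator).
hyper_w = [[1.0], [0.0], [0.0]]  # one row per generated parameter
hyper_b = [0.0, 1.0, 0.0]
features = [2.0, 3.0]

out_when_one = conditioned_layer(features, [1.0], hyper_w, hyper_b, n_out=1)
out_when_zero = conditioned_layer(features, [0.0], hyper_w, hyper_b, n_out=1)
```

The same image features produce different outputs depending on the tabular input, because the tabular data determines the layer's weights rather than being concatenated to the features.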
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles
Shen, Xing, Huang, Hengguan, Nichyporuk, Brennan, Arbel, Tal
While deep learning models have achieved remarkable success across a range of medical image analysis tasks, deployment of these models in real clinical contexts requires that they be robust to variability in the acquired images. Many methods apply predefined transformations to augment the training data and enhance test-time robustness, but these transformations may not ensure the model's robustness to the diverse variability seen in patient images. In this paper, we introduce a novel three-stage approach based on transformers coupled with conditional diffusion models, with the goal of improving model robustness to the kinds of imaging variability commonly encountered in practice without the need for pre-determined data augmentation strategies. To this end, multiple image encoders first learn hierarchical feature representations to build discriminative latent spaces. Next, a reverse diffusion process, guided by the latent code, acts on an informative prior and proposes prediction candidates in a generative manner. Finally, several prediction candidates are aggregated in a bi-level aggregation protocol to produce the final output. Through extensive experiments on medical imaging benchmark datasets, we show that our method improves upon state-of-the-art methods in terms of robustness and confidence calibration. Additionally, we introduce a strategy to quantify the prediction uncertainty at the instance level, making the predictions more trustworthy to clinicians who use them in clinical practice.
Biomedical image analysis competitions: The state of current participation practice
Eisenmann, Matthias, Reinke, Annika, Weru, Vivienn, Tizabi, Minu Dietlinde, Isensee, Fabian, Adler, Tim J., Godau, Patrick, Cheplygina, Veronika, Kozubek, Michal, Ali, Sharib, Gupta, Anubha, Kybic, Jan, Noble, Alison, de Solórzano, Carlos Ortiz, Pachade, Samiksha, Petitjean, Caroline, Sage, Daniel, Wei, Donglai, Wilden, Elizabeth, Alapatt, Deepak, Andrearczyk, Vincent, Baid, Ujjwal, Bakas, Spyridon, Balu, Niranjan, Bano, Sophia, Bawa, Vivek Singh, Bernal, Jorge, Bodenstedt, Sebastian, Casella, Alessandro, Choi, Jinwook, Commowick, Olivier, Daum, Marie, Depeursinge, Adrien, Dorent, Reuben, Egger, Jan, Eichhorn, Hannah, Engelhardt, Sandy, Ganz, Melanie, Girard, Gabriel, Hansen, Lasse, Heinrich, Mattias, Heller, Nicholas, Hering, Alessa, Huaulmé, Arnaud, Kim, Hyunjeong, Landman, Bennett, Li, Hongwei Bran, Li, Jianning, Ma, Jun, Martel, Anne, Martín-Isla, Carlos, Menze, Bjoern, Nwoye, Chinedu Innocent, Oreiller, Valentin, Padoy, Nicolas, Pati, Sarthak, Payette, Kelly, Sudre, Carole, van Wijnen, Kimberlin, Vardazaryan, Armine, Vercauteren, Tom, Wagner, Martin, Wang, Chuanbo, Yap, Moi Hoon, Yu, Zeyun, Yuan, Chun, Zenk, Maximilian, Zia, Aneeq, Zimmerer, David, Bao, Rina, Choi, Chanyeol, Cohen, Andrew, Dzyubachyk, Oleh, Galdran, Adrian, Gan, Tianyuan, Guo, Tianqi, Gupta, Pradyumna, Haithami, Mahmood, Ho, Edward, Jang, Ikbeom, Li, Zhili, Luo, Zhengbo, Lux, Filip, Makrogiannis, Sokratis, Müller, Dominik, Oh, Young-tack, Pang, Subeen, Pape, Constantin, Polat, Gorkem, Reed, Charlotte Rosalie, Ryu, Kanghyun, Scherr, Tim, Thambawita, Vajira, Wang, Haoyu, Wang, Xinliang, Xu, Kele, Yeh, Hung, Yeo, Doyeob, Yuan, Yixuan, Zeng, Yan, Zhao, Xin, Abbing, Julian, Adam, Jannes, Adluru, Nagesh, Agethen, Niklas, Ahmed, Salman, Khalil, Yasmina Al, Alenyà, Mireia, Alhoniemi, Esa, An, Chengyang, Anwar, Talha, Arega, Tewodros Weldebirhan, Avisdris, Netanell, Aydogan, Dogu Baran, Bai, Yingbin, Calisto, Maria Baldeon, Basaran, Berke Doga, Beetz, Marcel, Bian, Cheng, Bian, Hao, Blansit, Kevin, 
Bloch, Louise, Bohnsack, Robert, Bosticardo, Sara, Breen, Jack, Brudfors, Mikael, Brüngel, Raphael, Cabezas, Mariano, Cacciola, Alberto, Chen, Zhiwei, Chen, Yucong, Chen, Daniel Tianming, Cho, Minjeong, Choi, Min-Kook, Xie, Chuantao Xie Chuantao, Cobzas, Dana, Cohen-Adad, Julien, Acero, Jorge Corral, Das, Sujit Kumar, de Oliveira, Marcela, Deng, Hanqiu, Dong, Guiming, Doorenbos, Lars, Efird, Cory, Escalera, Sergio, Fan, Di, Serj, Mehdi Fatan, Fenneteau, Alexandre, Fidon, Lucas, Filipiak, Patryk, Finzel, René, Freitas, Nuno R., Friedrich, Christoph M., Fulton, Mitchell, Gaida, Finn, Galati, Francesco, Galazis, Christoforos, Gan, Chang Hee, Gao, Zheyao, Gao, Shengbo, Gazda, Matej, Gerats, Beerend, Getty, Neil, Gibicar, Adam, Gifford, Ryan, Gohil, Sajan, Grammatikopoulou, Maria, Grzech, Daniel, Güley, Orhun, Günnemann, Timo, Guo, Chunxu, Guy, Sylvain, Ha, Heonjin, Han, Luyi, Han, Il Song, Hatamizadeh, Ali, He, Tian, Heo, Jimin, Hitziger, Sebastian, Hong, SeulGi, Hong, SeungBum, Huang, Rian, Huang, Ziyan, Huellebrand, Markus, Huschauer, Stephan, Hussain, Mustaffa, Inubushi, Tomoo, Polat, Ece Isik, Jafaritadi, Mojtaba, Jeong, SeongHun, Jian, Bailiang, Jiang, Yuanhong, Jiang, Zhifan, Jin, Yueming, Joshi, Smriti, Kadkhodamohammadi, Abdolrahim, Kamraoui, Reda Abdellah, Kang, Inha, Kang, Junghwa, Karimi, Davood, Khademi, April, Khan, Muhammad Irfan, Khan, Suleiman A., Khantwal, Rishab, Kim, Kwang-Ju, Kline, Timothy, Kondo, Satoshi, Kontio, Elina, Krenzer, Adrian, Kroviakov, Artem, Kuijf, Hugo, Kumar, Satyadwyoom, La Rosa, Francesco, Lad, Abhi, Lee, Doohee, Lee, Minho, Lena, Chiara, Li, Hao, Li, Ling, Li, Xingyu, Liao, Fuyuan, Liao, KuanLun, Oliveira, Arlindo Limede, Lin, Chaonan, Lin, Shan, Linardos, Akis, Linguraru, Marius George, Liu, Han, Liu, Tao, Liu, Di, Liu, Yanling, Lourenço-Silva, João, Lu, Jingpei, Lu, Jiangshan, Luengo, Imanol, Lund, Christina B., Luu, Huan Minh, Lv, Yi, Lv, Yi, Macar, Uzay, Maechler, Leon, L., Sina Mansour, Marshall, Kenji, Mazher, Moona, 
McKinley, Richard, Medela, Alfonso, Meissen, Felix, Meng, Mingyuan, Miller, Dylan, Mirjahanmardi, Seyed Hossein, Mishra, Arnab, Mitha, Samir, Mohy-ud-Din, Hassan, Mok, Tony Chi Wing, Murugesan, Gowtham Krishnan, Karthik, Enamundram Naga, Nalawade, Sahil, Nalepa, Jakub, Naser, Mohamed, Nateghi, Ramin, Naveed, Hammad, Nguyen, Quang-Minh, Quoc, Cuong Nguyen, Nichyporuk, Brennan, Oliveira, Bruno, Owen, David, Pal, Jimut Bahan, Pan, Junwen, Pan, Wentao, Pang, Winnie, Park, Bogyu, Pawar, Vivek, Pawar, Kamlesh, Peven, Michael, Philipp, Lena, Pieciak, Tomasz, Plotka, Szymon, Plutat, Marcel, Pourakpour, Fattaneh, Preložnik, Domen, Punithakumar, Kumaradevan, Qayyum, Abdul, Queirós, Sandro, Rahmim, Arman, Razavi, Salar, Ren, Jintao, Rezaei, Mina, Rico, Jonathan Adam, Rieu, ZunHyan, Rink, Markus, Roth, Johannes, Ruiz-Gonzalez, Yusely, Saeed, Numan, Saha, Anindo, Salem, Mostafa, Sanchez-Matilla, Ricardo, Schilling, Kurt, Shao, Wei, Shen, Zhiqiang, Shi, Ruize, Shi, Pengcheng, Sobotka, Daniel, Soulier, Théodore, Fadida, Bella Specktor, Stoyanov, Danail, Mun, Timothy Sum Hon, Sun, Xiaowu, Tao, Rong, Thaler, Franz, Théberge, Antoine, Thielke, Felix, Torres, Helena, Wahid, Kareem A., Wang, Jiacheng, Wang, YiFei, Wang, Wei, Wang, Xiong, Wen, Jianhui, Wen, Ning, Wodzinski, Marek, Wu, Ye, Xia, Fangfang, Xiang, Tianqi, Xiaofei, Chen, Xu, Lizhan, Xue, Tingting, Yang, Yuxuan, Yang, Lin, Yao, Kai, Yao, Huifeng, Yazdani, Amirsaeed, Yip, Michael, Yoo, Hwanseung, Yousefirizi, Fereshteh, Yu, Shunkai, Yu, Lei, Zamora, Jonathan, Zeineldin, Ramy Ashraf, Zeng, Dewen, Zhang, Jianpeng, Zhang, Bokai, Zhang, Jiapeng, Zhang, Fan, Zhang, Huahong, Zhao, Zhongchen, Zhao, Zixuan, Zhao, Jiachen, Zhao, Can, Zheng, Qingshuo, Zhi, Yuheng, Zhou, Ziqi, Zou, Baosheng, Maier-Hein, Klaus, Jäger, Paul F., Kopp-Schneider, Annette, Maier-Hein, Lena
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Accounting for Variance in Machine Learning Benchmarks
Bouthillier, Xavier, Delaunay, Pierre, Bronzi, Mirko, Trofimov, Assya, Nichyporuk, Brennan, Szeto, Justin, Sepah, Naz, Raff, Edward, Madan, Kanika, Voleti, Vikram, Kahou, Samira Ebrahimi, Michalski, Vincent, Serdyuk, Dmitriy, Arbel, Tal, Pal, Chris, Varoquaux, Gaël, Vincent, Pascal
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator better approximates the ideal estimator at a 51-fold reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
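The effect of run-to-run variance on comparing two algorithms can be illustrated with a small simulation. This is a sketch under invented assumptions, not the paper's benchmarking model or its recommended test: each run's score is taken to be a true mean plus independent Gaussian noise from three sources of variation, and the standard deviations in `stds` are made-up values.

```python
import random


def simulated_score(true_mean, rng, stds=(0.01, 0.005, 0.008)):
    """Hypothetical benchmark run.

    The score is the true performance plus independent noise from each
    source of variation (here: data sampling, initialization and
    hyperparameter choice, with assumed standard deviations).
    """
    return true_mean + sum(rng.gauss(0.0, s) for s in stds)


def prob_a_beats_b(mean_a, mean_b, n_runs=200, seed=0):
    """Estimate how often A scores higher than B across randomized runs."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_runs):
        # All sources of variation are re-randomized for every comparison.
        wins += simulated_score(mean_a, rng) > simulated_score(mean_b, rng)
    return wins / n_runs


clear_gap = prob_a_beats_b(0.90, 0.85)  # A is truly better by a wide margin
no_gap = prob_a_beats_b(0.90, 0.90)     # A and B are truly equal
```

With a true gap much larger than the noise, A wins nearly every comparison; with no true gap, A still wins about half of them, which is why a single run of each algorithm says little about which one is better.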