Engelhardt, Sandy
MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis
Seyfarth, Marvin, Dar, Salman Ul Hassan, Ayx, Isabelle, Fink, Matthias Alexander, Schoenberg, Stefan O., Kauczor, Hans-Ulrich, Engelhardt, Sandy
Advancements in AI for medical imaging offer significant potential. However, their applications are constrained by the limited availability of data and the reluctance of medical centers to share it due to patient privacy concerns. Generative models present a promising solution by creating synthetic data as a substitute for real patient data. However, medical images are typically high-dimensional, and current state-of-the-art methods are often impractical for computational resource-constrained healthcare environments. These models rely on data sub-sampling, raising doubts about their feasibility and real-world applicability. Furthermore, many of these models are evaluated on quantitative metrics that alone can be misleading in assessing the image quality and clinical meaningfulness of the generated images. To address this, we introduce MedLoRD, a generative diffusion model designed for computational resource-constrained environments. MedLoRD is capable of generating high-dimensional medical volumes with resolutions up to 512$\times$512$\times$256, utilizing GPUs with only 24GB VRAM, which are commonly found in standard desktop workstations. MedLoRD is evaluated across multiple modalities, including Coronary Computed Tomography Angiography and Lung Computed Tomography datasets. Extensive evaluations through radiological evaluation, relative regional volume analysis, adherence to conditional masks, and downstream tasks show that MedLoRD generates high-fidelity images closely adhering to segmentation mask conditions, surpassing the capabilities of current state-of-the-art generative models for medical image synthesis in computational resource-constrained environments.
Arbitrary Data as Images: Fusion of Patient Data Across Modalities and Irregular Intervals with Vision Transformers
Tölle, Malte, Scharaf, Mohamad, Fischer, Samantha, Reich, Christoph, Zeid, Silav, Dieterich, Christoph, Meder, Benjamin, Frey, Norbert, Wild, Philipp, Engelhardt, Sandy
A patient undergoes multiple examinations in each hospital stay, where each provides different facets of the health status. These assessments include temporal data with varying sampling rates, discrete single-point measurements, therapeutic interventions such as medication administration, and images. While physicians are able to process and integrate diverse modalities intuitively, neural networks need specific modeling for each modality, which complicates the training procedure. We demonstrate that this complexity can be significantly reduced by visualizing all information as images along with unstructured text and subsequently training a conventional vision-text transformer. Our approach, Vision Transformer for irregular sampled Multi-modal Measurements (ViTiMM), not only simplifies data preprocessing and modeling but also outperforms current state-of-the-art methods in predicting in-hospital mortality and phenotyping, as evaluated on 6,175 patients from the MIMIC-IV dataset. The modalities include patients' clinical measurements, medications, X-ray images, and electrocardiography scans. We hope our work inspires advancements in multi-modal medical AI by reducing the training complexity to (visual) prompt engineering, thus lowering entry barriers and enabling no-code solutions for training. The source code will be made publicly available.
Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports
Tölle, Malte, Burger, Lukas, Kelm, Halvar, André, Florian, Bannas, Peter, Diller, Gerhard, Frey, Norbert, Garthe, Philipp, Groß, Stefan, Hennemuth, Anja, Kaderali, Lars, Krüger, Nina, Leha, Andreas, Martin, Simon, Meyer, Alexander, Nagel, Eike, Orwat, Stefan, Scherer, Clemens, Seiffert, Moritz, Seliger, Jan Moritz, Simm, Stefan, Friede, Tim, Seidler, Tim, Engelhardt, Sandy
Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization, including a uniform data representation and filtering options, is of paramount importance. Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering capabilities that simplifies the process of assembling multi-modal datasets. Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans) as well as annotations (i.e. calcification segmentations, pointsets and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses). Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Utilizing the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.
FUNAvg: Federated Uncertainty Weighted Averaging for Datasets with Diverse Labels
Tölle, Malte, Navarro, Fernando, Eble, Sebastian, Wolf, Ivo, Menze, Bjoern, Engelhardt, Sandy
Federated learning is a popular paradigm for training a joint model in a distributed, privacy-preserving environment. However, partial annotations pose an obstacle, meaning that the categories of labels are heterogeneous across clients. We propose to learn a joint backbone in a federated manner, while each site receives its own multi-label segmentation head. By using Bayesian techniques, we observe that the different segmentation heads, although only trained on the individual client's labels, also learn information about the other labels not present at the respective site. This information is encoded in their predictive uncertainty. To obtain a final prediction, we leverage this uncertainty and perform a weighted averaging of the ensemble of distributed segmentation heads, which allows us to segment "locally unknown" structures. With our method, which we refer to as FUNAvg, we are on average even on par with the models trained and tested on the same dataset.
Unconditional Latent Diffusion Models Memorize Patient Imaging Data
Dar, Salman Ul Hassan, Seyfarth, Marvin, Kahmann, Jannik, Ayx, Isabelle, Papavassiliu, Theano, Schoenberg, Stefan O., Engelhardt, Sandy
Generative latent diffusion models hold a wide range of applications in the medical imaging domain. A noteworthy application is privacy-preserved open-data sharing by proposing synthetic data as surrogates of real patient data. Despite the promise, these models are susceptible to patient data memorization, where models generate patient data copies instead of novel synthetic samples. This undermines the whole purpose of preserving patient data and may even result in patient re-identification. Considering the importance of the problem, surprisingly it has received relatively little attention in the medical imaging community. To this end, we assess memorization in latent diffusion models for medical image synthesis. We train 2D and 3D latent diffusion models on CT, MR, and X-ray datasets for synthetic data generation. Afterwards, we examine the amount of training data memorized utilizing self-supervised models and further investigate various factors that can possibly lead to memorization by training models in different settings. We observe a surprisingly large amount of data memorization among all datasets, with up to 41.7%, 19.6%, and 32.6% of the training data memorized in CT, MRI, and X-ray datasets respectively. Further analyses reveal that increasing training data size and using data augmentation reduce memorization, while over-training enhances it. Overall, our results suggest a call for memorization-informed evaluation of synthetic data prior to open-data sharing.
Surgical Phase and Instrument Recognition: How to identify appropriate Dataset Splits
Kostiuchik, Georgii, Sharan, Lalith, Mayer, Benedikt, Wolf, Ivo, Preim, Bernhard, Engelhardt, Sandy
Purpose: Machine learning models can only be reliably evaluated if training, validation, and test data splits are representative and not affected by the absence of classes of interest. Surgical workflow and instrument recognition tasks are complicated in this manner, because of heavy data imbalances resulting from different lengths of phases and their erratic occurrences. Furthermore, the issue becomes difficult as sub-properties that help define phases, like instrument (co-)occurrence, are usually not considered when defining the split. We argue that such sub-properties must be equally considered. Methods: This work presents a publicly available data visualization tool that enables interactive exploration of dataset splits for surgical phase and instrument recognition. It focuses on the visualization of the occurrence of phases, phase transitions, instruments, and instrument combinations across sets. Particularly, it facilitates the assessment and identification of sub-optimal dataset splits. Results: We performed an analysis of common Cholec80 dataset splits using the proposed application and were able to uncover phase transitions and combinations of instruments that were not represented in one of the sets. Additionally, we outlined possible improvements to the splits. A user study with ten participants demonstrated the ability of participants to solve a selection of data exploration tasks using the proposed application. Conclusion: In highly unbalanced class distributions, special care should be taken with respect to the selection of an appropriate dataset split. Our interactive data visualization tool presents a promising approach for the assessment of dataset splits for surgical phase and instrument recognition. Evaluation results show that it can enhance the development of machine learning models. The application is available at https://cardio-ai.github.io/endovis-ml/ .
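The kind of check described in this abstract, finding instrument combinations that are absent from one of the sets, can be illustrated with a minimal Python sketch. This is not the authors' tool: the function name `missing_per_split` and the toy frames below are hypothetical and do not reflect actual Cholec80 statistics.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Set


def missing_per_split(
    splits: Dict[str, Iterable[Hashable]]
) -> Dict[str, Set[Hashable]]:
    """For each split, return the items that occur in some split but not in this one.

    An "item" can be any hashable label, e.g. a frozenset of instruments
    visible in a frame, or a (phase_from, phase_to) transition tuple.
    """
    # Collapse each split to the set of distinct items it contains.
    seen = {name: set(items) for name, items in splits.items()}
    # The universe is everything observed in any split.
    universe: Set[Hashable] = set()
    for items in seen.values():
        universe |= items
    # Items in the universe that a given split never contains.
    return {name: universe - items for name, items in seen.items()}


# Toy example (hypothetical data): instrument combinations per frame.
toy_splits: Dict[str, list] = {
    "train": [frozenset({"grasper"}), frozenset({"grasper", "hook"})],
    "val": [frozenset({"grasper"})],
    "test": [frozenset({"grasper", "hook"}), frozenset({"clipper"})],
}

for split_name, missing in missing_per_split(toy_splits).items():
    print(split_name, sorted(sorted(combo) for combo in missing))
```

A non-empty result for a split signals that a model evaluated on that split is never tested on (or, for the training split, never sees) the listed combinations.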
Biomedical image analysis competitions: The state of current participation practice
Eisenmann, Matthias, Reinke, Annika, Weru, Vivienn, Tizabi, Minu Dietlinde, Isensee, Fabian, Adler, Tim J., Godau, Patrick, Cheplygina, Veronika, Kozubek, Michal, Ali, Sharib, Gupta, Anubha, Kybic, Jan, Noble, Alison, de Solórzano, Carlos Ortiz, Pachade, Samiksha, Petitjean, Caroline, Sage, Daniel, Wei, Donglai, Wilden, Elizabeth, Alapatt, Deepak, Andrearczyk, Vincent, Baid, Ujjwal, Bakas, Spyridon, Balu, Niranjan, Bano, Sophia, Bawa, Vivek Singh, Bernal, Jorge, Bodenstedt, Sebastian, Casella, Alessandro, Choi, Jinwook, Commowick, Olivier, Daum, Marie, Depeursinge, Adrien, Dorent, Reuben, Egger, Jan, Eichhorn, Hannah, Engelhardt, Sandy, Ganz, Melanie, Girard, Gabriel, Hansen, Lasse, Heinrich, Mattias, Heller, Nicholas, Hering, Alessa, Huaulmé, Arnaud, Kim, Hyunjeong, Landman, Bennett, Li, Hongwei Bran, Li, Jianning, Ma, Jun, Martel, Anne, Martín-Isla, Carlos, Menze, Bjoern, Nwoye, Chinedu Innocent, Oreiller, Valentin, Padoy, Nicolas, Pati, Sarthak, Payette, Kelly, Sudre, Carole, van Wijnen, Kimberlin, Vardazaryan, Armine, Vercauteren, Tom, Wagner, Martin, Wang, Chuanbo, Yap, Moi Hoon, Yu, Zeyun, Yuan, Chun, Zenk, Maximilian, Zia, Aneeq, Zimmerer, David, Bao, Rina, Choi, Chanyeol, Cohen, Andrew, Dzyubachyk, Oleh, Galdran, Adrian, Gan, Tianyuan, Guo, Tianqi, Gupta, Pradyumna, Haithami, Mahmood, Ho, Edward, Jang, Ikbeom, Li, Zhili, Luo, Zhengbo, Lux, Filip, Makrogiannis, Sokratis, Müller, Dominik, Oh, Young-tack, Pang, Subeen, Pape, Constantin, Polat, Gorkem, Reed, Charlotte Rosalie, Ryu, Kanghyun, Scherr, Tim, Thambawita, Vajira, Wang, Haoyu, Wang, Xinliang, Xu, Kele, Yeh, Hung, Yeo, Doyeob, Yuan, Yixuan, Zeng, Yan, Zhao, Xin, Abbing, Julian, Adam, Jannes, Adluru, Nagesh, Agethen, Niklas, Ahmed, Salman, Khalil, Yasmina Al, Alenyà, Mireia, Alhoniemi, Esa, An, Chengyang, Anwar, Talha, Arega, Tewodros Weldebirhan, Avisdris, Netanell, Aydogan, Dogu Baran, Bai, Yingbin, Calisto, Maria Baldeon, Basaran, Berke Doga, Beetz, Marcel, Bian, Cheng, Bian, Hao, Blansit, Kevin, Bloch, Louise, Bohnsack, Robert, Bosticardo, Sara, Breen, Jack, Brudfors, Mikael, Brüngel, Raphael, Cabezas, Mariano, Cacciola, Alberto, Chen, Zhiwei, Chen, Yucong, Chen, Daniel Tianming, Cho, Minjeong, Choi, Min-Kook, Xie, Chuantao Xie Chuantao, Cobzas, Dana, Cohen-Adad, Julien, Acero, Jorge Corral, Das, Sujit Kumar, de Oliveira, Marcela, Deng, Hanqiu, Dong, Guiming, Doorenbos, Lars, Efird, Cory, Escalera, Sergio, Fan, Di, Serj, Mehdi Fatan, Fenneteau, Alexandre, Fidon, Lucas, Filipiak, Patryk, Finzel, René, Freitas, Nuno R., Friedrich, Christoph M., Fulton, Mitchell, Gaida, Finn, Galati, Francesco, Galazis, Christoforos, Gan, Chang Hee, Gao, Zheyao, Gao, Shengbo, Gazda, Matej, Gerats, Beerend, Getty, Neil, Gibicar, Adam, Gifford, Ryan, Gohil, Sajan, Grammatikopoulou, Maria, Grzech, Daniel, Güley, Orhun, Günnemann, Timo, Guo, Chunxu, Guy, Sylvain, Ha, Heonjin, Han, Luyi, Han, Il Song, Hatamizadeh, Ali, He, Tian, Heo, Jimin, Hitziger, Sebastian, Hong, SeulGi, Hong, SeungBum, Huang, Rian, Huang, Ziyan, Huellebrand, Markus, Huschauer, Stephan, Hussain, Mustaffa, Inubushi, Tomoo, Polat, Ece Isik, Jafaritadi, Mojtaba, Jeong, SeongHun, Jian, Bailiang, Jiang, Yuanhong, Jiang, Zhifan, Jin, Yueming, Joshi, Smriti, Kadkhodamohammadi, Abdolrahim, Kamraoui, Reda Abdellah, Kang, Inha, Kang, Junghwa, Karimi, Davood, Khademi, April, Khan, Muhammad Irfan, Khan, Suleiman A., Khantwal, Rishab, Kim, Kwang-Ju, Kline, Timothy, Kondo, Satoshi, Kontio, Elina, Krenzer, Adrian, Kroviakov, Artem, Kuijf, Hugo, Kumar, Satyadwyoom, La Rosa, Francesco, Lad, Abhi, Lee, Doohee, Lee, Minho, Lena, Chiara, Li, Hao, Li, Ling, Li, Xingyu, Liao, Fuyuan, Liao, KuanLun, Oliveira, Arlindo Limede, Lin, Chaonan, Lin, Shan, Linardos, Akis, Linguraru, Marius George, Liu, Han, Liu, Tao, Liu, Di, Liu, Yanling, Lourenço-Silva, João, Lu, Jingpei, Lu, Jiangshan, Luengo, Imanol, Lund, Christina B., Luu, Huan Minh, Lv, Yi, Lv, Yi, Macar, Uzay, Maechler, Leon, L., Sina Mansour, Marshall, Kenji, Mazher, Moona, McKinley, Richard, Medela, Alfonso, Meissen, Felix, Meng, Mingyuan, Miller, Dylan, Mirjahanmardi, Seyed Hossein, Mishra, Arnab, Mitha, Samir, Mohy-ud-Din, Hassan, Mok, Tony Chi Wing, Murugesan, Gowtham Krishnan, Karthik, Enamundram Naga, Nalawade, Sahil, Nalepa, Jakub, Naser, Mohamed, Nateghi, Ramin, Naveed, Hammad, Nguyen, Quang-Minh, Quoc, Cuong Nguyen, Nichyporuk, Brennan, Oliveira, Bruno, Owen, David, Pal, Jimut Bahan, Pan, Junwen, Pan, Wentao, Pang, Winnie, Park, Bogyu, Pawar, Vivek, Pawar, Kamlesh, Peven, Michael, Philipp, Lena, Pieciak, Tomasz, Plotka, Szymon, Plutat, Marcel, Pourakpour, Fattaneh, Preložnik, Domen, Punithakumar, Kumaradevan, Qayyum, Abdul, Queirós, Sandro, Rahmim, Arman, Razavi, Salar, Ren, Jintao, Rezaei, Mina, Rico, Jonathan Adam, Rieu, ZunHyan, Rink, Markus, Roth, Johannes, Ruiz-Gonzalez, Yusely, Saeed, Numan, Saha, Anindo, Salem, Mostafa, Sanchez-Matilla, Ricardo, Schilling, Kurt, Shao, Wei, Shen, Zhiqiang, Shi, Ruize, Shi, Pengcheng, Sobotka, Daniel, Soulier, Théodore, Fadida, Bella Specktor, Stoyanov, Danail, Mun, Timothy Sum Hon, Sun, Xiaowu, Tao, Rong, Thaler, Franz, Théberge, Antoine, Thielke, Felix, Torres, Helena, Wahid, Kareem A., Wang, Jiacheng, Wang, YiFei, Wang, Wei, Wang, Xiong, Wen, Jianhui, Wen, Ning, Wodzinski, Marek, Wu, Ye, Xia, Fangfang, Xiang, Tianqi, Xiaofei, Chen, Xu, Lizhan, Xue, Tingting, Yang, Yuxuan, Yang, Lin, Yao, Kai, Yao, Huifeng, Yazdani, Amirsaeed, Yip, Michael, Yoo, Hwanseung, Yousefirizi, Fereshteh, Yu, Shunkai, Yu, Lei, Zamora, Jonathan, Zeineldin, Ramy Ashraf, Zeng, Dewen, Zhang, Jianpeng, Zhang, Bokai, Zhang, Jiapeng, Zhang, Fan, Zhang, Huahong, Zhao, Zhongchen, Zhao, Zixuan, Zhao, Jiachen, Zhao, Can, Zheng, Qingshuo, Zhi, Yuheng, Zhou, Ziqi, Zou, Baosheng, Maier-Hein, Klaus, Jäger, Paul F., Kopp-Schneider, Annette, Maier-Hein, Lena
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Why is the winner the best?
Eisenmann, Matthias, Reinke, Annika, Weru, Vivienn, Tizabi, Minu Dietlinde, Isensee, Fabian, Adler, Tim J., Ali, Sharib, Andrearczyk, Vincent, Aubreville, Marc, Baid, Ujjwal, Bakas, Spyridon, Balu, Niranjan, Bano, Sophia, Bernal, Jorge, Bodenstedt, Sebastian, Casella, Alessandro, Cheplygina, Veronika, Daum, Marie, de Bruijne, Marleen, Depeursinge, Adrien, Dorent, Reuben, Egger, Jan, Ellis, David G., Engelhardt, Sandy, Ganz, Melanie, Ghatwary, Noha, Girard, Gabriel, Godau, Patrick, Gupta, Anubha, Hansen, Lasse, Harada, Kanako, Heinrich, Mattias, Heller, Nicholas, Hering, Alessa, Huaulmé, Arnaud, Jannin, Pierre, Kavur, Ali Emre, Kodym, Oldřich, Kozubek, Michal, Li, Jianning, Li, Hongwei, Ma, Jun, Martín-Isla, Carlos, Menze, Bjoern, Noble, Alison, Oreiller, Valentin, Padoy, Nicolas, Pati, Sarthak, Payette, Kelly, Rädsch, Tim, Rafael-Patiño, Jonathan, Bawa, Vivek Singh, Speidel, Stefanie, Sudre, Carole H., van Wijnen, Kimberlin, Wagner, Martin, Wei, Donglai, Yamlahi, Amine, Yap, Moi Hoon, Yuan, Chun, Zenk, Maximilian, Zia, Aneeq, Zimmerer, David, Aydogan, Dogu Baran, Bhattarai, Binod, Bloch, Louise, Brüngel, Raphael, Cho, Jihoon, Choi, Chanyeol, Dou, Qi, Ezhov, Ivan, Friedrich, Christoph M., Fuller, Clifton, Gaire, Rebati Raman, Galdran, Adrian, Faura, Álvaro García, Grammatikopoulou, Maria, Hong, SeulGi, Jahanifar, Mostafa, Jang, Ikbeom, Kadkhodamohammadi, Abdolrahim, Kang, Inha, Kofler, Florian, Kondo, Satoshi, Kuijf, Hugo, Li, Mingxing, Luu, Minh Huan, Martinčič, Tomaž, Morais, Pedro, Naser, Mohamed A., Oliveira, Bruno, Owen, David, Pang, Subeen, Park, Jinah, Park, Sung-Hong, Płotka, Szymon, Puybareau, Elodie, Rajpoot, Nasir, Ryu, Kanghyun, Saeed, Numan, Shephard, Adam, Shi, Pengcheng, Štepec, Dejan, Subedi, Ronast, Tochon, Guillaume, Torres, Helena R., Urien, Helene, Vilaça, João L., Wahid, Kareem Abdul, Wang, Haojie, Wang, Jiacheng, Wang, Liansheng, Wang, Xiyue, Wiestler, Benedikt, Wodzinski, Marek, Xia, Fangfang, Xie, Juanying, Xiong, Zhiwei, Yang, Sen, Yang, Yanwu, Zhao, Zixuan, Maier-Hein, Klaus, Jäger, Paul F., Kopp-Schneider, Annette, Maier-Hein, Lena
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
mvHOTA: A multi-view higher order tracking accuracy metric to measure spatial and temporal associations in multi-point detection
Sharan, Lalith, Kelm, Halvar, Romano, Gabriele, Karck, Matthias, De Simone, Raffaele, Engelhardt, Sandy
Multi-point tracking is a challenging task that involves detecting points in the scene and tracking them across a sequence of frames. Computing detection-based measures like the F-measure on a frame-by-frame basis is not sufficient to assess the overall performance, as it does not interpret performance in the temporal domain. The main evaluation metric available comes from Multi-object tracking (MOT) methods to benchmark performance on datasets such as KITTI with the recently proposed higher order tracking accuracy (HOTA) metric, which is capable of providing a better description of the performance over metrics such as MOTA, DetA, and IDF1. While the HOTA metric takes into account temporal associations, it does not provide a tailored means to analyse the spatial associations of a dataset in a multi-camera setup. Moreover, there are differences in evaluating the detection task for points when compared to objects (point distances vs. bounding box overlap). Therefore, in this work, we propose a multi-view higher order tracking metric (mvHOTA) to determine the accuracy of multi-point (multi-instance and multi-class) tracking methods, while taking into account temporal and spatial associations. mvHOTA can be interpreted as the geometric mean of detection, temporal, and spatial associations, thereby providing equal weighting to each of the factors. We demonstrate the use of this metric to evaluate the tracking performance on an endoscopic point detection dataset from a previously organised surgical data science challenge. Furthermore, we compare with other adjusted MOT metrics for this use-case, discuss the properties of mvHOTA, and show how the proposed multi-view Association and the Occlusion index (OI) facilitate analysis of methods with respect to handling of occlusions. The code is available at https://github.com/Cardio-AI/mvhota.
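The geometric-mean interpretation mentioned in this abstract can be illustrated with a short Python sketch. It only shows how three scores in [0, 1] combine with equal weighting; the function and argument names are hypothetical, and the point matching, thresholds, and per-view computations of the actual metric are not reproduced here (see the linked repository for the authors' implementation).

```python
def geometric_mean_of_three(
    detection: float, temporal_assoc: float, spatial_assoc: float
) -> float:
    """Combine three accuracy scores in [0, 1] by their geometric mean.

    Each factor contributes with equal weight, and a score of zero in any
    single factor drives the combined score to zero.
    """
    scores = {
        "detection": detection,
        "temporal_assoc": temporal_assoc,
        "spatial_assoc": spatial_assoc,
    }
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # Cube root of the product of the three factors.
    return (detection * temporal_assoc * spatial_assoc) ** (1.0 / 3.0)


# Illustrative values only (not results from the paper).
print(geometric_mean_of_three(0.8, 0.6, 0.9))
```

Unlike an arithmetic mean, this combination cannot hide a complete failure in one factor behind strong scores in the other two.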
Medical Diffusion: Denoising Diffusion Probabilistic Models for 3D Medical Image Generation
Khader, Firas, Mueller-Franzes, Gustav, Arasteh, Soroosh Tayebi, Han, Tianyu, Haarburger, Christoph, Schulze-Hagen, Maximilian, Schad, Philipp, Engelhardt, Sandy, Baessler, Bettina, Foersch, Sebastian, Stegmaier, Johannes, Kuhl, Christiane, Nebelung, Sven, Kather, Jakob Nikolas, Truhn, Daniel
Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models in particular have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen and Stable Diffusion. However, their use in medicine, where image data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. Here we show that diffusion probabilistic models can synthesize high-quality medical imaging data, which we show for Magnetic Resonance Images (MRI) and Computed Tomography (CT) images. We provide quantitative measurements of their performance through a reader study with two medical experts who rated the quality of the synthesized images in three categories: realistic image appearance, anatomical correctness and consistency between slices. Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice score 0.91 vs. 0.95 without vs. with synthetic data). The code is publicly available on GitHub: https://github.com/FirasGit/medicaldiffusion.