Jaus, Alexander
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision
Li, Jianning, Zhou, Zongwei, Yang, Jiancheng, Pepe, Antonio, Gsaxner, Christina, Luijten, Gijs, Qu, Chongyu, Zhang, Tiezheng, Chen, Xiaoxi, Li, Wenxuan, Wodzinski, Marek, Friedrich, Paul, Xie, Kangxian, Jin, Yuan, Ambigapathy, Narmada, Nasca, Enrico, Solak, Naida, Melito, Gian Marco, Vu, Viet Duc, Memon, Afaque R., Schlachta, Christopher, De Ribaupierre, Sandrine, Patel, Rajnikant, Eagleson, Roy, Chen, Xiaojun, Mächler, Heinrich, Kirschke, Jan Stefan, de la Rosa, Ezequiel, Christ, Patrick Ferdinand, Li, Hongwei Bran, Ellis, David G., Aizenberg, Michele R., Gatidis, Sergios, Küstner, Thomas, Shusharina, Nadya, Heller, Nicholas, Andrearczyk, Vincent, Depeursinge, Adrien, Hatt, Mathieu, Sekuboyina, Anjany, Löffler, Maximilian, Liebl, Hans, Dorent, Reuben, Vercauteren, Tom, Shapey, Jonathan, Kujawa, Aaron, Cornelissen, Stefan, Langenhuizen, Patrick, Ben-Hamadou, Achraf, Rekik, Ahmed, Pujades, Sergi, Boyer, Edmond, Bolelli, Federico, Grana, Costantino, Lumetti, Luca, Salehi, Hamidreza, Ma, Jun, Zhang, Yao, Gharleghi, Ramtin, Beier, Susann, Sowmya, Arcot, Garza-Villarreal, Eduardo A., Balducci, Thania, Angeles-Valdez, Diego, Souza, Roberto, Rittner, Leticia, Frayne, Richard, Ji, Yuanfeng, Ferrari, Vincenzo, Chatterjee, Soumick, Dubost, Florian, Schreiber, Stefanie, Mattern, Hendrik, Speck, Oliver, Haehn, Daniel, John, Christoph, Nürnberger, Andreas, Pedrosa, João, Ferreira, Carlos, Aresta, Guilherme, Cunha, António, Campilho, Aurélio, Suter, Yannick, Garcia, Jose, Lalande, Alain, Vandenbossche, Vicky, Van Oevelen, Aline, Duquesne, Kate, Mekhzoum, Hamza, Vandemeulebroucke, Jef, Audenaert, Emmanuel, Krebs, Claudia, van Leeuwen, Timo, Vereecke, Evie, Heidemeyer, Hauke, Röhrig, Rainer, Hölzle, Frank, Badeli, Vahid, Krieger, Kathrin, Gunzer, Matthias, Chen, Jianxu, van Meegdenburg, Timo, Dada, Amin, Balzer, Miriam, Fragemann, Jana, Jonske, Frederic, Rempe, Moritz, Malorodov, Stanislav, Bahnsen, Fin H., Seibold, Constantin, Jaus, Alexander, Marinov, Zdravko, Jaeger, Paul F., Stiefelhagen, Rainer, Santos, Ana Sofia, Lindo, Mariana, Ferreira, André, Alves, Victor, Kamp, Michael, Abourayya, Amr, Nensa, Felix, Hörst, Fabian, Brehmer, Alexander, Heine, Lukas, Hanusrichter, Yannik, Weßling, Martin, Dudda, Marcel, Podleska, Lars E., Fink, Matthias A., Keyl, Julius, Tserpes, Konstantinos, Kim, Moon-Sung, Elhabian, Shireen, Lamecker, Hans, Zukić, Dženan, Paniagua, Beatriz, Wachinger, Christian, Urschler, Martin, Duong, Luc, Wasserthal, Jakob, Hoyer, Peter F., Basu, Oliver, Maal, Thomas, Witjes, Max J. H., Schiele, Gregor, Chang, Ti-chiun, Ahmadi, Seyed-Ahmad, Luo, Ping, Menze, Bjoern, Reyes, Mauricio, Deserno, Thomas M., Davatzikos, Christos, Puladi, Behrus, Fua, Pascal, Yuille, Alan L., Kleesiek, Jens, Egger, Jan
Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is evident from the numerous shape-related publications at premier vision conferences as well as from the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we model the majority of shapes directly from the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as for various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the fields of brain tumor classification, facial and skull reconstruction, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
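Since the shapes are intended for voxel-, mesh-, and point-cloud-based vision pipelines, the following is a minimal sketch of how a downloaded shape could be converted into these representations. It is not the official MedShapeNet API; the file name, sample count, and voxel pitch are placeholder assumptions.

```python
# Minimal sketch (not the official MedShapeNet API): load a downloaded surface
# mesh and derive point-cloud and voxel-grid representations from it.
import numpy as np
import trimesh

# Assume a surface mesh (e.g., an STL file) was downloaded via the web interface.
mesh = trimesh.load("example_skull.stl")        # hypothetical file name

# Point-cloud representation: sample points uniformly on the surface.
points, _ = trimesh.sample.sample_surface(mesh, count=2048)

# Voxel-grid representation: rasterize the mesh at a chosen resolution.
voxels = mesh.voxelized(pitch=1.0)              # 1.0 mm pitch is an assumption
occupancy = voxels.matrix.astype(np.uint8)      # binary occupancy grid

print(points.shape, occupancy.shape)
```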
Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling
Seibold, Constantin, Jaus, Alexander, Fink, Matthias A., Kim, Moon, Reiß, Simon, Herrmann, Ken, Kleesiek, Jens, Stiefelhagen, Rainer
Purpose: Interpreting chest radiographs (CXR) remains challenging due to the ambiguity of overlapping structures such as the lungs, heart, and bones. To address this issue, we propose a novel method for extracting fine-grained anatomical structures in CXR using pseudo-labeling of three-dimensional computed tomography (CT) scans. Methods: We created a large-scale dataset of 10,021 thoracic CTs with 157 labels and applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. These labels were projected onto a two-dimensional plane, analogous to a CXR, allowing the training of detailed semantic segmentation models for CXR without any manual annotation effort. Results: Our anatomical segmentations allowed for the accurate extraction of relevant explainable medical features such as the cardiothoracic ratio. Conclusion: Our method of volumetric pseudo-labeling paired with CT projection offers a promising approach for detailed anatomical segmentation of CXR with a high agreement with human annotators. This technique may have important clinical implications, particularly in the analysis of various thoracic pathologies.

Chest radiographs (CXR) are one of the most common diagnostic imaging methods for patients with respiratory or cardiovascular conditions, with more than 130 million studies performed annually in Germany alone [1]. By using ionizing radiation to penetrate the body, CXR provide a visual representation of the organs, tissues, and cavities and their current state. The interpretation of these images is challenging, since it requires a thorough understanding of human anatomy due to the presence of overlapping structures that can obscure potential pathological changes and other abnormalities. Despite these challenges, CXR remain a standard diagnostic procedure, and their quantitative analysis can be time-consuming. With the increasing demand for imaging procedures and the massive workload that comes with it, this can lead to avoidable errors due to rushed examinations [2, 3] or to burnout from overstraining doctors [4-6]. Recent advances in computer vision, such as convolutional neural networks (CNN) or vision transformers (ViT), have the potential to reduce the workload of radiologists in both image analysis and reporting [7, 8].
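To make the projection idea concrete, here is a minimal sketch assuming a CT label volume stored as a NumPy array with one binary mask per structure; the axis choice, array shapes, and the simplified cardiothoracic-ratio computation are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch: collapse 3D CT pseudo-labels into 2D CXR-like masks and
# compute a simplified cardiothoracic ratio (CTR) from the projections.
import numpy as np

def project_label(volume_mask: np.ndarray, axis: int = 1) -> np.ndarray:
    """Collapse a 3D binary mask (z, y, x) into a 2D pseudo-label via max projection."""
    return volume_mask.max(axis=axis).astype(np.uint8)

def cardiothoracic_ratio(heart_2d: np.ndarray, thorax_2d: np.ndarray) -> float:
    """Ratio of the widest horizontal extent of the heart to that of the thorax."""
    def max_width(mask: np.ndarray) -> int:
        cols = np.where(mask.any(axis=0))[0]    # columns containing the structure
        return int(cols.max() - cols.min() + 1) if cols.size else 0
    return max_width(heart_2d) / max(max_width(thorax_2d), 1)

# Toy example: simple boxes stand in for real segmentations.
heart_3d = np.zeros((64, 64, 64), dtype=bool); heart_3d[20:40, 25:45, 24:40] = True
thorax_3d = np.zeros((64, 64, 64), dtype=bool); thorax_3d[10:55, 20:50, 8:56] = True
ctr = cardiothoracic_ratio(project_label(heart_3d), project_label(thorax_3d))
print(f"CTR = {ctr:.2f}")
```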
Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning
Jaus, Alexander, Yang, Kailun, Stiefelhagen, Rainer
Figure 1: Within this work, we differentiate between various levels of image understanding: the original image (first row, left) can be interpreted as a panoramic semantic map (second row, left) by assigning a label to each pixel without differentiating between different instances of countable objects. Instances of countable objects are distinguished in the panoramic instance understanding (second row, right). The panoramic panoptic understanding (first row, right), which is the proposed method in this paper, builds on top of the previous understandings by eliminating their shortcomings: where possible, different instances are distinguished, and we guarantee that a label is assigned to each pixel.

In this work, we introduce panoramic panoptic segmentation by combining supervised and contrastive training. A complete surrounding understanding provides a maximum of information to a mobile agent, and panoptic segmentation is the so far most complete segmentation task to describe the context of an image [1]. A restricted view poses severe problems due to the lack of information covering the entire surrounding and the resulting inability of the agent to make proper decisions, which may even lead to accidents [5]. Thus, both pieces of information are equally important: the image should cover the entire surrounding (a complete Field of View), and every pixel should be assigned a label. The domain shift from pinhole to panoramic images is no exception: panoramic properties have not been observed by the model during training, which makes their correct segmentation challenging. We therefore propose the Panoramic Robust Feature (PRF) framework, which allows us to generate robust backbones via a contrastive pretext task. It does not only encourage similar features to be represented in a similar manner but, more importantly, pushes dissimilar features away from each other [16], [17]. This leads to well-separated clusters in the latent space of the backbone, which proves to mitigate performance drops under distribution shift.
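As an illustration of the contrastive idea, the following is a minimal sketch of a generic InfoNCE-style objective that pulls representations of the same sample together and pushes dissimilar ones apart. This is a standard formulation, not necessarily the exact pixel-level loss used by the PRF framework; the batch size and temperature are arbitrary.

```python
# Minimal sketch of a generic InfoNCE-style contrastive objective: matching
# pairs (diagonal) are attracted, all other pairs are repelled in embedding space.
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (N, D) embeddings of two views of the same N samples."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature      # (N, N) cosine similarities
    targets = torch.arange(z1.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random features in place of backbone outputs.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(info_nce(z1, z2).item())
```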