Schinz, David
Enhancing Interpretability of Vertebrae Fracture Grading using Human-interpretable Prototypes
Sinhamahapatra, Poulami; Shit, Suprosanna; Sekuboyina, Anjany; Husseini, Malek; Schinz, David; Lenhart, Nicolas; Menze, Joern; Kirschke, Jan; Roscher, Karsten; Guennemann, Stephan
Vertebral fracture grading, the task of classifying the severity of vertebral fractures, is challenging in medical imaging and has recently been approached with Deep Learning (DL) models. Only a few works have attempted to make such models human-interpretable, despite the need for transparency and trustworthiness in critical use cases like DL-assisted medical diagnosis. Moreover, such models rely either on post-hoc methods or on additional annotations. In this work, we propose a novel interpretable-by-design method, ProtoVerse, to find relevant sub-parts of vertebral fractures (prototypes) that reliably explain the model's decision in a human-understandable way. Specifically, we introduce a novel diversity-promoting loss to mitigate prototype repetitions in small datasets with intricate semantics. We experimented with the VerSe'19 dataset and outperformed the existing prototype-based method. Further, our model provides better interpretability than the post-hoc method.
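The abstract does not spell out the form of the diversity-promoting loss. As a rough illustration of the general idea only, the sketch below penalizes pairs of prototype vectors that point in similar directions, so that repeated prototypes incur a cost. The function names, the cosine-similarity formulation, and the `margin` parameter are all assumptions made for this example and are not taken from ProtoVerse.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diversity_penalty(prototypes, margin=0.0):
    """Hypothetical diversity term: mean of max(0, cos - margin) over all
    distinct prototype pairs. Near-duplicate prototypes give a value near 1,
    mutually orthogonal prototypes give 0."""
    pairs = [
        (i, j)
        for i in range(len(prototypes))
        for j in range(i + 1, len(prototypes))
    ]
    if not pairs:
        return 0.0
    total = sum(
        max(0.0, cosine(prototypes[i], prototypes[j]) - margin)
        for i, j in pairs
    )
    return total / len(pairs)


# Toy prototypes (made-up values): one repeated pair, one diverse pair.
repeated = [[1.0, 0.0], [2.0, 0.0]]
diverse = [[1.0, 0.0], [0.0, 1.0]]
penalty_repeated = diversity_penalty(repeated)
penalty_diverse = diversity_penalty(diverse)
```

In a training setup, a term of this kind would typically be added to the classification loss with a small weight, so that prototypes of the same class are pushed apart without dominating the main objective.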
Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading
Keicher, Matthias; Atad, Matan; Schinz, David; Gersing, Alexandra S.; Foreman, Sarah C.; Goller, Sophia S.; Weissinger, Juergen; Rischewski, Jon; Dietrich, Anna-Sophia; Wiestler, Benedikt; Kirschke, Jan S.; Navab, Nassir
Vertebral fractures are a consequence of osteoporosis, with significant health implications for affected patients. Unfortunately, grading their severity using CT exams is hard and subjective, motivating automated grading methods. However, current approaches are hindered by imbalance and scarcity of data and a lack of interpretability. To address these challenges, this paper proposes a novel approach that leverages unlabelled data to train a generative Diffusion Autoencoder (DAE) model as an unsupervised feature extractor. We model fracture grading as a continuous regression, which is more reflective of the smooth progression of fractures. Specifically, we use a binary, supervised fracture classifier to construct a hyperplane in the DAE's latent space. We then regress the severity of the fracture as a function of the distance to this hyperplane, calibrating the results to the Genant scale. Importantly, the generative nature of our method allows us to visualize different grades of a given vertebra, providing interpretability and insight into the features that contribute to automated grading.
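The regression step described above can be sketched as follows, under stated assumptions: a linear binary classifier with weights `w` and bias `b` defines the hyperplane, the signed distance of a latent code `z` to it is (w·z + b)/||w||, and a one-dimensional least-squares line maps that distance to a continuous grade. The linear calibration, the clipping to the Genant range of 0 to 3, and all names and toy values are assumptions for illustration; the abstract does not specify the calibration function.

```python
import math


def signed_distance(z, w, b):
    """Signed Euclidean distance from latent vector z to the hyperplane w.z + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * zi for wi, zi in zip(w, z)) + b) / norm


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept in one dimension."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_grade(z, w, b, slope, intercept):
    """Map the hyperplane distance to a continuous grade, clipped to [0, 3]
    (clipping is an assumption made for this sketch)."""
    d = signed_distance(z, w, b)
    return min(3.0, max(0.0, slope * d + intercept))


# Toy setup (made-up values): hyperplane from a hypothetical binary classifier
# and four latent codes with known Genant grades used for calibration.
w, b = [3.0, 4.0], -5.0
latents = [[1.0, 0.5], [3.0, 4.0], [5.0, 7.5], [7.0, 11.0]]
grades = [0.0, 1.0, 2.0, 3.0]

distances = [signed_distance(z, w, b) for z in latents]
slope, intercept = fit_line(distances, grades)
grade = predict_grade([4.0, 5.75], w, b, slope, intercept)
```

In the setting the abstract describes, `z` would be the semantic latent code produced by the Diffusion Autoencoder for a vertebra, and `w`, `b` would come from the supervised binary fracture classifier trained on those codes.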