An Autoencoder and Vision Transformer-based Interpretability Analysis of the Differences in Automated Staging of Second and Third Molars

Buyukcakir, Barkin, De Tobel, Jannick, Thevissen, Patrick, Vandermeulen, Dirk, Claes, Peter

Sep-15-2025–arXiv.org Artificial Intelligence

The practical adoption of deep learning in high-stakes forensic applications, such as dental age estimation, is often limited by the 'black box' nature of the models. This study introduces a framework designed to enhance both performance and transparency in this context. We use a notable performance disparity in the automated staging of mandibular second (tooth 37) and third (tooth 38) molars as a case study. The proposed framework, which combines a convolutional autoencoder (AE) with a Vision Transformer (ViT), improves classification accuracy for both teeth over a baseline ViT, raising it from 0.712 to 0.815 for tooth 37 and from 0.462 to 0.543 for tooth 38. Beyond improving performance, the framework provides multi-faceted diagnostic insights. Analysis of the AE's latent space metrics and image reconstructions indicates that the remaining performance gap is data-centric, suggesting that high intra-class morphological variability in the tooth 38 dataset is a primary limiting factor. This work highlights the insufficiency of relying on a single mode of interpretability, such as attention maps, which can appear anatomically plausible yet fail to reveal underlying data issues. By offering a methodology that both improves accuracy and provides evidence for why a model may be uncertain, this framework serves as a more robust tool to support expert decision-making in forensic age estimation.
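To put the reported accuracies side by side, the short sketch below computes the absolute and relative gains for each tooth and the gap between the two teeth before and after applying the framework. Only the four accuracy values come from the abstract; the variable and function names (`reported`, `gains`, `summary`) are illustrative assumptions, and the rounding to three decimals is a presentation choice.

```python
# Accuracies as reported in the abstract (baseline ViT vs. AE + ViT framework).
reported = {
    "tooth 37": {"baseline": 0.712, "framework": 0.815},
    "tooth 38": {"baseline": 0.462, "framework": 0.543},
}


def gains(baseline, framework):
    """Return (absolute gain, relative gain), each rounded to 3 decimals."""
    absolute = framework - baseline
    relative = absolute / baseline
    return round(absolute, 3), round(relative, 3)


# Per-tooth improvement of the framework over the baseline ViT.
summary = {tooth: gains(**acc) for tooth, acc in reported.items()}

# Accuracy gap between tooth 37 and tooth 38 under each model.
gap_baseline = round(
    reported["tooth 37"]["baseline"] - reported["tooth 38"]["baseline"], 3
)
gap_framework = round(
    reported["tooth 37"]["framework"] - reported["tooth 38"]["framework"], 3
)

for tooth, (absolute, relative) in summary.items():
    print(f"{tooth}: +{absolute:.3f} absolute, +{relative:.1%} relative")
print(f"gap (37 vs 38): baseline {gap_baseline:.3f}, framework {gap_framework:.3f}")
```

On these numbers, tooth 38 gains slightly more in relative terms, while the absolute gap between the two teeth stays at roughly a quarter of an accuracy point, which is consistent with the abstract's reference to a remaining performance gap.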

artificial intelligence, deep learning, machine learning

Country:
- Europe > Belgium (0.14)

Genre:
- Research Report > New Finding (0.93)

Industry:
- Law (0.93)
- Information Technology (0.68)
- Health & Medicine > Diagnostic Medicine
  - Imaging (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)