

When CNNs Outperform Transformers and Mambas: Revisiting Deep Architectures for Dental Caries Segmentation

Ghimire, Aashish, Zeng, Jun, Paudel, Roshan, Tomar, Nikhil Kumar, Nayak, Deepak Ranjan, Nalla, Harshith Reddy, Jha, Vivek, Reynolds, Glenda, Jha, Debesh

arXiv.org Artificial Intelligence

Accurate identification and segmentation of dental caries in panoramic radiographs are critical for early diagnosis and effective treatment planning. Automated segmentation remains challenging due to low lesion contrast, morphological variability, and limited annotated data. In this study, we present the first comprehensive benchmarking of convolutional neural networks, vision transformers, and state-space Mamba architectures for automated dental caries segmentation on panoramic radiographs, using the DC1000 dataset. Twelve state-of-the-art architectures, including VMUnet, MambaUNet, VMUNetv2, RMAMamba-S, TransNetR, PVTFormer, DoubleU-Net, and ResUNet++, were trained under identical configurations. Results reveal that, contrary to the growing trend toward complex attention-based architectures, the CNN-based DoubleU-Net achieved the highest Dice coefficient of 0.7345, mIoU of 0.5978, and precision of 0.8145, outperforming all transformer and Mamba variants. The top three results across all performance metrics were achieved by CNN-based architectures. Mamba- and transformer-based methods, despite their theoretical advantage in global context modeling, underperformed due to limited data and weaker spatial priors. These findings underscore that architecture-task alignment matters more than model complexity in domain-specific medical image segmentation. Our code is available at: https://github.com/JunZengz/dental-caries-segmentation.
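The Dice coefficient and mIoU reported above are standard overlap metrics for segmentation; a minimal NumPy sketch of how they are computed on binary masks (the toy masks below are illustrative, not from the DC1000 dataset):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """IoU = |A ∩ B| / |A ∪ B|; mIoU averages this over classes or images."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

# toy 4x4 predicted and ground-truth lesion masks
pred = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
gt   = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
print(round(dice_coefficient(pred, gt), 4))  # 0.8571
print(round(iou(pred, gt), 4))               # 0.75
```

Note that Dice is always at least as large as IoU for the same masks, which is why the paper's Dice of 0.7345 exceeds its mIoU of 0.5978.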


Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges

Zhou, Zhenhuan, Zhu, Jingbo, Zhang, Yuchen, Guan, Xiaohang, Wang, Peng, Li, Tao

arXiv.org Artificial Intelligence

Efficient analysis and processing of dental images are crucial for dentists to achieve accurate diagnosis and optimal treatment planning. However, dental imaging inherently poses several challenges, such as low contrast, metallic artifacts, and variations in projection angles. Combined with the subjectivity arising from differences in clinicians' expertise, manual interpretation often proves time-consuming and prone to inconsistency. Artificial intelligence (AI)-based automated dental image analysis (DIA) offers a promising solution to these issues and has become an integral part of computer-aided dental diagnosis and treatment. Among various AI technologies, deep learning (DL) stands out as the most widely applied and influential approach due to its superior feature extraction and representation capabilities. To comprehensively summarize recent progress in this field, we focus on the two fundamental aspects of DL research-datasets and models. In this paper, we systematically review 260 studies on DL applications in DIA, including 49 papers on publicly available dental datasets and 211 papers on DL-based algorithms. We first introduce the basic concepts of dental imaging and summarize the characteristics and acquisition methods of existing datasets. Then, we present the foundational techniques of DL and categorize relevant models and algorithms according to different DIA tasks, analyzing their network architectures, optimization strategies, training methods, and performance. Furthermore, we summarize commonly used training and evaluation metrics in the DIA domain. Finally, we discuss the current challenges of existing research and outline potential future directions. We hope that this work provides a valuable and systematic reference for researchers in this field. All supplementary materials and detailed comparison tables will be made publicly available on GitHub.


Generating Findings for Jaw Cysts in Dental Panoramic Radiographs Using GPT-4o: Building a Two-Stage Self-Correction Loop with Structured Output (SLSO) Framework

Hosokawa, Nanaka, Takahashi, Ryo, Kitano, Tomoya, Iida, Yukihiro, Muramatsu, Chisako, Hayashi, Tatsuro, Seino, Yuta, Zhou, Xiangrong, Hara, Takeshi, Katsumata, Akitoshi, Fujita, Hiroshi

arXiv.org Artificial Intelligence

In this study, we utilized the multimodal capabilities of OpenAI GPT-4o to automatically generate jaw cyst findings on dental panoramic radiographs. To improve accuracy, we constructed a Self-correction Loop with Structured Output (SLSO) framework and verified its effectiveness. A 10-step process was implemented for 22 cases of jaw cysts, including image input and analysis, structured data generation, tooth number extraction and consistency checking, iterative regeneration when inconsistencies were detected, and finding generation with subsequent restructuring and consistency verification. A comparative experiment was conducted using the conventional Chain-of-Thought (CoT) method across seven evaluation items: transparency, internal structure, borders, root resorption, tooth movement, relationships with other structures, and tooth number. The results showed that the proposed SLSO framework improved output accuracy for many items, with improvement rates of 66.9%, 33.3%, and 28.6% for tooth number, tooth movement, and root resorption, respectively. In the successful cases, a consistently structured output was achieved after up to five regenerations. Although statistical significance was not reached because of the small size of the dataset, the overall SLSO framework enforced negative finding descriptions, suppressed hallucinations, and improved tooth number identification accuracy. However, accurate identification of extensive lesions spanning multiple teeth remains limited, and further refinement is required to enhance overall performance and move toward a practical finding-generation system.
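The regenerate-until-consistent control flow described above can be sketched as a bounded loop. The `generate` and `consistent` functions below are hypothetical stand-ins (the real framework calls GPT-4o and cross-checks tooth numbers between the findings text and the structured data); here a mock model succeeds on its third attempt:

```python
MAX_REGENERATIONS = 5  # the abstract reports consistency after up to five regenerations

def consistent(findings: dict) -> bool:
    """Hypothetical check: tooth numbers in the findings text must match
    those extracted from the structured output."""
    return set(findings["teeth_in_text"]) == set(findings["teeth_structured"])

def generate(attempt: int) -> dict:
    """Stand-in for a GPT-4o call returning structured findings.
    The first two mock attempts disagree on tooth numbers; the third agrees."""
    if attempt < 2:
        return {"teeth_in_text": [36, 37], "teeth_structured": [36]}
    return {"teeth_in_text": [36, 37], "teeth_structured": [36, 37]}

def slso_loop():
    for attempt in range(MAX_REGENERATIONS + 1):
        findings = generate(attempt)
        if consistent(findings):
            return findings, attempt
    raise RuntimeError("no consistent output within the regeneration budget")

findings, attempts = slso_loop()
print(attempts)  # 2 regenerations before a consistent output in this mock
```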


Impact of Labeling Inaccuracy and Image Noise on Tooth Segmentation in Panoramic Radiographs using Federated, Centralized and Local Learning

Rubak, Johan Andreas Balle, Naveed, Khuram, Jain, Sanyam, Esterle, Lukas, Iosifidis, Alexandros, Pauwels, Ruben

arXiv.org Artificial Intelligence

Objectives: Federated learning (FL) may mitigate privacy constraints, heterogeneous data quality, and inconsistent labeling in dental diagnostic AI. We compared FL with centralized (CL) and local learning (LL) for tooth segmentation in panoramic radiographs across multiple data corruption scenarios. Methods: An Attention U-Net was trained on 2066 radiographs from six institutions across four settings: baseline (unaltered data); label manipulation (dilated/missing annotations); image-quality manipulation (additive Gaussian noise); and exclusion of a faulty client with corrupted data. FL was implemented via the Flower AI framework. Per-client training- and validation-loss trajectories were monitored for anomaly detection, and a set of metrics (Dice, IoU, HD, HD95, and ASSD) was evaluated on a hold-out test set. Statistical significance was assessed with the Wilcoxon signed-rank test. CL and LL served as comparators. Results: Baseline: FL achieved a median Dice of 0.94889 (ASSD: 1.33229), slightly better than CL at 0.94706 (ASSD: 1.37074) and LL at 0.93557-0.94026 (ASSD: 1.51910-1.69777). Label manipulation: FL maintained the best median Dice score at 0.94884 (ASSD: 1.46487) versus CL's 0.94183 (ASSD: 1.75738) and LL's 0.93003-0.94026 (ASSD: 1.51910-2.11462). Image noise: FL led with a Dice of 0.94853 (ASSD: 1.31088); CL scored 0.94787 (ASSD: 1.36131); LL ranged from 0.93179-0.94026 (ASSD: 1.51910-1.77350). Faulty-client exclusion: FL reached a Dice of 0.94790 (ASSD: 1.33113), better than CL's 0.94550 (ASSD: 1.39318). Loss-curve monitoring reliably flagged the corrupted site. Conclusions: FL matches or exceeds CL and outperforms LL across corruption scenarios while preserving privacy. Per-client loss trajectories provide an effective anomaly-detection mechanism and support FL as a practical, privacy-preserving approach for scalable clinical AI deployment.
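The paired significance testing described above compares per-image metrics for two training strategies on the same test set; a minimal sketch with SciPy's `wilcoxon` (the Dice scores below are synthetic placeholders, not the study's data):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)

# hypothetical per-image Dice scores for two strategies on the same hold-out set
dice_fl = np.clip(rng.normal(0.949, 0.010, size=30), 0.0, 1.0)  # "federated"
dice_cl = dice_fl - rng.normal(0.002, 0.003, size=30)           # "centralized", slightly lower

# the Wilcoxon signed-rank test operates on the paired differences
stat, p = wilcoxon(dice_fl, dice_cl)
print(f"W={stat:.1f}, p={p:.4f}")
```

Because the test is paired and rank-based, it makes no normality assumption about the Dice distributions, which is why it is a common choice for comparing segmentation models image by image.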


Advanced Deep Learning Techniques for Classifying Dental Conditions Using Panoramic X-Ray Images

Golkarieh, Alireza, Kiashemshaki, Kiana, Boroujeni, Sajjad Rezvani

arXiv.org Artificial Intelligence

This study aimed to develop and evaluate multiple deep learning approaches for automated classification of dental conditions in panoramic radiographs, comparing the performance of custom convolutional neural networks (CNNs), hybrid CNN-machine learning models, and fine-tuned pre-trained architectures for detecting fillings, cavities, implants, and impacted teeth. A dataset of 1,512 panoramic dental X-ray images containing 11,137 annotations across four dental conditions was employed, with class imbalance addressed through random down-sampling to create a balanced dataset of 894 samples per condition. Multiple computational approaches were implemented and evaluated using 5-fold cross-validation, including a custom CNN architecture, hybrid models combining CNN feature extraction with traditional machine learning classifiers (Support Vector Machine, Decision Tree, and Random Forest), and three fine-tuned pre-trained architectures (VGG16, Xception, and ResNet50). Performance evaluation was conducted using standard classification metrics including accuracy, precision, recall, and F1-score. The hybrid CNN-Random Forest model achieved the highest performance with 85.4 ± 2.3% accuracy, representing an 11 percentage point improvement over the custom CNN baseline (74.29%). Among pre-trained architectures, VGG16 demonstrated superior performance with 82.3 ± 2.0% accuracy, followed by Xception (80.9 ± 2.3%) and ResNet50 (79.5 ± 2.7%). The CNN+Random Forest model exhibited exceptional performance for fillings detection (F1-score: 0.860 ± 0.033) and maintained balanced classification across all dental conditions. Systematic misclassification patterns were observed between morphologically similar conditions, particularly cavity-implant and cavity-impacted tooth categories, highlighting the inherent challenges in distinguishing overlapping dental pathologies. Hybrid CNN-based approaches, particularly the combination of CNN feature extraction with Random Forest classification, provide enhanced discriminative capability for automated dental condition detection compared to standalone architectures.
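The class-balancing step described above (random down-sampling of each condition to the size of the rarest one) can be sketched as follows; the per-class counts are illustrative assumptions, chosen only so they sum to the reported 11,137 annotations:

```python
import random

random.seed(42)

# hypothetical per-condition annotation counts (only the 11,137 total and the
# 894 minimum come from the abstract; the split is illustrative)
counts = {"filling": 4500, "cavity": 2900, "implant": 2843, "impacted": 894}
annotations = {c: [f"{c}_{i}" for i in range(n)] for c, n in counts.items()}

# down-sample every class to the size of the smallest one
target = min(counts.values())  # 894
balanced = {c: random.sample(items, target) for c, items in annotations.items()}

print({c: len(v) for c, v in balanced.items()})  # every class now holds 894 samples
```

Down-sampling discards majority-class data, so it trades training-set size for balance; the alternative (over-sampling or class-weighted losses) keeps all annotations at the cost of possible duplication bias.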


PanoGAN: A Deep Generative Model for Panoramic Dental Radiographs

Pedersen, Soren, Jain, Sanyam, Chavez, Mikkel, Ladehoff, Viktor, de Freitas, Bruna Neves, Pauwels, Ruben

arXiv.org Artificial Intelligence

This paper presents the development of a generative adversarial network (GAN) for synthesizing dental panoramic radiographs. Although exploratory in nature, the study aims to address the scarcity of data in dental research and education. We trained a deep convolutional GAN (DCGAN) using a Wasserstein loss with gradient penalty (WGAN-GP) on a dataset of 2322 radiographs of varying quality. The focus was on the dentoalveolar regions; other anatomical structures were cropped out. Extensive preprocessing and data cleaning were performed to standardize the inputs while preserving anatomical variability. We explored four candidate models by varying critic iterations, feature depth, and the use of denoising prior to training. A clinical expert evaluated the generated radiographs based on anatomical visibility and realism, using a 5-point scale (1 = very poor, 5 = excellent). Most images showed moderate anatomical depiction, although some were degraded by artifacts. A trade-off was observed: the model trained on non-denoised data yielded finer details, especially in structures like the mandibular canal and trabecular bone, while a model trained on denoised data offered superior overall image clarity and sharpness. These findings provide a foundation for future work on GAN-based methods in dental imaging.
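For reference, the WGAN-GP critic objective used here takes the standard form from Gulrajani et al., penalizing deviations of the critic's gradient norm from 1 at points interpolated between real and generated samples (λ is commonly set to 10):

```latex
L_{\text{critic}} =
  \underbrace{\mathbb{E}_{\tilde{x}\sim P_g}\!\left[D(\tilde{x})\right]
            - \mathbb{E}_{x\sim P_r}\!\left[D(x)\right]}_{\text{Wasserstein estimate}}
  \;+\; \lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}
      \!\left[\bigl(\lVert \nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\bigr)^2\right],
\qquad \hat{x} = \epsilon x + (1-\epsilon)\tilde{x},\quad \epsilon \sim U[0,1]
```

The gradient penalty replaces WGAN's weight clipping, which is what makes the critic-iteration and feature-depth choices mentioned in the abstract the main remaining tuning knobs.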


NAADA: A Noise-Aware Attention Denoising Autoencoder for Dental Panoramic Radiographs

Naveed, Khuram, de Freitas, Bruna Neves, Pauwels, Ruben

arXiv.org Artificial Intelligence

Convolutional denoising autoencoders (DAEs) are powerful tools for image restoration. However, they inherit a key limitation of convolutional neural networks (CNNs): they tend to recover low-frequency features, such as smooth regions, more effectively than high-frequency details. This leads to the loss of fine details, which is particularly problematic in dental radiographs where preserving subtle anatomical structures is crucial. While self-attention mechanisms can help mitigate this issue by emphasizing important features, conventional attention methods often prioritize features corresponding to cleaner regions and may overlook those obscured by noise. To address this limitation, we propose a noise-aware self-attention method, which allows the model to effectively focus on and recover key features even within noisy regions. Building on this approach, we introduce the noise-aware attention-enhanced denoising autoencoder (NAADA) network for enhancing noisy panoramic dental radiographs. Compared with recent state-of-the-art (and much heavier) methods such as Uformer and MResDNN, our method improves the reconstruction of fine details, ensuring better image quality and diagnostic accuracy.
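For context, the conventional scaled dot-product self-attention that a noise-aware variant would modify weights every token purely by query-key similarity; a minimal NumPy sketch of that baseline (the shapes and weights are illustrative, not NAADA's actual layers):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Conventional scaled dot-product self-attention over N feature tokens."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    A = softmax(Q @ K.T / np.sqrt(d))  # N x N attention weights, rows sum to 1
    return A @ V

rng = np.random.default_rng(0)
N, d = 6, 4                           # e.g. 6 patch tokens from a feature map
X = rng.normal(size=(N, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (6, 4)
```

The limitation the abstract points to lives in the softmax weighting: tokens from clean regions tend to dominate the attention map, so a noise-aware mechanism must re-weight or re-score tokens in noisy regions rather than rely on raw query-key similarity.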


Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation

Silva, Bernardo, Fontinele, Jefferson, Vieira, Carolina Letícia Zilli, Tavares, João Manuel R. S., Cury, Patricia Ramos, Oliveira, Luciano

arXiv.org Artificial Intelligence

Imaging modalities like X-rays, computerized tomography scans, and magnetic resonance imaging provide detailed views of teeth, bones, and soft tissues (White and Pharoah, 2014). These tools enhance the precision of diagnoses and treatments, ensuring better patient outcomes. Among the current imaging exams, radiographs are the most common in dentistry (White and Pharoah, 2014; Langlais and Miller, 2016), being requested to identify various pathologies like cavities, periodontal disease, impacted teeth, and bone infections (Chang et al., 2020; Yüksel et al., 2021) and track the progress of dental treatments. One of the most commonly used radiographs in dentistry is the panoramic radiograph (White and Pharoah, 2014; Langlais and Miller, 2016; Silva et al., 2018), which is an extraoral imaging technique where the X-ray film or sensor remains outside the patient's mouth during acquisition. In a single image, the panoramic radiograph provides a comprehensive view of both upper and lower jaws, but with less detail of the mouth structures (Haring and Jansen, 2000; Silva et al., 2018; Jader et al., 2018; Pinheiro et al., 2021). Figure 1 depicts an example of a panoramic radiograph, revealing the structures and their overlaps, which can lead to cluttered readings.


Exploring the Role of Convolutional Neural Networks (CNN) in Dental Radiography Segmentation: A Comprehensive Systematic Literature Review

Brahmi, Walid, Jdey, Imen, Drira, Fadoua

arXiv.org Artificial Intelligence

In the field of dentistry, there is a growing demand for increased precision in diagnostic tools, with a specific focus on advanced imaging techniques such as computed tomography, cone beam computed tomography, magnetic resonance imaging, ultrasound, and traditional intra-oral periapical X-rays. Deep learning has emerged as a pivotal tool in this context, enabling the implementation of automated segmentation techniques crucial for extracting essential diagnostic data. This integration of cutting-edge technology addresses the urgent need for effective management of dental conditions, which, if left undetected, can have a significant impact on human health. The impressive track record of deep learning across various domains, including dentistry, underscores its potential to revolutionize early detection and treatment of oral health issues. Objective: Having demonstrated significant results in diagnosis and prediction, deep convolutional neural networks (CNNs) represent an emerging field of multidisciplinary research. The goals of this study were to provide a concise overview of the state of the art, standardize the current debate, and establish baselines for future research. Method: In this study, a systematic literature review is employed as a methodology to identify and select relevant studies that specifically investigate the deep learning technique for dental imaging analysis. This study elucidates the methodological approach, including the systematic collection of data, statistical analysis, and subsequent dissemination of outcomes. Conclusion: This work demonstrates how Convolutional Neural Networks (CNNs) can be employed to analyze images, serving as effective tools for detecting dental pathologies. Although this research acknowledged some limitations, CNNs utilized for segmenting and categorizing teeth exhibited their highest level of performance overall.


3D Teeth Reconstruction from Panoramic Radiographs using Neural Implicit Functions

Park, Sihwa, Kim, Seongjun, Song, In-Seok, Baek, Seung Jun

arXiv.org Artificial Intelligence

Panoramic radiography is a widely used imaging modality in dental practice and research. However, it only provides flattened 2D images, which limits the detailed assessment of dental structures. In this paper, we propose Occudent, a framework for 3D teeth reconstruction from panoramic radiographs using neural implicit functions, which, to the best of our knowledge, is the first work to do so. For a given point in 3D space, the implicit function estimates whether the point is occupied by a tooth, and thus implicitly determines the boundaries of 3D tooth shapes. Firstly, Occudent applies multi-label segmentation to the input panoramic radiograph. Next, tooth shape embeddings as well as tooth class embeddings are generated from the segmentation outputs, which are fed to the reconstruction network. A novel module called Conditional eXcitation (CX) is proposed in order to effectively incorporate the combined shape and class embeddings into the implicit function. The performance of Occudent is evaluated using both quantitative and qualitative measures. Importantly, Occudent is trained and validated with actual panoramic radiographs as input, distinct from recent works which used synthesized images. Experiments demonstrate the superiority of Occudent over state-of-the-art methods.
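The occupancy query at the heart of a neural implicit function can be sketched as a small MLP that maps a 3D point plus a conditioning embedding to an occupancy probability. This is a hypothetical illustration of the interface only (random weights, made-up dimensions), not Occudent's actual network or its Conditional eXcitation module:

```python
import numpy as np

def occupancy(point, tooth_embedding, w1, b1, w2, b2):
    """Tiny implicit function: (3D point, tooth embedding) -> occupancy probability.
    The 3D tooth surface is recovered as the 0.5 level set of this function."""
    h = np.maximum(0.0, np.concatenate([point, tooth_embedding]) @ w1 + b1)  # ReLU layer
    logit = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid keeps the output in (0, 1)

rng = np.random.default_rng(1)
emb_dim, hidden = 8, 16                       # illustrative sizes
w1 = rng.normal(size=(3 + emb_dim, hidden))
b1 = rng.normal(size=hidden)
w2 = rng.normal(size=hidden)
b2 = 0.0

p = occupancy(np.array([0.1, -0.2, 0.3]), rng.normal(size=emb_dim), w1, b1, w2, b2)
print(0.0 < p < 1.0)  # True: a probability for a single queried point
```

Because the shape is stored implicitly in the network weights and embeddings, reconstruction at any resolution amounts to querying a dense grid of points and extracting the 0.5 isosurface (e.g. with marching cubes).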