densenet121
General vs Domain-Specific CNNs: Understanding Pretraining Effects on Brain MRI Tumor Classification
Abedini, Helia, Rahimi, Saba, Vaziri, Reza
Brain tumor detection from MRI scans plays a crucial role in early diagnosis and treatment planning. Deep convolutional neural networks (CNNs) have demonstrated strong performance in medical imaging tasks, particularly when pretrained on large datasets. However, it remains unclear which type of pretrained model performs better when only a small dataset is available: those pretrained on domain-specific medical data or those pretrained on large general-purpose datasets. In this study, we systematically evaluate three pretrained CNN architectures for brain tumor classification: DenseNet121 pretrained on the medical-domain RadImageNet dataset, and two modern general-purpose CNNs, EfficientNetV2S and ConvNeXt-Tiny. All models were trained and fine-tuned under identical conditions on a limited-size brain MRI dataset to ensure a fair comparison. Our results reveal that ConvNeXt-Tiny achieved the highest accuracy, followed by EfficientNetV2S, while RadImageNet DenseNet121, despite its domain-specific medical pretraining, exhibited poor generalization with lower accuracy and higher loss. These findings suggest that domain-specific pretraining may not generalize well under small-data conditions, whereas modern, deeper general-purpose CNNs pretrained on large-scale datasets can offer superior transfer learning performance in specialized medical imaging tasks.
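The comparison protocol described above can be illustrated with a linear probe: extract features from each frozen pretrained backbone and train an identical softmax head on top. This is a deliberate simplification (the paper fine-tunes full networks, and the function names and hyperparameters here are assumptions, not the authors' implementation):

```python
import numpy as np

def train_linear_probe(features, labels, n_classes, lr=0.1, epochs=200, seed=0):
    """Train a softmax classification head on frozen backbone features.

    features: (n_samples, dim) embeddings from a pretrained CNN.
    Returns learned weights (dim, n_classes) and bias (n_classes,).
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    W = rng.normal(scale=0.01, size=(d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # gradient of mean cross-entropy
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def accuracy(features, labels, W, b):
    preds = (features @ W + b).argmax(axis=1)
    return float((preds == labels).mean())
```

Running the same probe on features from each backbone gives the "identical conditions" comparison the study calls for: only the pretrained feature extractor varies.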
Explainable Deep Learning in Medical Imaging: Brain Tumor and Pneumonia Detection
Erukude, Sai Teja, Marella, Viswa Chaitanya, Veluru, Suhasnadh Reddy
Deep Learning (DL) holds enormous potential for improving medical imaging diagnostics, yet the lack of interpretability in most models hampers clinical trust and adoption. This paper presents an explainable deep learning framework for detecting brain tumors in MRI scans and pneumonia in chest X-ray images using two leading Convolutional Neural Networks, ResNet50 and DenseNet121. These models were trained on publicly available Kaggle datasets comprising 7,023 brain MRI images and 5,863 chest X-ray images, achieving high classification performance. DenseNet121 consistently outperformed ResNet50 with 94.3 percent vs. 92.5 percent accuracy for brain tumors and 89.1 percent vs. 84.4 percent accuracy for pneumonia. For better explainability, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated to create heatmap visualizations superimposed on the test images, indicating the most influential image regions in the decision-making process. Interestingly, while both models produced accurate results, Grad-CAM showed that DenseNet121 consistently focused on core pathological regions, whereas ResNet50 sometimes scattered attention to peripheral or non-pathological areas. Combining deep learning and explainable AI offers a promising path toward reliable, interpretable, and clinically useful diagnostic tools.
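Grad-CAM's heatmap construction follows a simple recipe: each convolutional feature map is weighted by the global average of its gradients, the weighted maps are summed, and negative evidence is removed by a ReLU. A minimal NumPy sketch of that recipe (array shapes and names are illustrative):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM: weight each feature map by its globally averaged
    gradient, sum over maps, and keep only positive evidence (ReLU).

    activations: (K, H, W) feature maps from a convolutional layer.
    gradients:   (K, H, W) gradients of the class score w.r.t. those maps.
    Returns an (H, W) heatmap normalised to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))                 # alpha_k: GAP of gradients
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()                                  # scale to [0, 1] for overlay
    return cam
```

Upsampled to the input resolution and superimposed on the test image, this heatmap is what reveals whether a model such as DenseNet121 attends to core pathological regions.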
FOSSIL: Regret-Minimizing Curriculum Learning for Metadata-Free and Low-Data Mpox Diagnosis
Han, Sahng-Min, Kim, Minjae, Cha, Jinho, Choe, Se-woon, Cha, Eunchan Daniel, Choi, Jungwon, Jung, Kyudong
Deep learning in small and imbalanced biomedical datasets remains fundamentally constrained by unstable optimization and poor generalization. We present the first biomedical implementation of FOSSIL (Flexible Optimization via Sample-Sensitive Importance Learning), a regret-minimizing weighting framework that adaptively balances training emphasis according to sample difficulty. Using softmax-based uncertainty as a continuous measure of difficulty, we construct a four-stage curriculum (Easy-Very Hard) and integrate FOSSIL into both convolutional and transformer-based architectures for Mpox skin lesion diagnosis. Across all settings, FOSSIL substantially improves discrimination (AUC = 0.9573), calibration (ECE = 0.053), and robustness under real-world perturbations, outperforming conventional baselines without metadata, manual curation, or synthetic augmentation. The results position FOSSIL as a generalizable, data-efficient, and interpretable framework for difficulty-aware learning in medical imaging under data scarcity.
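The curriculum-construction step above (softmax-based uncertainty split into four stages, Easy through Very Hard) can be sketched as quantile binning. This covers only the binning; FOSSIL's regret-minimizing weight update itself is not reproduced here, and the function name and quartile boundaries are assumptions:

```python
import numpy as np

def difficulty_bins(probs, n_bins=4):
    """Assign each sample a difficulty stage from softmax uncertainty.

    probs: (n_samples, n_classes) softmax outputs for each sample.
    Uncertainty = 1 - max class probability; quantile boundaries split the
    batch into stages 0 (Easy) .. 3 (Very Hard).
    """
    uncertainty = 1.0 - probs.max(axis=1)
    edges = np.quantile(uncertainty, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(uncertainty, edges)   # stage index per sample
```

Stage indices like these can then drive a difficulty-aware weighting or staged training schedule without any metadata or manual curation.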
R-Net: A Reliable and Resource-Efficient CNN for Colorectal Cancer Detection with XAI Integration
Ayon, Rokonozzaman, Ahad, Md Taimur, Song, Bo, Li, Yan
State-of-the-art (SOTA) Convolutional Neural Networks (CNNs) are criticized for their extensive computational power requirements, long training times, and large dataset demands. To overcome these limitations, we propose a reasonable network (R-Net), a lightweight CNN to detect and classify colorectal cancer (CRC) using the Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset (EBHI). Furthermore, six SOTA CNNs, including multipath-based CNNs (DenseNet121, ResNet50), a depth-based CNN (InceptionV3), a width-based multi-connection CNN (Xception), a depth-wise separable convolution CNN (MobileNetV2), and a spatial-exploitation-based CNN (VGG16), along with transfer learning and two ensemble models, are also tested on the same dataset. The ensemble models are a multipath-depth-width combination (DenseNet121-InceptionV3-Xception) and a multipath-depth-spatial combination (ResNet18-InceptionV3-VGG16). The proposed lightweight R-Net achieved 99.37% accuracy, outperforming MobileNet (95.83%) and ResNet50 (96.94%). Most importantly, to understand the decision-making of R-Net, Explainable AI methods such as SHAP, LIME, and Grad-CAM are integrated to visualize which parts of an EBHI image contribute to R-Net's detection and classification process. The main novelty of this research lies in building a reliable, lightweight CNN, R-Net, that requires fewer computing resources yet maintains strong prediction results. The SOTA CNNs, transfer learning, and ensemble models also extend our knowledge of CRC classification and detection. The XAI functionality and the analysis of how pixel intensity affects correctly and incorrectly classified images are further novelties in CRC detection and classification.
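The two ensemble models combine predictions from several CNNs. Assuming probability-level fusion (soft voting), which the abstract does not actually specify, a minimal sketch looks like this:

```python
import numpy as np

def soft_vote(prob_list):
    """Average class probabilities from several CNNs and take the argmax.

    prob_list: list of (n_samples, n_classes) softmax output arrays, one per
    model (e.g. DenseNet121, InceptionV3, Xception).
    Returns (mean_probs, predictions).
    """
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs, mean_probs.argmax(axis=1)
```

Soft voting lets a confident model outvote an uncertain one, which is one reason probability averaging often beats hard majority voting for small ensembles.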
Winsor-CAM: Human-Tunable Visual Explanations from Deep Networks via Layer-Wise Winsorization
Wall, Casey, Wang, Longwei, Rizk, Rodrigue, Santosh, KC
Interpreting the decision-making process of Convolutional Neural Networks (CNNs) is critical for deploying models in high-stakes domains. Gradient-weighted Class Activation Mapping (Grad-CAM) is a widely used method for visual explanations, yet it typically focuses on the final convolutional layer or naïvely averages across layers, strategies that can obscure important semantic cues or amplify irrelevant noise. We propose Winsor-CAM, a novel, human-tunable extension of Grad-CAM that generates robust and coherent saliency maps by aggregating information across all convolutional layers. To mitigate the influence of noisy or extreme attribution values, Winsor-CAM applies Winsorization, a percentile-based outlier attenuation technique. A user-controllable threshold allows for semantic-level tuning, enabling flexible exploration of model behavior across representational hierarchies. Evaluations on standard architectures (ResNet50, DenseNet121, VGG16, InceptionV3) using the PASCAL VOC 2012 dataset demonstrate that Winsor-CAM produces more interpretable heatmaps and achieves superior performance in localization metrics, including intersection-over-union and center-of-mass alignment, when compared to Grad-CAM and uniform layer-averaging baselines. Winsor-CAM advances the goal of trustworthy AI by offering interpretable, multi-layer insights with human-in-the-loop control.
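The core Winsorization step is a percentile clip applied to each layer's saliency map before cross-layer aggregation, with the percentile pair acting as the user-tunable knob. A minimal sketch, under the assumption of simple min-max normalization and uniform averaging after clipping (the paper's actual pipeline operates on per-layer Grad-CAM maps and may differ in detail):

```python
import numpy as np

def winsorize_map(saliency, lower_pct=5.0, upper_pct=95.0):
    """Percentile-based Winsorization: clip extreme attribution values to the
    chosen percentiles, attenuating outliers before aggregation."""
    lo, hi = np.percentile(saliency, [lower_pct, upper_pct])
    return np.clip(saliency, lo, hi)

def aggregate_layers(maps, lower_pct=5.0, upper_pct=95.0):
    """Winsorize each layer's (already upsampled) saliency map, min-max
    normalise it, then average across layers."""
    out = []
    for m in maps:
        w = winsorize_map(m, lower_pct, upper_pct)
        span = w.max() - w.min()
        out.append((w - w.min()) / span if span > 0 else np.zeros_like(w))
    return np.mean(out, axis=0)
```

Tightening the percentile pair suppresses isolated extreme attributions (noise), while widening it preserves sharp, high-contrast evidence; that trade-off is what the human-in-the-loop threshold exposes.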
Local Herb Identification Using Transfer Learning: A CNN-Powered Mobile Application for Nepalese Flora
Thapa, Prajwal, Sharma, Mridul, Nyachhyon, Jinu, Pandeya, Yagya Raj
Herb classification presents a critical challenge in botanical research, particularly in regions with rich biodiversity such as Nepal. This study introduces a novel deep learning approach for classifying 60 different herb species using Convolutional Neural Networks (CNNs) and transfer learning techniques. Using a manually curated dataset of 12,000 herb images, we developed a robust machine learning model that addresses existing limitations in herb recognition methodologies. Our research employed multiple model architectures, including DenseNet121, the 50-layer Residual Network (ResNet50), the 16-layer Visual Geometry Group network (VGG16), InceptionV3, EfficientNetV2, and the Vision Transformer (ViT), with DenseNet121 ultimately demonstrating superior performance. Data augmentation and regularization techniques were applied to mitigate overfitting and enhance the generalizability of the model. This work advances herb classification techniques, preserving traditional botanical knowledge and promoting sustainable herb utilization.
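Data augmentation of the kind mentioned above can be as simple as label-preserving geometric transforms. A generic sketch (random horizontal flip plus a random 90-degree rotation), not the authors' exact pipeline:

```python
import numpy as np

def augment(image, rng):
    """Simple geometric augmentation for plant images: random horizontal flip
    and a random multiple-of-90-degree rotation. For square inputs the
    (H, W, C) shape is preserved, as typically required by the CNNs above."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    k = rng.integers(0, 4)                 # rotate by 0, 90, 180 or 270 degrees
    return np.rot90(image, k, axes=(0, 1))
```

Because flips and right-angle rotations only permute pixels, they enlarge the effective training set without altering the species label, which is what combats overfitting on a 12,000-image dataset.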
Time-Frequency Analysis of EMG Signals for Gesture Recognition Using Fine-Grained Features
Aarotale, Parshuram N., Rattani, Ajita
Electromyography (EMG)-based hand gesture recognition converts forearm muscle activity into control commands for prosthetics, rehabilitation, and human-computer interaction. This paper proposes a novel approach to EMG-based hand gesture recognition that uses fine-grained classification and presents XMANet, which unifies low-level local and high-level semantic cues through cross-layer mutual attention among shallow-to-deep CNN experts. Using stacked spectrograms and scalograms derived from the Short-Time Fourier Transform (STFT) and Wavelet Transform (WT), we benchmark XMANet against ResNet50, DenseNet121, MobileNetV3, and EfficientNetB0. Experimental results on the Grabmyo dataset indicate that, using STFT, the proposed XMANet model outperforms the baseline ResNet50, EfficientNetB0, MobileNetV3, and DenseNet121 models with improvements of approximately 1.72%, 4.38%, 5.10%, and 2.53%, respectively. When employing the WT approach, improvements of around 1.57%, 1.88%, 1.46%, and 2.05% are observed over the same baselines. Similarly, on the FORS-EMG dataset, the XMANet(ResNet50) model using STFT shows an improvement of about 5.04% over the baseline ResNet50, while the XMANet(DenseNet121) and XMANet(MobileNetV3) models yield enhancements of approximately 4.11% and 2.81%, respectively. Moreover, when using WT, the proposed XMANet achieves gains of around 4.26%, 9.36%, 5.72%, and 6.09% over the baseline ResNet50, DenseNet121, MobileNetV3, and EfficientNetB0 models, respectively. These results confirm that XMANet consistently improves performance across architectures and signal processing techniques, demonstrating the strong potential of fine-grained features for accurate and robust EMG classification.
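A stacked-spectrogram input starts from a per-channel STFT. A minimal Hann-windowed sketch (the window length and hop size here are illustrative, not the paper's settings):

```python
import numpy as np

def stft_spectrogram(signal, win_len=64, hop=32):
    """Magnitude spectrogram of a 1-D EMG channel via a Hann-windowed
    Short-Time Fourier Transform.

    Returns an array of shape (n_frames, win_len // 2 + 1); stacking such
    spectrograms across channels yields an image-like input for a CNN.
    """
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))
```

Each row is one time frame and each column one frequency bin, so muscle-activation energy appears as texture that the CNN experts can attend to at different depths.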