A Multi-Stage Hybrid CNN-Transformer Network for Automated Pediatric Lung Sound Classification
Shuvo, Samiul Based, Hasan, Taufiq
arXiv.org Artificial Intelligence
Abstract. Background: Automated analysis of lung sound auscultation is essential for monitoring respiratory health, particularly in regions with a shortage of skilled healthcare workers. Although respiratory sound classification has been widely studied in adults, its application in pediatric populations, especially in children under six years of age, remains underexplored. Developmental changes in pediatric lungs substantially modify the acoustic properties of respiratory sounds, requiring classification approaches tailored specifically to this age group. Methods: To address this challenge, we propose a multistage hybrid CNN-Transformer framework that integrates CNN-extracted features with an attention-based architecture for pediatric respiratory disease classification. Scalogram images were generated from both full recordings and individual breath events to capture multi-resolution representations of respiratory sounds. To mitigate class imbalance, class-wise focal loss was applied during model training. Results: The proposed model achieved an overall score of 0.9039 in binary event classification. At the recording level, the model obtained scores of 0.720 for ternary classification and 0.571 for multiclass classification. These results outperform the previous best-performing models by 3.81% and 5.94%, respectively. Conclusion: Our findings demonstrate that the proposed hybrid CNN-Transformer framework effectively captures the unique acoustic features of pediatric lung sounds.
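The abstract names class-wise focal loss as the remedy for class imbalance but does not give its form. The sketch below shows one common formulation, in which each sample's cross-entropy term is scaled by a per-class weight and by a factor that down-weights easy examples. The function names, the inverse-frequency weighting scheme, and the default `gamma=2.0` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes):
    """Hypothetical helper: per-class weights inversely proportional to class frequency."""
    labels = np.asarray(labels, dtype=int)
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    counts = np.maximum(counts, 1.0)  # avoid division by zero for absent classes
    return counts.sum() / (num_classes * counts)

def class_wise_focal_loss(probs, labels, alpha, gamma=2.0, eps=1e-12):
    """Mean focal loss with a per-class weight alpha[c].

    probs  : (N, C) array of predicted class probabilities (rows sum to 1)
    labels : (N,) integer class indices
    alpha  : (C,) per-class weights (assumed here; the paper's weights are not stated)
    gamma  : focusing parameter; gamma = 0 reduces to weighted cross-entropy
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    alpha = np.asarray(alpha, dtype=float)
    # Probability assigned to the true class of each sample.
    p_t = np.clip(probs[np.arange(labels.size), labels], eps, 1.0)
    # (1 - p_t)^gamma shrinks the contribution of well-classified samples.
    loss = -alpha[labels] * (1.0 - p_t) ** gamma * np.log(p_t)
    return float(loss.mean())

# Usage on a toy imbalanced batch (three samples of class 0, one of class 1).
labels = np.array([0, 0, 0, 1])
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]])
alpha = inverse_frequency_weights(labels, num_classes=2)
print(class_wise_focal_loss(probs, labels, alpha))
```

With these weights the minority class carries a larger `alpha`, so its misclassified sample dominates the loss. Setting `gamma=0` and uniform `alpha` recovers ordinary cross-entropy, which is a convenient sanity check.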
Oct-21-2025
- Country:
- Africa > Cameroon
- North-West Region > Bamenda (0.04)
- Asia
- Bangladesh > Dhaka Division
- Dhaka District > Dhaka (0.04)
- China > Shanghai
- Shanghai (0.04)
- Genre:
- Research Report > New Finding (0.86)