Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

Bae, Sangmin, Kim, June-Woo, Cho, Won-Yang, Baek, Hyerim, Son, Soyoun, Lee, Byungjo, Ha, Changwan, Tae, Kyongpil, Kim, Sungnyun, Yun, Se-Young

Nov-22-2023, arXiv.org Artificial Intelligence

Respiratory sounds contain crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, the task remains challenging because medical data are scarce. In this study, we demonstrate that models pretrained on large-scale visual and audio datasets can be generalized to the respiratory sound classification task. In addition, we introduce a straightforward Patch-Mix augmentation, which randomly mixes patches between different samples, for use with the Audio Spectrogram Transformer (AST). We further propose a novel and effective Patch-Mix Contrastive Learning method that distinguishes the mixed representations in the latent space. Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by 4.08%.
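The patch-mixing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that patches are randomly mixed between different samples, so the function name `patch_mix`, the `mix_ratio` parameter, the pairing of samples through a random permutation of the batch, and the use of plain Python lists in place of tensors are all assumptions made for illustration.

```python
import random


def patch_mix(batch, mix_ratio, rng):
    """Replace a random subset of each sample's patches with a partner's patches.

    batch:     list of samples, each a list of patches (same length per sample).
    mix_ratio: assumed fraction of patch positions taken from the partner.
    rng:       random.Random instance, passed in for reproducibility.
    Returns (mixed_batch, partner_indices, masks).
    """
    num_samples = len(batch)
    num_patches = len(batch[0])
    num_swapped = round(mix_ratio * num_patches)

    # Assumption: partners come from a random permutation of the batch,
    # so a sample may occasionally be paired with itself.
    partner = list(range(num_samples))
    rng.shuffle(partner)

    mixed, masks = [], []
    for i, sample in enumerate(batch):
        # Choose which patch positions are overwritten for this sample.
        swapped = set(rng.sample(range(num_patches), num_swapped))
        mask = [p in swapped for p in range(num_patches)]
        other = batch[partner[i]]
        # Positions are preserved: patch p is replaced by the partner's patch p.
        mixed.append([other[p] if mask[p] else sample[p]
                      for p in range(num_patches)])
        masks.append(mask)
    return mixed, partner, masks


# Toy batch: three "spectrograms" of eight labelled patches each.
rng = random.Random(0)
batch = [[(name, p) for p in range(8)] for name in ("a", "b", "c")]
mixed, partner, masks = patch_mix(batch, 0.25, rng)
```

The returned masks and partner indices record which patches came from which sample. A contrastive objective over mixed representations would need that information, although the abstract does not specify how it is used.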

classification, dataset, representation

Country:
- North America > United States
  - Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > Greece
  - Central Macedonia > Thessaloniki (0.04)
- Asia > Middle East
  - Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:
- Research Report (0.84)

Industry:
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.87)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)