Language Models as Semantic Teachers: Post-Training Alignment for Medical Audio Understanding
Wang, Tsai-Ning, Chen, Lin-Lin, Zeghidour, Neil, Saeed, Aaqib
Pre-trained audio models excel at detecting acoustic patterns in auscultation sounds but often fail to grasp their clinical significance, limiting their use and performance in diagnostic tasks. To bridge this gap, we introduce AcuLa (Audio-Clinical Understanding via Language Alignment), a lightweight post-training framework that instills semantic understanding into any audio encoder by aligning it with a medical language model, which acts as a "semantic teacher." To enable alignment at scale, we construct a large-scale dataset by leveraging off-the-shelf large language models to translate the rich, structured metadata accompanying existing audio recordings into coherent clinical reports. Our alignment strategy combines a representation-level contrastive objective with a self-supervised modeling objective, ensuring that the model learns clinical semantics while preserving fine-grained temporal cues. AcuLa achieves state-of-the-art results across 18 diverse cardio-respiratory tasks from 10 different datasets, improving the mean AUROC on classification benchmarks from 0.68 to 0.79 and, on the most challenging COVID-19 cough detection task, boosting the AUROC from 0.55 to 0.89. Our work demonstrates that this audio-language alignment transforms purely acoustic models into clinically-aware diagnostic tools, establishing a novel paradigm for enhancing physiological understanding in audio-based health monitoring.
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.49)
- Health & Medicine > Therapeutic Area > Immunology (0.34)
- Asia > Taiwan (0.05)
- Europe > Portugal > Aveiro > Aveiro (0.04)
- Europe > Greece > Central Macedonia > Thessaloniki (0.04)
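The representation-level contrastive objective described in the AcuLa abstract pairs each audio embedding with the embedding of its generated clinical report. A minimal sketch of such a symmetric InfoNCE-style loss follows; the toy 2-D embeddings and the temperature value are illustrative assumptions, not details from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(audio_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: each audio clip should match its
    paired report embedding and repel the other reports in the batch."""
    n = len(audio_embs)
    sims = [[cosine(a, t) / temperature for t in text_embs] for a in audio_embs]
    loss = 0.0
    for i in range(n):
        # audio -> text direction
        row = sims[i]
        loss += -row[i] + math.log(sum(math.exp(s) for s in row))
        # text -> audio direction
        col = [sims[j][i] for j in range(n)]
        loss += -col[i] + math.log(sum(math.exp(s) for s in col))
    return loss / (2 * n)

# Three toy audio/report embedding pairs; aligned pairs point the same way.
audio = [[1.0, 0.1], [0.1, 1.0], [-1.0, 0.2]]
text = [[0.9, 0.0], [0.0, 1.1], [-0.8, 0.1]]
print(round(info_nce(audio, text), 4))
```

Shuffling the report embeddings breaks the pairing and drives the loss up, which is exactly the pressure that pulls the audio encoder toward the language model's clinical semantics.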
Cough Classification using Few-Shot Learning
Kumar, Yoga Disha Sendhil, Shetty, Manas V, Vhaduri, Sudip
This paper investigates the effectiveness of few-shot learning for respiratory sound classification, focusing on cough-based detection of COVID-19, Flu, and healthy conditions. We leverage Prototypical Networks with spectrogram representations of cough sounds to address the challenge of limited labeled data. Our study evaluates whether few-shot learning can enable models to achieve performance comparable to traditional deep learning approaches while using significantly fewer training samples. Additionally, we compare multi-class and binary classification models to assess whether multi-class models can perform comparably to their binary counterparts. Experimental findings show that few-shot learning models can achieve competitive accuracy. Our model attains 74.87% accuracy in multi-class classification with only 15 support examples per class, while binary classification achieves over 70% accuracy across all class pairs. Class-wise analysis reveals Flu as the most distinguishable class, and Healthy as the most challenging. Statistical tests (paired t-test p = 0.149, Wilcoxon p = 0.125) indicate no significant performance difference between binary and multi-class models, supporting the viability of multi-class classification in this setting. These results highlight the feasibility of applying few-shot learning in medical diagnostics, particularly when large labeled datasets are unavailable.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
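The Prototypical Network approach above reduces to a simple rule at inference time: average each class's support embeddings into a prototype, then assign a query to its nearest prototype. A minimal sketch with invented 2-D spectrogram embeddings (the embeddings and class set are assumptions for illustration):

```python
def prototype(support):
    """Class prototype = mean of its support embeddings."""
    dim = len(support[0])
    return [sum(v[d] for v in support) / len(support) for d in range(dim)]

def classify(query, prototypes):
    """Assign the query to the class with the nearest prototype
    (squared Euclidean distance, as in Prototypical Networks)."""
    def dist(proto):
        return sum((q - c) ** 2 for q, c in zip(query, proto))
    return min(prototypes, key=lambda label: dist(prototypes[label]))

# Toy 2-D embeddings for a 3-way episode with 3 support examples per class.
support_sets = {
    "covid":   [[0.9, 0.1], [1.1, 0.0], [1.0, 0.2]],
    "flu":     [[0.0, 1.0], [0.1, 1.2], [-0.1, 0.9]],
    "healthy": [[-1.0, -0.1], [-0.9, 0.1], [-1.1, 0.0]],
}
protos = {label: prototype(s) for label, s in support_sets.items()}
print(classify([0.95, 0.05], protos))  # nearest to the "covid" prototype
```

With only 15 support examples per class, as in the study, the prototype is still just a mean; the heavy lifting is done by the embedding network that produces the spectrogram representations.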
Sound Signal Synthesis with Auxiliary Classifier GAN, COVID-19 cough as an example
Saleh, Yahya Sherif Solayman Mohamed, Dabbous, Ahmed Mohammed, Alkhaled, Lama, Chai, Hum Yan, Rana, Muhammad Ehsan, Mokayed, Hamam
One of the fastest-growing domains in AI is healthcare. Given its importance, it has been the interest of many researchers to deploy ML models into the ever-demanding healthcare domain to aid doctors and increase accessibility. Delivering reliable models, however, demands a sizable amount of data, and the recent COVID-19 pandemic served as a reminder of the rampant and scary nature of healthcare that makes training models difficult. To alleviate such scarcity, many published works attempted to synthesize radiological data to train better COVID-19 detection models. To accommodate the time sensitivity expected during a pandemic, this work focuses on detecting COVID-19 through coughs using synthetic data to improve the accuracy of the classifier. The work begins by training a CNN on a balanced subset of the Coughvid dataset, establishing a baseline classification test accuracy of 72%. The paper demonstrates how an Auxiliary Classifier GAN (ACGAN) may be trained to conditionally generate novel synthetic Mel Spectrograms of both healthy and COVID-19 coughs. These coughs are used to augment the training dataset of the CNN classifier, allowing it to reach a new test accuracy of 75%. The work highlights the expected messiness and inconsistency in training and offers insights into detecting and handling such shortcomings.
- Europe > Sweden > Norrbotten County > Luleå (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
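The ACGAN objective behind the conditional spectrogram generation combines a real/fake source term with an auxiliary class term. The sketch below only composes those two losses for one real and one generated spectrogram; the discriminator outputs are made-up placeholders, not values from the paper.

```python
import math

def bce(p, y):
    """Binary cross-entropy for one source (real/fake) prediction."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def ce(probs, label):
    """Categorical cross-entropy for one auxiliary class prediction."""
    return -math.log(probs[label])

# Made-up discriminator outputs for one real and one generated Mel
# spectrogram: p_source = P(input is real), p_class over {healthy, covid}.
real = {"p_source": 0.9, "p_class": [0.8, 0.2], "label": 0}
fake = {"p_source": 0.3, "p_class": [0.1, 0.9], "label": 1}

L_S = bce(real["p_source"], 1) + bce(fake["p_source"], 0)  # source loss
L_C = ce(real["p_class"], real["label"]) + ce(fake["p_class"], fake["label"])

# The discriminator is trained to lower both losses; the generator is
# trained to raise the source loss (fool the real/fake head) while keeping
# the class loss low, so its samples stay recognizably class-conditional.
print(round(L_S, 3), round(L_C, 3))
```

The auxiliary class head is what lets a single trained generator emit either healthy or COVID-19 coughs on demand for augmenting the CNN's training set.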
Automatic Cough Analysis for Non-Small Cell Lung Cancer Detection
Giangregorio, Chiara, Licciardello, Cristina Maria, Miskovic, Vanja, Provenzano, Leonardo, Pedrocchi, Alessandra Laura Giulia, Dumitrascu, Andra Diana, Prelaj, Arsela, Garassino, Marina Chiara, Ambrosini, Emilia, Ferrante, Simona
Early detection of non-small cell lung cancer (NSCLC) is critical for improving patient outcomes, and novel approaches are needed to facilitate early diagnosis. In this study, we explore the use of automatic cough analysis as a pre-screening tool for distinguishing between NSCLC patients and healthy controls. Cough audio recordings were prospectively acquired from a total of 227 subjects, divided into NSCLC patients and healthy controls. The recordings were analyzed using machine learning techniques, such as support vector machine (SVM) and XGBoost, as well as deep learning approaches, specifically convolutional neural networks (CNN) and transfer learning with VGG16. To enhance the interpretability of the machine learning model, we utilized Shapley Additive Explanations (SHAP). The fairness of the models across demographic groups was assessed by comparing the performance of the best model across age groups (≤58 years vs. >58 years) and gender using the equalized odds difference on the test set. The results demonstrate that CNN achieves the best performance, with an accuracy of 0.83 on the test set. Nevertheless, SVM achieves slightly lower performance (accuracy of 0.76 on the validation set and 0.78 on the test set), making it suitable in contexts with low computational power. The use of SHAP for SVM interpretation further enhances model transparency, making it more trustworthy for clinical applications. Fairness analysis shows slightly higher disparity across age (0.15) than gender (0.09) on the test set. Therefore, to strengthen our findings' reliability, a larger, more diverse, and unbiased dataset is needed -- particularly including individuals at risk of NSCLC and those in early disease stages.
- Europe > Italy > Lombardy > Milan (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)
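The fairness metric used in the NSCLC study, the equalized odds difference, is the larger of the true-positive-rate gap and the false-positive-rate gap between demographic groups. A self-contained sketch with invented labels and predictions (none of these numbers come from the study):

```python
def rates(y_true, y_pred):
    """Return (TPR, FPR) for one group's labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return tp / pos, fp / neg

def equalized_odds_difference(groups):
    """Max of the TPR gap and the FPR gap across demographic groups."""
    tprs, fprs = zip(*(rates(t, p) for t, p in groups))
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

# Invented test-set predictions for two age groups (1 = NSCLC, 0 = control).
young = ([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 0])  # TPR 2/3, FPR 0
old = ([1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 1, 0])    # TPR 1,   FPR 1/3
print(round(equalized_odds_difference([young, old]), 2))  # -> 0.33
```

A value of 0 would mean the classifier's error rates are identical across groups; the study's observed 0.15 (age) and 0.09 (gender) indicate modest disparities on this scale.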
SynSUM -- Synthetic Benchmark with Structured and Unstructured Medical Records
Rabaey, Paloma, Arno, Henri, Heytens, Stefan, Demeester, Thomas
We present the SynSUM benchmark, a synthetic dataset linking unstructured clinical notes to structured background variables. The dataset consists of 10,000 artificial patient records containing tabular variables (like symptoms, diagnoses and underlying conditions) and related notes describing the fictional patient encounter in the domain of respiratory diseases. The tabular portion of the data is generated through a Bayesian network, where both the causal structure between the variables and the conditional probabilities are proposed by an expert based on domain knowledge. We then prompt a large language model (GPT-4o) to generate a clinical note related to this patient encounter, describing the patient symptoms and additional context. The SynSUM dataset is primarily designed to facilitate research on clinical information extraction in the presence of tabular background variables, which can be linked through domain knowledge to concepts of interest to be extracted from the text - the symptoms, in the case of SynSUM. Secondary uses include research on the automation of clinical reasoning over both tabular data and text, causal effect estimation in the presence of tabular and/or textual confounders, and multi-modal synthetic data generation. The dataset can be downloaded from https://github.com/prabaey/SynSUM.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
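SynSUM's tabular records come from ancestral sampling of an expert-specified Bayesian network: parents are drawn first, then children from their conditional probability tables. A toy two-variable version of that process follows; the structure and probabilities are invented for illustration and are not SynSUM's.

```python
import random

random.seed(0)

# Toy network: asthma -> cough, with expert-style conditional probabilities.
P_ASTHMA = 0.1
P_COUGH_GIVEN = {True: 0.7, False: 0.2}  # P(cough | asthma)

def sample_patient():
    """Ancestral sampling: draw the parent variable, then its child."""
    asthma = random.random() < P_ASTHMA
    cough = random.random() < P_COUGH_GIVEN[asthma]
    return {"asthma": asthma, "cough": cough}

records = [sample_patient() for _ in range(10_000)]
cough_rate = sum(r["cough"] for r in records) / len(records)
# Expected marginal: 0.1 * 0.7 + 0.9 * 0.2 = 0.25
print(round(cough_rate, 2))
```

Each sampled record could then be serialized into an LLM prompt to produce the matching free-text encounter note, which is the role GPT-4o plays in the SynSUM pipeline.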
AI can spot tuberculosis early by listening to your cough
The same underlying technology powering massively popular generative AI models from large tech firms like OpenAI is now being used to scan for early signs of lung disease. Google, one of the leaders in new AI models, is partnering with a healthcare startup that's analyzing vast datasets of coughs and sneezes to detect signs of tuberculosis or other respiratory diseases before they get worse. It's one of numerous ways the quickly evolving technology is rapidly reshaping early detection of disease across the healthcare industry. What happens once that initial diagnosis is made, however, still requires quintessential human clinical expertise. Earlier this year, Google released details about a new healthcare self-supervised, deep-learning model they dubbed Health Acoustic Representations (HeAR).
The Morning After: Reddit is blocking AI search engines that don't cough up for access
When Reddit said last month it would block unauthorized data scraping from its site, most of us assumed it was to tackle chatbot training. It turns out the site/service/fandom battleground also appears to be blocking search engines other than Brave and Google, the latter of which reportedly inked a deal earlier this year with Reddit worth $60 million annually. A Reddit spokesperson told Engadget the empty search results are because these engines won't agree to the company's requirements for AI training. The company says it's in discussions with several of them. Bing and DuckDuckGo both appear to be affected.
- Media > News (1.00)
- Leisure & Entertainment > Games > Computer Games (0.35)
AI companies are finally being forced to cough up for training data
AI companies have pillaged the internet for training data, and many websites and data set owners have started restricting the ability to scrape their websites. We've also seen a backlash against the AI sector's practice of indiscriminately scraping online data, in the form of users opting out of making their data available for training and lawsuits from artists, writers, and the New York Times, claiming that AI companies have taken their intellectual property without consent or compensation. My colleague James O'Donnell dissects the lawsuits in his story and points out that these lawsuits could determine the future of AI music. But this moment also sets an interesting precedent for all of generative AI development. Thanks to the scarcity of high-quality data and the immense pressure and demand to build even bigger and better models, we're in a rare moment where data owners actually have some leverage.
- Law > Intellectual Property & Technology Law (0.54)
- Media > Music (0.34)