 mixstyle


iMedic: Towards Smartphone-based Self-Auscultation Tool for AI-Powered Pediatric Respiratory Assessment

Jeong, Seung Gyu, Nam, Sung Woo, Jung, Seong Kwan, Kim, Seong-Eun

arXiv.org Artificial Intelligence

Respiratory auscultation is crucial for early detection of pediatric pneumonia, a condition that can quickly worsen without timely intervention. In areas with limited physician access, effective auscultation is challenging. We present a smartphone-based system that leverages built-in microphones and advanced deep learning algorithms to detect abnormal respiratory sounds indicative of pneumonia risk. Our end-to-end deep learning framework employs domain generalization to integrate a large electronic stethoscope dataset with a smaller smartphone-derived dataset, enabling robust feature learning for accurate respiratory assessments without expensive equipment. The accompanying mobile application guides caregivers in collecting high-quality lung sound samples and provides immediate feedback on potential pneumonia risks. User studies show strong classification performance and high acceptance, demonstrating the system's ability to facilitate proactive interventions and reduce preventable childhood pneumonia deaths. By seamlessly integrating into ubiquitous smartphones, this approach offers a promising avenue for more equitable and comprehensive remote pediatric care.


GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion

Liu, Xueyi, Yi, Li

arXiv.org Artificial Intelligence

In this work, we tackle the challenging problem of denoising hand-object interactions (HOI). Given an erroneous interaction sequence, the objective is to refine the incorrect hand trajectory to remove interaction artifacts for a perceptually realistic sequence. This challenge involves intricate interaction noise, including unnatural hand poses and incorrect hand-object relations, alongside the necessity for robust generalization to new interactions and diverse noise patterns. We tackle those challenges through a novel approach, GeneOH Diffusion, incorporating two key designs: an innovative contact-centric HOI representation named GeneOH and a new domain-generalizable denoising scheme. The contact-centric representation GeneOH informatively parameterizes the HOI process, facilitating enhanced generalization across various HOI scenarios. The new denoising scheme consists of a canonical denoising model trained to project noisy data samples from a whitened noise space to a clean data manifold and a "denoising via diffusion" strategy which can handle input trajectories with various noise patterns by first diffusing them to align with the whitened noise space and cleaning via the canonical denoiser. Extensive experiments on four benchmarks with significant domain variations demonstrate the superior effectiveness of our method. GeneOH Diffusion also shows promise for various downstream applications. Project website: https://meowuu7.github.io/GeneOH-Diffusion/.


Test-Time Style Shifting: Handling Arbitrary Styles in Domain Generalization

Park, Jungwuk, Han, Dong-Jun, Kim, Soyeong, Moon, Jaekyun

arXiv.org Artificial Intelligence

In domain generalization (DG), the target domain is unknown when the model is being trained, and the trained model should work on an arbitrary (and possibly unseen) target domain during inference. This is a difficult problem, and despite active study in recent years, it remains a great challenge. In this paper, we take a simple yet effective approach to tackle this issue. We propose test-time style shifting, which shifts the style of a test sample that has a large style gap with the source domains to the nearest source domain that the model is already familiar with, before making the prediction. This strategy enables the model to handle any target domain with arbitrary style statistics, without any additional model updates at test time. Additionally, we propose style balancing, which maximizes the advantage of test-time style shifting by handling DG-specific imbalance issues. The proposed ideas are easy to implement and work in conjunction with various other DG schemes. Experimental results on different datasets show the effectiveness of our methods.
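The shifting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how style is measured or how the "large style gap" is decided, so everything here is an assumption for illustration: style is taken to be per-channel mean and standard deviation, the gap is a Euclidean distance over those statistics, and `gap_threshold`, `style_of`, `style_distance`, and `shift_style` are hypothetical names, not the authors' API.

```python
import math


def style_of(features, eps=1e-6):
    """Per-channel (mean, std) of one sample; `features` is a list of channels."""
    stats = []
    for channel in features:
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        stats.append((mean, math.sqrt(var + eps)))
    return stats


def style_distance(a, b):
    """Euclidean distance between two style descriptions (an assumed metric)."""
    return math.sqrt(
        sum((ma - mb) ** 2 + (sa - sb) ** 2 for (ma, sa), (mb, sb) in zip(a, b))
    )


def shift_style(features, source_styles, gap_threshold):
    """Re-style a test sample with the nearest source domain's statistics.

    `source_styles` maps a domain name to its per-channel (mean, std) list.
    Returns (features, None) when the sample is already close to a source
    domain, otherwise (shifted_features, name_of_nearest_domain).
    """
    test_style = style_of(features)
    name, nearest = min(
        source_styles.items(),
        key=lambda item: style_distance(test_style, item[1]),
    )
    if style_distance(test_style, nearest) <= gap_threshold:
        return features, None  # small gap: leave the sample untouched
    shifted = [
        # Normalize with the sample's own statistics, then apply the source's.
        [src_std * (v - mean) / std + src_mean for v in channel]
        for channel, (mean, std), (src_mean, src_std) in zip(
            features, test_style, nearest
        )
    ]
    return shifted, name
```

For example, `shift_style([[4.0, 6.0, 8.0]], {"photo": [(0.0, 1.0)], "sketch": [(5.0, 2.0)]}, gap_threshold=0.5)` selects the hypothetical "sketch" domain and returns features whose channel mean matches that domain's mean. No model parameters are touched, which is in line with the abstract's claim that no model update is needed at test time.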


Calibration-Free Driver Drowsiness Classification based on Manifold-Level Augmentation

Kim, Dong-Young, Han, Dong-Kyun, Shin, Hye-Bin

arXiv.org Artificial Intelligence

Drowsiness reduces concentration and increases response time, which can cause fatal road accidents. Monitoring drivers' drowsiness levels by electroencephalogram (EEG) and taking action may prevent road accidents. EEG signals are effective for monitoring the driver's mental state because they capture brain dynamics. However, calibration is required in advance because EEG signals vary between and within subjects. Because of this inconvenience, calibration has reduced the accessibility of the brain-computer interface (BCI). Developing a generalized classification model is similar to domain generalization, which aims to overcome the domain shift problem. Data augmentation in particular is frequently used for this purpose. This paper proposes a calibration-free framework for driver drowsiness state classification using manifold-level augmentation. The framework increases the diversity of source domains by applying augmentation to intermediate features. We experimented with various augmentation methods to improve generalization performance. Based on the results of the experiments, we found that deeper models with smaller kernel sizes improved generalizability. In addition, applying augmentation at the manifold level resulted in an outstanding improvement. The framework demonstrated the capability for calibration-free BCI.


MixStyle Neural Networks for Domain Generalization and Adaptation

Zhou, Kaiyang, Yang, Yongxin, Qiao, Yu, Xiang, Tao

arXiv.org Artificial Intelligence

Convolutional neural networks (CNNs) often have poor generalization performance under domain shift. One way to improve domain generalization is to collect diverse source data from multiple relevant domains so that a CNN model is allowed to learn more domain-invariant, and hence generalizable, representations. In this work, we address domain generalization with MixStyle, a plug-and-play, parameter-free module that is simply inserted into shallow CNN layers and requires no modification to training objectives. Specifically, MixStyle probabilistically mixes feature statistics between instances. This idea is inspired by the observation that visual domains can often be characterized by image styles, which are in turn encapsulated within instance-level feature statistics in shallow CNN layers. Therefore, inserting MixStyle modules in effect synthesizes novel domains, albeit in an implicit way. MixStyle is not only simple and flexible, but also versatile: it can be used for problems where unlabeled images are available, such as semi-supervised domain generalization and unsupervised domain adaptation, with a simple extension to mix feature statistics between labeled and pseudo-labeled instances. We demonstrate through extensive experiments that MixStyle can significantly boost out-of-distribution generalization performance across a wide range of tasks, including object recognition, instance retrieval, and reinforcement learning.
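The idea of probabilistically mixing instance-level feature statistics can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the abstract does not give the mixing rule, so the Beta-distributed mixing weight, the defaults `p=0.5` and `alpha=0.1`, the random pairing of instances, and the helper names `channel_stats` and `mixstyle` are all assumptions. Feature maps are represented as nested lists (batch, channel, flattened spatial values) to keep the example dependency-free.

```python
import math
import random


def channel_stats(channel, eps=1e-6):
    """Mean and standard deviation of one channel's flattened activations."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def mixstyle(batch, p=0.5, alpha=0.1, rng=random):
    """Mix per-channel feature statistics between randomly paired instances.

    `batch` is a list of instances; each instance is a list of channels;
    each channel is a list of floats. With probability 1 - p the batch is
    returned unchanged, which is what makes the mixing probabilistic.
    """
    if rng.random() >= p:
        return batch
    # Pair every instance with a randomly chosen partner from the same batch.
    partner_index = rng.sample(range(len(batch)), len(batch))
    mixed_batch = []
    for i, instance in enumerate(batch):
        lam = rng.betavariate(alpha, alpha)  # assumed mixing-weight prior
        partner = batch[partner_index[i]]
        mixed_instance = []
        for channel, other in zip(instance, partner):
            mean, std = channel_stats(channel)
            other_mean, other_std = channel_stats(other)
            mix_mean = lam * mean + (1 - lam) * other_mean
            mix_std = lam * std + (1 - lam) * other_std
            # Remove the instance's own style, then apply the mixed style.
            mixed_instance.append(
                [mix_std * (v - mean) / std + mix_mean for v in channel]
            )
        mixed_batch.append(mixed_instance)
    return mixed_batch
```

The function has no learnable parameters and leaves the normalized content of each channel intact, changing only its mean and scale, which matches the abstract's description of a parameter-free module that needs no change to the training objective. In a real network it would presumably be applied only during training and only after shallow layers.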