Conditional Generative Adversarial Networks


Conditional Generative Adversarial Networks Based Inertial Signal Translation

Kolakowski, Marcin

arXiv.org Artificial Intelligence

The paper presents an approach in which inertial signals measured with a wrist-worn sensor (e.g., a smartwatch) are translated into those that would be recorded using a shoe-mounted sensor, enabling the use of state-of-the-art gait analysis methods. In the study, the signals are translated using Conditional Generative Adversarial Networks (GANs). Two different GAN versions are used for experimental verification: traditional ones trained using binary cross-entropy loss, and Wasserstein GANs (WGANs). For the generator, two architectures, a convolutional autoencoder and a convolutional U-Net, are tested. The experimental results have shown that the proposed approach allows for accurate translation, enabling the use of wrist-sensor inertial signals for efficient, everyday gait analysis.
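
The abstract names two candidate generator architectures; as a rough illustration of one of them, here is a minimal sketch (not the authors' code) of a 1D convolutional U-Net that maps a window of wrist IMU channels to shoe IMU channels. Channel counts, window length, and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class UNet1D(nn.Module):
    """Tiny 1D U-Net: one downsampling stage, one skip connection."""
    def __init__(self, in_ch=6, out_ch=6, base=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv1d(in_ch, base, 9, padding=4), nn.ReLU())
        self.down = nn.Conv1d(base, base * 2, 4, stride=2, padding=1)
        self.bott = nn.Sequential(nn.Conv1d(base * 2, base * 2, 9, padding=4), nn.ReLU())
        self.up = nn.ConvTranspose1d(base * 2, base, 4, stride=2, padding=1)
        self.dec1 = nn.Conv1d(base * 2, out_ch, 9, padding=4)  # *2: skip concat

    def forward(self, x):             # x: (batch, in_ch, time)
        e1 = self.enc1(x)             # kept as the skip connection
        b = self.bott(self.down(e1))
        u = self.up(b)
        return self.dec1(torch.cat([u, e1], dim=1))

g = UNet1D()
wrist = torch.randn(8, 6, 256)        # 8 windows, 6 IMU channels, 256 samples
shoe_fake = g(wrist)                  # translated "shoe-mounted" signals
print(shoe_fake.shape)                # torch.Size([8, 6, 256])
```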


Generative Artificial Intelligence in Medical Imaging: Foundations, Progress, and Clinical Translation

Zhou, Xuanru, Li, Cheng, Wang, Shuqiang, Li, Ye, Tan, Tao, Zheng, Hairong, Wang, Shanshan

arXiv.org Artificial Intelligence

Generative artificial intelligence (AI) is rapidly transforming medical imaging by enabling capabilities such as data synthesis, image enhancement, modality translation, and spatiotemporal modeling. This review presents a comprehensive and forward-looking synthesis of recent advances in generative modeling, including generative adversarial networks (GANs), variational autoencoders (VAEs), diffusion models, and emerging multimodal foundation architectures, and evaluates their expanding roles across the clinical imaging continuum. We systematically examine how generative AI contributes to key stages of the imaging workflow, from acquisition and reconstruction to cross-modality synthesis, diagnostic support, and treatment planning. Emphasis is placed on both retrospective and prospective clinical scenarios, where generative models help address longstanding challenges such as data scarcity, standardization, and integration across modalities. To promote rigorous benchmarking and translational readiness, we propose a three-tiered evaluation framework encompassing pixel-level fidelity, feature-level realism, and task-level clinical relevance. We also identify critical obstacles to real-world deployment, including generalization under domain shift, hallucination risk, data privacy concerns, and regulatory hurdles. Finally, we explore the convergence of generative AI with large-scale foundation models, highlighting how this synergy may enable the next generation of scalable, reliable, and clinically integrated imaging systems. By charting technical progress and translational pathways, this review aims to guide future research and foster interdisciplinary collaboration at the intersection of AI, medicine, and biomedical engineering.
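
As an illustrative sketch of the lowest tier of the review's three-tier evaluation idea, the snippet below measures pixel-level fidelity with PSNR between a reference image and a generated one. Feature-level realism (e.g., FID) and task-level clinical relevance would require feature extractors and downstream tasks, which are beyond this snippet; the metric choice here is an assumption, not the review's prescription.

```python
import torch

def psnr(ref: torch.Tensor, gen: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = torch.mean((ref - gen) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

ref = torch.rand(1, 1, 128, 128)                        # stand-in acquired image
gen = (ref + 0.05 * torch.randn_like(ref)).clamp(0, 1)  # stand-in synthetic image
print(f"PSNR: {psnr(ref, gen):.2f} dB")
```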


Emotion Detection Using Conditional Generative Adversarial Networks (cGAN): A Deep Learning Approach

Srivastava, Anushka

arXiv.org Artificial Intelligence

Emotion recognition is a key task in affective computing with applications in healthcare, human-computer interaction, and surveillance systems. This study proposes a Conditional Generative Adversarial Network (cGAN)-based approach to generate synthetic emotion-specific facial images to augment training data and mitigate class imbalance. The generator learns to synthesize grayscale 64×64 facial images conditioned on emotion labels, while the discriminator distinguishes between real and generated images using label conditioning. The model was trained on the FER-2013 dataset and evaluated over 300 epochs. Training results demonstrate stable adversarial loss convergence, indicating effective learning and generation capability.
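
A hedged sketch of the label conditioning described above (not the author's exact model): an embedding of the emotion label is concatenated with the noise vector in the generator and with image features in the discriminator. Layer sizes, the embedding width, and the seven FER-2013 emotion classes are assumptions.

```python
import torch
import torch.nn as nn

N_CLASSES, Z_DIM, IMG = 7, 100, 64

class G(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(N_CLASSES, 50)
        self.net = nn.Sequential(
            nn.Linear(Z_DIM + 50, 512), nn.ReLU(),
            nn.Linear(512, IMG * IMG), nn.Tanh())   # grayscale 64x64 in [-1, 1]

    def forward(self, z, y):
        h = torch.cat([z, self.emb(y)], dim=1)      # noise + label embedding
        return self.net(h).view(-1, 1, IMG, IMG)

class D(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(N_CLASSES, 50)
        self.net = nn.Sequential(
            nn.Linear(IMG * IMG + 50, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1))                      # real-vs-fake logit

    def forward(self, x, y):
        h = torch.cat([x.view(x.size(0), -1), self.emb(y)], dim=1)
        return self.net(h)

z = torch.randn(4, Z_DIM)
y = torch.randint(0, N_CLASSES, (4,))
fake = G()(z, y)                      # (4, 1, 64, 64) label-conditioned faces
logits = D()(fake, y)                 # label-conditioned real/fake scores
```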


Learning from Limited and Imperfect Data

Rangwani, Harsh

arXiv.org Artificial Intelligence

The distribution of data in the world (e.g., the internet) significantly differs from well-curated datasets and is often over-populated with samples from common categories. Algorithms designed for well-curated datasets perform suboptimally when used for learning from imperfect datasets with long-tailed imbalances and distribution shifts. To expand the use of deep models, it is essential to overcome the labor-intensive curation process by developing robust algorithms that can learn from diverse, real-world data distributions. Toward this goal, we develop practical algorithms for Deep Neural Networks which can learn from limited and imperfect data present in the real world. This thesis is divided into four segments, each covering a scenario of learning from limited or imperfect data. The first part of the thesis focuses on Learning Generative Models from Long-Tail Data, where we mitigate mode collapse and enable diverse aesthetic image generation for tail (minority) classes. In the second part, we enable effective generalization on tail classes through Inductive Regularization schemes, which allow tail classes to generalize as effectively as the head classes without requiring explicit generation of images. In the third part, we develop algorithms for Optimizing Relevant Metrics for learning from long-tailed data with limited annotation (semi-supervised), followed by the fourth part, which focuses on the Efficient Domain Adaptation of the model to various domains with very few to zero labeled samples.


Hybrid Adversarial Spectral Loss Conditional Generative Adversarial Networks for Signal Data Augmentation in Ultra-precision Machining Surface Roughness Prediction

Shang, Suiyan, Cheung, Chi Fai, Zheng, Pai

arXiv.org Artificial Intelligence

Accurate surface roughness prediction in ultra-precision machining (UPM) is critical for real-time quality control, but small datasets hinder model performance. We propose HAS-CGAN, a Hybrid Adversarial Spectral Loss CGAN, for effective UPM data augmentation. Among five CGAN variants tested, HAS-CGAN excels in 1D force signal generation, particularly for high-frequency signals, achieving >0.85 wavelet coherence through Fourier-domain optimization. By combining generated signals with machining parameters, prediction accuracy significantly improves. Experiments with traditional ML (SVR, RF, LSTM) and deep learning models (BPNN, 1DCNN, CNN-Transformer) demonstrate that augmenting training data with 520+ synthetic samples reduces prediction error from 31.4% (original 52 samples) to ~9%, effectively addressing data scarcity in UPM roughness prediction.
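
The abstract credits Fourier-domain optimization for HAS-CGAN's high-frequency fidelity but does not give the formulation, so the following is an assumed sketch: an L1 loss between amplitude spectra of fake and real 1D force signals, combined with a standard adversarial term into a hybrid generator loss. The weight `lam` is illustrative.

```python
import torch
import torch.nn.functional as F

def spectral_loss(fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    """L1 distance between amplitude spectra of 1D signals, shape (batch, time)."""
    fake_amp = torch.abs(torch.fft.rfft(fake, dim=-1))
    real_amp = torch.abs(torch.fft.rfft(real, dim=-1))
    return F.l1_loss(fake_amp, real_amp)

def hybrid_generator_loss(d_fake_logits, fake, real, lam=10.0):
    # Adversarial term: the generator tries to make D call its output real.
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    return adv + lam * spectral_loss(fake, real)

fake, real = torch.randn(8, 1024), torch.randn(8, 1024)
print(hybrid_generator_loss(torch.randn(8, 1), fake, real))
```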


RainScaleGAN: a Conditional Generative Adversarial Network for Rainfall Downscaling

Iotti, Marcello, Davini, Paolo, von Hardenberg, Jost, Zappa, Giuseppe

arXiv.org Artificial Intelligence

To this day, accurately simulating local-scale precipitation and reliably reproducing its distribution remains a challenging task. The limited horizontal resolution of Global Climate Models is among the primary factors undermining their skill in this context. The physical mechanisms driving the onset and development of precipitation, especially in extreme events, operate at spatio-temporal scales smaller than those numerically resolved, and thus struggle to be captured accurately. In order to circumvent this limitation, several downscaling approaches have been developed over the last decades to address the discrepancy between the spatial resolution of model output and the resolution required by local-scale applications. In this paper, we introduce RainScaleGAN, a conditional deep convolutional Generative Adversarial Network (GAN) for precipitation downscaling. GANs have been effectively used in image super-resolution, an approach highly relevant for downscaling tasks. RainScaleGAN's capabilities are tested in a perfect-model setup, where the spatial resolution of a precipitation dataset is artificially degraded from 0.25$^{\circ}\times$0.25$^{\circ}$ to 2$^{\circ}\times$2$^\circ$, and RainScaleGAN is used to restore it. The developed model outperforms one of the leading precipitation downscaling methods found in the literature. RainScaleGAN not only generates a synthetic dataset featuring plausible high-resolution spatial patterns and intensities, but also produces a precipitation distribution with statistics closely mirroring those of the ground-truth dataset. Given that RainScaleGAN's approach is agnostic with respect to the underlying physics, the method has the potential to be applied to other physical variables such as surface winds or temperature.
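
A sketch of the perfect-model setup described above: the high-resolution precipitation field is degraded by an 8×8 block average (0.25° to 2°), and a generator would then be trained to restore it. Grid sizes are illustrative, and the upsampling generator itself is omitted for brevity.

```python
import torch
import torch.nn.functional as F

hi_res = torch.rand(1, 1, 64, 64)             # 64x64 cells at 0.25 degrees
lo_res = F.avg_pool2d(hi_res, kernel_size=8)  # 8x8 cells at 2 degrees
print(lo_res.shape)                           # torch.Size([1, 1, 8, 8])
# A conditional generator G(lo_res, noise) is then trained so that its output
# matches hi_res in both spatial pattern and rainfall statistics.
```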


Determination of galaxy photometric redshifts using Conditional Generative Adversarial Networks (CGANs)

Garcia-Fernandez, M.

arXiv.org Artificial Intelligence

Accurate and reliable photometric redshift determination is one of the key aspects of wide-field photometric surveys. Determining photometric redshifts for galaxies has traditionally been solved with machine-learning and artificial-intelligence techniques trained on a calibration sample of galaxies for which both photometry and spectrometry are available. In this paper, we present a new algorithmic approach for determining photometric redshifts of galaxies using Conditional Generative Adversarial Networks (CGANs). The proposed CGAN implementation approaches photometric redshift determination as a probabilistic regression: instead of determining a single value for the estimated redshift of a galaxy, a full probability density is computed. The proposed methodology is tested with data from Dark Energy Survey (DES) Y1 and compared with existing algorithms such as a Random Forest regressor.
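
A hedged sketch of the probabilistic-regression idea above: a trained conditional generator g(z, photometry) is sampled many times for one galaxy, and the samples form an empirical redshift density rather than a point estimate. The generator here is an untrained stand-in, and the five-band photometry and latent width are assumptions.

```python
import torch
import torch.nn as nn

# Stand-in conditional generator: latent vector (16) + photometry (5) -> redshift.
g = nn.Sequential(nn.Linear(16 + 5, 64), nn.ReLU(), nn.Linear(64, 1))

photometry = torch.rand(5)                    # e.g., 5 broadband magnitudes
z = torch.randn(1000, 16)                     # 1000 latent draws
cond = photometry.expand(1000, -1)            # same galaxy for every draw
samples = g(torch.cat([z, cond], dim=1)).squeeze(1)
pdf = torch.histc(samples, bins=50)           # empirical p(redshift | photometry)
print(samples.mean(), samples.std())          # point estimate plus uncertainty
```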


Applying Conditional Generative Adversarial Networks for Imaging Diagnosis

Yang, Haowei, Hu, Yuxiang, He, Shuyao, Xu, Ting, Yuan, Jiajie, Gu, Xingxin

arXiv.org Artificial Intelligence

This study introduces an innovative application of Conditional Generative Adversarial Networks (C-GAN) integrated with Stacked Hourglass Networks (SHGN) aimed at enhancing image segmentation, particularly in the challenging environment of medical imaging. We address the problem of overfitting, common in deep learning models applied to complex imaging datasets, by augmenting data through rotation and scaling. A hybrid loss function combining L1 and L2 reconstruction losses, enriched with adversarial training, is introduced to refine segmentation processes in intravascular ultrasound (IVUS) imaging. Our approach is unique in its capacity to accurately delineate distinct regions within medical images, such as tissue boundaries and vascular structures, without extensive reliance on domain-specific knowledge. The algorithm was evaluated using a standard medical image library, showing superior performance metrics compared to existing methods, thereby demonstrating its potential in enhancing automated medical diagnostics through deep learning.
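
A minimal sketch of the hybrid loss described above, assuming a standard adversarial term; the weights w1, w2, and w_adv are illustrative, since the abstract does not specify them.

```python
import torch
import torch.nn.functional as F

def hybrid_seg_loss(pred_mask, true_mask, d_logits, w1=1.0, w2=1.0, w_adv=0.1):
    l1 = F.l1_loss(pred_mask, true_mask)        # robust reconstruction term
    l2 = F.mse_loss(pred_mask, true_mask)       # penalizes large errors
    adv = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits))    # generator tries to fool D
    return w1 * l1 + w2 * l2 + w_adv * adv

pred = torch.rand(2, 1, 128, 128)
true = torch.rand(2, 1, 128, 128)
print(hybrid_seg_loss(pred, true, torch.randn(2, 1)))
```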


(Un)paired signal-to-signal translation with 1D conditional GANs

Easthope, Eric

arXiv.org Artificial Intelligence

The past few years have seen a significant rise in research and public interest in the use of generative machine learning and artificial intelligence (ML/AI) models for image-to-image translation tasks. Perhaps one of the more recognizable models is pix2pix [3], a deep generative model (DGM), and particularly a deep convolutional generative adversarial network (DCGAN) [2, 7], capable of translating between pairs of high-resolution images within a learned image data domain. The novelty of pix2pix lay in its model architecture, which combined a deep U-Net generator that learns to generate mock data samples with a convolutional PatchGAN discriminator that learns to label regions, "patches," of inputs as "real" (sampled data) or "fake" (generated data). Much of the research interest in pix2pix has centred on image translation tasks, but the inherent structure of the U-Net model does not limit it to images alone; in fact, U-Net was originally developed for semantic segmentation [5]. Research into GANs, within the wider DGM and even wider generative ML/AI ecosystem, has not been limited to images either. Parallel work on one-dimensional (1D) GANs trained on periodic time series [1] has observed that models which decompose established two-dimensional architectures into 1D counterparts with a wider learning aperture, set by the size of the convolution kernels, are capable of generating convincing, high-accuracy 1D time series (including audio) from a learned signal data domain. Wider convolutional apertures are necessary for models to see and learn the periodicity of the time series. Others have taken the conceptual essence of signal-to-signal translation and adapted its U-Net generators to other signal domains: spectrum translation [6] (spectral/frequency series-to-series), sensor translation [4] (time series-to-series, 2D), and sound translation [9] (time series-to-series, 1D), to name a few.
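
A sketch of the "wider learning aperture" point above: a 1D PatchGAN-style discriminator whose convolution kernels are deliberately wide, so each output "patch" score sees at least one full period of the signal. Kernel widths, strides, and channel counts are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class PatchD1D(nn.Module):
    """1D PatchGAN-style critic with wide kernels (wide 'aperture')."""
    def __init__(self, in_ch=1, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=width, stride=4, padding=width // 2),
            nn.LeakyReLU(0.2),
            nn.Conv1d(32, 1, kernel_size=width, stride=4, padding=width // 2))

    def forward(self, x):                # x: (batch, 1, time)
        return self.net(x)               # one real/fake logit per patch

d = PatchD1D()
sig = torch.sin(torch.linspace(0, 60, 4096)).view(1, 1, -1)  # periodic input
print(d(sig).shape)                      # torch.Size([1, 1, 257]): patch grid
```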


Turning Waste into Wealth: Leveraging Low-Quality Samples for Enhancing Continuous Conditional Generative Adversarial Networks

Ding, Xin, Wang, Yongwei, Xu, Zuheng

arXiv.org Artificial Intelligence

Continuous Conditional Generative Adversarial Networks (CcGANs) enable generative modeling conditional on continuous scalar variables (termed regression labels). However, they can produce subpar fake images due to limited training data. Although Negative Data Augmentation (NDA) effectively enhances unconditional and class-conditional GANs by introducing anomalies into real training images, guiding the GANs away from low-quality outputs, its impact on CcGANs is limited, as it fails to replicate negative samples that may occur during the CcGAN sampling. We present a novel NDA approach called Dual-NDA specifically tailored for CcGANs to address this problem. Dual-NDA employs two types of negative samples: visually unrealistic images generated from a pre-trained CcGAN and label-inconsistent images created by manipulating real images' labels. Leveraging these negative samples, we introduce a novel discriminator objective alongside a modified CcGAN training algorithm. Empirical analysis on UTKFace and Steering Angle reveals that Dual-NDA consistently enhances the visual fidelity and label consistency of fake images generated by CcGANs, exhibiting a substantial performance gain over the vanilla NDA. Moreover, by applying Dual-NDA, CcGANs demonstrate a remarkable advancement beyond the capabilities of state-of-the-art conditional GANs and diffusion models, establishing a new pinnacle of performance. Our codes can be found at https://github.com/UBCDingXin/Dual-NDA.
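
A hedged sketch of the Dual-NDA idea above: besides current generator fakes, the discriminator is also pushed to reject two kinds of negative samples, visually unrealistic images from a pre-trained CcGAN and label-inconsistent real images. The loss shape and the stand-in discriminator are assumptions; the authors' actual objective is in their repository.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d, real_x, real_y, fake_x, fake_y,
                       neg1_x, neg1_y, neg2_x, neg2_y):
    def bce(x, y, target):
        logits = d(x, y)
        return F.binary_cross_entropy_with_logits(
            logits, torch.full_like(logits, target))
    return (bce(real_x, real_y, 1.0)      # real images, true regression labels
            + bce(fake_x, fake_y, 0.0)    # current generator fakes
            + bce(neg1_x, neg1_y, 0.0)    # type I: unrealistic pre-trained fakes
            + bce(neg2_x, neg2_y, 0.0))   # type II: label-inconsistent reals

class StandInD(torch.nn.Module):
    """Placeholder label-conditioned discriminator, just for shape checking."""
    def forward(self, x, y):
        return x.flatten(1).mean(dim=1, keepdim=True) + y

d = StandInD()
x, y = torch.rand(4, 1, 64, 64), torch.rand(4, 1)
print(discriminator_loss(d, x, y, x, y, x, y, x, y))
```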