Goto

Collaborating Authors

 stargan



Neural Autoregressive Modeling of Brain Aging

Yesiloglu, Ridvan, Peng, Wei, Islam, Md Tauhidul, Adeli, Ehsan

arXiv.org Artificial Intelligence

Brain aging synthesis is a critical task with broad applications in clinical and computational neuroscience. The ability to predict the future structural evolution of a subject's brain from an earlier MRI scan provides valuable insights into aging trajectories. Yet, the high-dimensionality of data, subtle changes of structure across ages, and subject-specific patterns constitute challenges in the synthesis of the aging brain. To overcome these challenges, we propose NeuroAR, a novel brain aging simulation model based on generative autoregressive transformers. NeuroAR synthesizes the aging brain by autoregressively estimating the discrete token maps of a future scan from a convenient space of concatenated token embeddings of a previous and future scan. To guide the generation, it concatenates into each scale the subject's previous scan, and uses its acquisition age and the target age at each block via cross-attention. We evaluate our approach on both the elderly population and adolescent subjects, demonstrating superior performance over state-of-the-art generative models, including latent diffusion models (LDM) and generative adversarial networks, in terms of image fidelity. Furthermore, we employ a pre-trained age predictor to further validate the consistency and realism of the synthesized images with respect to expected aging patterns. NeuroAR significantly outperforms key models, including LDM, demonstrating its ability to model subject-specific brain aging trajectories with high fidelity.


Texture-Aware StarGAN for CT data harmonisation

Di Feola, Francesco, Pompilio, Ludovica, Assolito, Cecilia, Guarrasi, Valerio, Soda, Paolo

arXiv.org Artificial Intelligence

Computed Tomography (CT) plays a pivotal role in medical diagnosis; however, variability across reconstruction kernels hinders data-driven approaches, such as deep learning models, from achieving reliable and generalized performance. To this end, CT data harmonization has emerged as a promising solution to minimize such non-biological variances by standardizing data across different sources or conditions. In this context, Generative Adversarial Networks (GANs) have proved to be a powerful framework for harmonization, framing it as a style-transfer problem. However, GAN-based approaches still face limitations in capturing complex relationships within the images, which are essential for effective harmonization. In this work, we propose a novel texture-aware StarGAN for CT data harmonization, enabling one-to-many translations across different reconstruction kernels. Although the StarGAN model has been successfully applied in other domains, its potential for CT data harmonization remains unexplored. Furthermore, our approach introduces a multi-scale texture loss function that embeds texture information across different spatial and angular scales into the harmonization process, effectively addressing kernel-induced texture variations. We conducted extensive experimentation on a publicly available dataset, utilizing a total of 48667 chest CT slices from 197 patients distributed over three different reconstruction kernels, demonstrating the superiority of our method over the baseline StarGAN.


Reviews: Multi-marginal Wasserstein GAN

Neural Information Processing Systems

Multi Marginal Wasserstein GAN's goal is to match a source domain distribution to multiple target domain distributions. While dedicated GAN frameworks exist, as noted by the authors (CycleGAN, StarGAN, …), their generated samples suffer from blurriness, especially when resulting from multiple targets. Moreover, the main statement of this work is that MWGAN is theoretically motivated, unlike previous works. Computing the multi marginal Wasserstein distance between several domains is intractable in its primal form. Thus, as proposed in WGAN, the authors express their problem in the dual form, resulting in equation 1. Since, the dual formulation is a maximization problem under infinite constraint, which remains intractable, the authors simplify the problem by only considering it on its empirical version (equation 2).


Asymmetric GANs for Image-to-Image Translation

Tang, Hao, Sebe, Nicu

arXiv.org Artificial Intelligence

Existing models for unsupervised image translation with Generative Adversarial Networks (GANs) can learn the mapping from the source domain to the target domain using a cycle-consistency loss. However, these methods always adopt a symmetric network architecture to learn both forward and backward cycles. Because of the task complexity and cycle input difference between the source and target domains, the inequality in bidirectional forward-backward cycle translations is significant and the amount of information between two domains is different. In this paper, we analyze the limitation of existing symmetric GANs in asymmetric translation tasks, and propose an AsymmetricGAN model with both translation and reconstruction generators of unequal sizes and different parameter-sharing strategy to adapt to the asymmetric need in both unsupervised and supervised image translation tasks. Moreover, the training stage of existing methods has the common problem of model collapse that degrades the quality of the generated images, thus we explore different optimization losses for better training of AsymmetricGAN, making image translation with higher consistency and better stability. Extensive experiments on both supervised and unsupervised generative tasks with 8 datasets show that AsymmetricGAN achieves superior model capacity and better generation performance compared with existing GANs. To the best of our knowledge, we are the first to investigate the asymmetric GAN structure on both unsupervised and supervised image translation tasks.


Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration

Chandaliya, Praveen Kumar, Raja, Kiran, Ramachandra, Raghavendra, Akhtar, Zahid, Busch, Christoph

arXiv.org Artificial Intelligence

Numerous studies have shown that existing Face Recognition Systems (FRS), including commercial ones, often exhibit biases toward certain ethnicities due to under-represented data. In this work, we explore ethnicity alteration and skin tone modification using synthetic face image generation methods to increase the diversity of datasets. We conduct a detailed analysis by first constructing a balanced face image dataset representing three ethnicities: Asian, Black, and Indian. We then make use of existing Generative Adversarial Network-based (GAN) image-to-image translation and manifold learning models to alter the ethnicity from one to another. A systematic analysis is further conducted to assess the suitability of such datasets for FRS by studying the realistic skin-tone representation using Individual Typology Angle (ITA). Further, we also analyze the quality characteristics using existing Face image quality assessment (FIQA) approaches. We then provide a holistic FRS performance analysis using four different systems. Our findings pave the way for future research works in (i) developing both specific ethnicity and general (any to any) ethnicity alteration models, (ii) expanding such approaches to create databases with diverse skin tones, (iii) creating datasets representing various ethnicities which further can help in mitigating bias while addressing privacy concerns.


The SpongeNet Attack: Sponge Weight Poisoning of Deep Neural Networks

Lintelo, Jona te, Koffas, Stefanos, Picek, Stjepan

arXiv.org Artificial Intelligence

Sponge attacks aim to increase the energy consumption and computation time of neural networks deployed on hardware accelerators. Existing sponge attacks can be performed during inference via sponge examples or during training via Sponge Poisoning. Sponge examples leverage perturbations added to the model's input to increase energy and latency, while Sponge Poisoning alters the objective function of a model to induce inference-time energy/latency effects. In this work, we propose a novel sponge attack called SpongeNet. SpongeNet is the first sponge attack that is performed directly on the parameters of a pre-trained model. Our experiments show that SpongeNet can successfully increase the energy consumption of vision models with fewer samples required than Sponge Poisoning. Our experiments indicate that poisoning defenses are ineffective if not adjusted specifically for the defense against Sponge Poisoning (i.e., they decrease batch normalization bias values). Our work shows that SpongeNet is more effective on StarGAN than the state-of-the-art. Additionally, SpongeNet is stealthier than the previous Sponge Poisoning attack as it does not require significant changes in the victim model's weights. Our experiments indicate that the SpongeNet attack can be performed even when an attacker has access to only 1% of the entire dataset and reach up to 11% energy increase.


OT-driven Multi-Domain Unsupervised Ultrasound Image Artifact Removal using a Single CNN

Huh, Jaeyoung, Khan, Shujaat, Ye, Jong Chul

arXiv.org Machine Learning

Ultrasound imaging (US) often suffers from distinct image artifacts from various sources. Classic approaches for solving these problems are usually model-based iterative approaches that have been developed specifically for each type of artifact, which are often computationally intensive. Recently, deep learning approaches have been proposed as computationally efficient and high performance alternatives. Unfortunately, in the current deep learning approaches, a dedicated neural network should be trained with matched training data for each specific artifact type. This poses a fundamental limitation in the practical use of deep learning for US, since large number of models should be stored to deal with various US image artifacts. Inspired by the recent success of multi-domain image transfer, here we propose a novel, unsupervised, deep learning approach in which a single neural network can be used to deal with different types of US artifacts simply by changing a mask vector that switches between different target domains. Our algorithm is rigorously derived using an optimal transport (OT) theory for cascaded probability measures. Experimental results using phantom and in vivo data demonstrate that the proposed method can generate high quality image by removing distinct artifacts, which are comparable to those obtained by separately trained multiple neural networks.


MetalGAN: Multi-Domain Label-Less Image Synthesis Using cGANs and Meta-Learning

Fontanini, Tomaso, Iotti, Eleonora, Donati, Luca, Prati, Andrea

arXiv.org Machine Learning

Image synthesis is currently one of the most addressed image processing topic in computer vision and deep learning fields of study. Researchers have tackled this problem focusing their efforts on its several challenging problems, e.g. image quality and size, domain and pose changing, architecture of the networks, and so on. Above all, producing images belonging to different domains by using a single architecture is a very relevant goal for image generation. In fact, a single multi-domain network would allow greater flexibility and robustness in the image synthesis task than other approaches. This paper proposes a novel architecture and a training algorithm, which are able to produce multi-domain outputs using a single network. A small portion of a dataset is intentionally used, and there are no hard-coded labels (or classes). This is achieved by combining a conditional Generative Adversarial Network (cGAN) for image generation and a Meta-Learning algorithm for domain switch, and we called our approach MetalGAN. The approach has proved to be appropriate for solving the multi-domain problem and it is validated on facial attribute transfer, using CelebA dataset.


Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching

Chen, Ziliang, Yang, Zhanfu, Wang, Xiaoxi, Liang, Xiaodan, Yan, Xiaopeng, Li, Guanbin, Lin, Liang

arXiv.org Machine Learning

A broad range of cross-$m$-domain generation researches boil down to matching a joint distribution by deep generative models (DGMs). Hitherto algorithms excel in pairwise domains while as $m$ increases, remain struggling to scale themselves to fit a joint distribution. In this paper, we propose a domain-scalable DGM, i.e., MMI-ALI for $m$-domain joint distribution matching. As an $m$-domain ensemble model of ALIs \cite{dumoulin2016adversarially}, MMI-ALI is adversarially trained with maximizing Multivariate Mutual Information (MMI) w.r.t. joint variables of each pair of domains and their shared feature. The negative MMIs are upper bounded by a series of feasible losses that provably lead to matching $m$-domain joint distributions. MMI-ALI linearly scales as $m$ increases and thus, strikes a right balance between efficacy and scalability. We evaluate MMI-ALI in diverse challenging $m$-domain scenarios and verify its superiority.