AITopics

Technology: Information Technology > Artificial Intelligence (0.52)

Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

Assessing Generative Models via Precision and Recall

Neural Information Processing Systems, Nov-21-2025, 03:57:37 GMT

As a result, surrogate metrics are often used to assess the quality of the trained models.

artificial intelligence, machine learning, natural language, (14 more...)

Country: North America > Canada > Quebec > Montreal (0.06)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.38)

Neural Information Processing Systems, Nov-20-2025, 18:23:42 GMT

Visual Object Networks: Image Generation with Disentangled 3D Representations

Jun-Yan Zhu, Zhoutong Zhang, Chengkai Zhang, Jiajun Wu, Antonio Torralba, Josh Tenenbaum, Bill Freeman

We present a new generative model, Visual Object Networks (VON), synthesizing natural images of objects with a disentangled 3D representation.

artificial intelligence, machine learning, texture, (16 more...)

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.98)

Neural Information Processing Systems, Oct-2-2025, 03:50:51 GMT

PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

Yikang LI, Tao Ma, Yeqi Bai, Nan Duan, Sining Wei, Xiaogang Wang

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, scene graph, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Aug-15-2025, 15:54:22 GMT

a851bd0d418b13310dd1e5e3ac7318ab-Supplemental.pdf

generator, target distribution, top-k training, (10 more...)

Technology: Information Technology > Artificial Intelligence (0.52)

Farahzadi, Yeganeh, Ansarinia, Morteza, Kekecs, Zoltan

YARE-GAN: Yet Another Resting State EEG-GAN

arXiv.org Artificial Intelligence, Mar-4-2025

Generative Adversarial Networks (GANs) have shown promise in synthesising realistic neural data, yet their potential for unsupervised representation learning in resting-state EEG remains underexplored. In this study, we implement a Wasserstein GAN with Gradient Penalty (WGAN-GP) to generate multi-channel resting-state EEG data and assess the quality of the synthesised signals through both visual and feature-based evaluations. Our results indicate that the model effectively captures the statistical and spectral characteristics of real EEG data, although challenges remain in replicating high-frequency oscillations in the frontal region. Additionally, we demonstrate that the Critic's learned representations can be fine-tuned for age group classification, achieving an out-of-sample accuracy significantly better than a shuffled-label baseline. These findings suggest that generative models can serve not only as EEG data generators but also as unsupervised feature extractors, reducing the need for manual feature engineering. This study highlights the potential of GAN-based unsupervised learning for EEG analysis, suggesting avenues for more data-efficient deep learning applications in neuroscience.
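The comparison against a shuffled-label baseline mentioned in this abstract can be illustrated with a simple permutation test. The sketch below is not the authors' procedure (the abstract does not describe it); it assumes fixed predictions and only shuffles the labels, and the function names and toy data are hypothetical.

```python
import random


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def shuffled_label_test(predictions, labels, n_shuffles=1000, seed=0):
    """Compare observed accuracy with accuracies obtained on shuffled labels.

    Returns (observed accuracy, mean shuffled-label accuracy, p-value).
    """
    rng = random.Random(seed)
    observed = accuracy(predictions, labels)
    shuffled = list(labels)
    null_accuracies = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the link between inputs and labels
        null_accuracies.append(accuracy(predictions, shuffled))
    # Add-one smoothing keeps the estimated p-value strictly positive.
    exceed = sum(a >= observed for a in null_accuracies)
    p_value = (exceed + 1) / (n_shuffles + 1)
    return observed, sum(null_accuracies) / n_shuffles, p_value


# Toy data, made up for illustration: 40 balanced labels, 8 wrong predictions.
labels = [0, 1] * 20
predictions = list(labels)
for i in range(0, 40, 5):
    predictions[i] = 1 - predictions[i]

observed, null_mean, p_value = shuffled_label_test(predictions, labels)
print(f"observed={observed:.2f}, shuffled mean={null_mean:.2f}, p={p_value:.4f}")
```

A small p-value here only says that the observed accuracy is unlikely under random label assignment; a baseline that retrains the classifier on shuffled labels would be a stricter variant.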

dataset, eeg channel, representation, (14 more...)

2503.02636

Country: Europe > Hungary > Budapest > Budapest (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Das, Sagarnil, Walia, Pradeep

Enhancing Early Diabetic Retinopathy Detection through Synthetic DR1 Image Generation: A StyleGAN3 Approach

arXiv.org Artificial Intelligence, Jan-1-2025

Diabetic Retinopathy (DR) is a leading cause of preventable blindness. Early detection at the DR1 stage is critical but is hindered by a scarcity of high-quality fundus images. This study uses StyleGAN3 to generate synthetic DR1 images characterized by microaneurysms with high fidelity and diversity. The aim is to address data scarcity and enhance the performance of supervised classifiers. A dataset of 2,602 DR1 images was used to train the model, followed by a comprehensive evaluation using quantitative metrics, including Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and Equivariance with respect to translation (EQ-T) and rotation (EQ-R). Qualitative assessments included Human Turing tests, where trained ophthalmologists evaluated the realism of synthetic images. Spectral analysis further validated image quality. The model achieved a final FID score of 17.29, outperforming the mean FID of 21.18 (95 percent confidence interval: 20.83 to 21.56) derived from bootstrap resampling. Human Turing tests demonstrated the model's ability to produce highly realistic images, though minor artifacts near the borders were noted. These findings suggest that StyleGAN3-generated synthetic DR1 images hold significant promise for augmenting training datasets, enabling more accurate early detection of Diabetic Retinopathy. This methodology highlights the potential of synthetic data in advancing medical imaging and AI-driven diagnostics.
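A mean with a 95 percent confidence interval from bootstrap resampling, as reported in this abstract, is commonly computed with a percentile bootstrap. The sketch below is a minimal illustration of that general technique, not the authors' exact setup (the abstract does not say what was resampled); the function name and the score values are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap for the mean of `values`.

    Returns (mean of bootstrap means, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement and record their mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = int((1 - alpha / 2) * n_resamples) - 1
    return statistics.fmean(means), means[lower_index], means[upper_index]


# Hypothetical FID scores, made up for illustration only.
fid_scores = [20.1, 21.4, 22.0, 20.8, 21.9, 21.2, 20.5, 21.7]
mean, lower, upper = bootstrap_mean_ci(fid_scores)
print(f"mean={mean:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

Because lower FID is better, a single score below the lower bound of such an interval is what "outperforming the mean" would mean in this setting.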

artificial intelligence, machine learning, synthetic image, (17 more...)

2501.00954

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Dec-8-2023

Damage GAN: A Generative Model for Imbalanced Data

Anaissi, Ali, Jia, Yuanzhe, Braytee, Ali, Naji, Mohamad, Alyassine, Widad

This study delves into the application of Generative Adversarial Networks (GANs) within the context of imbalanced datasets. Our primary aim is to enhance the performance and stability of GANs in such datasets. In pursuit of this objective, we introduce a novel network architecture known as Damage GAN, building upon the ContraD GAN framework which seamlessly integrates GANs and contrastive learning. Through the utilization of contrastive learning, the discriminator is trained to develop an unsupervised representation capable of distinguishing all provided samples. Our approach draws inspiration from the straightforward framework for contrastive learning of visual representations (Sim-CLR), leading to the formulation of a distinctive loss function. We also explore the implementation of self-damaging contrastive learning (SD-CLR) to further enhance the optimization of the ContraD GAN model. Comparative evaluations against baseline models including the deep convolutional GAN (DCGAN) and ContraD GAN demonstrate the evident superiority of our proposed model, Damage GAN, in terms of generated image distribution, model stability, and image quality when applied to imbalanced datasets.

dataset, gan, learning, (15 more...)

doi: 10.1007/978-981-99-8696-5_4

2312.04862

Genre: Research Report > Promising Solution (0.46)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Berns, Sebastian, Colton, Simon, Guckelsberger, Christian

Towards Mode Balancing of Generative Models via Diversity Weights

arXiv.org Artificial Intelligence, Jun-15-2023

Large data-driven image models are extensively used to support creative and artistic work. Under the currently predominant distribution-fitting paradigm, a dataset is treated as ground truth to be approximated as closely as possible. Yet, many creative applications demand a diverse range of output, and creators often strive to actively diverge from a given data distribution. We argue that an adjustment of modelling objectives, from pure mode coverage towards mode balancing, is necessary to accommodate the goal of higher output diversity. We present diversity weights, a training scheme that increases a model's output diversity by balancing the modes in the training dataset. First experiments in a controlled setting demonstrate the potential of our method. We discuss connections of our approach to diversity, equity, and inclusion in generative machine learning more generally, and computational creativity specifically. An implementation of our algorithm is available at https://github.com/sebastianberns/diversity-weights

artificial intelligence, machine learning, natural language, (19 more...)

2304.11961

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Finland (0.04)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Jan-12-2023

Multimodal Deep Learning

Akkus, Cem, Chu, Luyang, Djakovic, Vladana, Jauch-Walser, Steffen, Koch, Philipp, Loss, Giacomo, Marquardt, Christopher, Moldovan, Marco, Sauter, Nadja, Schneider, Maximilian, Schulte, Rickmer, Urbanczyk, Karol, Goschenhofer, Jann, Heumann, Christian, Hvingelby, Rasmus, Schalk, Daniel, Aßenmacher, Matthias

FIGURE 1: LMU seal (left) style-transferred to Van Gogh's Sunflower painting (center) and blended with the prompt - Van Gogh, sunflowers - via CLIP+VGAN (right).

In the last few years, there have been several breakthroughs in the methodologies used in Natural Language Processing (NLP) as well as Computer Vision (CV). Beyond these improvements on single-modality models, large-scale multimodal approaches have become a very active area of research. In this seminar, we reviewed these approaches and attempted to create a solid overview of the field, starting with the current state-of-the-art approaches in the two subfields of Deep Learning individually. Further, modeling frameworks are discussed where one modality is transformed into the other (Chapter 3.1 and Chapter 3.2), as well as models in which one modality is utilized to enhance representation learning for the other (Chapter 3.3 and Chapter 3.4). To conclude the second part, architectures with a focus on handling both modalities simultaneously are introduced (Chapter 3.5). Finally, we also cover other modalities (Chapter 4.1 and Chapter 4.2) as well as general-purpose multi-modal models (Chapter 4.3), which are able to handle different tasks on different modalities within one unified architecture.

large language model, machine learning, natural language, (23 more...)