Goto

Collaborating Authors

 adversarial network


Neural Modulation for Flash Memory: An Unsupervised Learning Framework for Improved Reliability

Neural Information Processing Systems

The continued scaling of flash memory technology into smaller process nodes, combined with the increased information capacity of each flash cell (i.e, storing more bits per cell), has placed NAND flash memory at the forefront of modern storage technology.




AI-Augmented Pollen Recognition in Optical and Holographic Microscopy for Veterinary Imaging

Warshaneyan, Swarn S., Ivanovs, Maksims, Cugmas, Blaž, Bērziņa, Inese, Goldberga, Laura, Tamosiunas, Mindaugas, Kadiķis, Roberts

arXiv.org Machine Learning

We present a comprehensive study on fully automated pollen recognition across both conventional optical and digital in-line holographic microscopy (DIHM) images of sample slides. Visually recognizing pollen in unreconstructed holographic images remains challenging due to speckle noise, twin-image artifacts and substantial divergence from bright-field appearances. We establish the performance baseline by training YOLOv8s for object detection and MobileNetV3L for classification on a dual-modality dataset of automatically annotated optical and affinely aligned DIHM images. On optical data, detection mAP50 reaches 91.3% and classification accuracy reaches 97%, whereas on DIHM data, we achieve only 8.15% for detection mAP50 and 50% for classification accuracy. Expanding the bounding boxes of pollens in DIHM images over those acquired in aligned optical images achieves 13.3% for detection mAP50 and 54% for classification accuracy. To improve object detection in DIHM images, we employ a Wasserstein GAN with spectral normalization (WGAN-SN) to create synthetic DIHM images, yielding an FID score of 58.246. Mixing real-world and synthetic data at the 1.0 : 1.5 ratio for DIHM images improves object detection up to 15.4%. These results demonstrate that GAN-based augmentation can reduce the performance divide, bringing fully automated DIHM workflows for veterinary imaging a small but important step closer to practice.


On Translation and Reconstruction Guarantees of the Cycle-Consistent Generative Adversarial Networks

Neural Information Processing Systems

The task of unpaired image-to-image translation has witnessed a revolution with the introduction of the cycle-consistency loss to Generative Adversarial Networks (GANs). Numerous variants, with Cycle-Consistent Adversarial Network (CycleGAN) at their forefront, have shown remarkable empirical performance. The involvement of two unalike data spaces and the existence of multiple solution maps between them are some of the facets that make such architectures unique. In this study, we investigate the statistical properties of such unpaired data translator networks between distinct spaces, bearing the additional responsibility of cycle-consistency. In a density estimation setup, we derive sharp non-asymptotic bounds on the translation errors under suitably characterized models. This, in turn, points out sufficient regularity conditions that maps must obey to carry out successful translations. We further show that cycle-consistency is achieved as a consequence of the data being successfully generated in each space based on observations from the other. In a first-of-its-kind attempt, we also provide deterministic bounds on the cumulative reconstruction error. In the process, we establish tolerable upper bounds on the discrepancy responsible for ill-posedness in such networks.


Efficient Generation of Structured Objects with Constrained Adversarial Networks

Neural Information Processing Systems

Generative Adversarial Networks (GANs) struggle to generate structured objects like molecules and game maps. The issue is that structured objects must satisfy hard requirements (e.g., molecules must be chemically valid) that are difficult to acquire from examples alone. As a remedy, we propose Constrained Adversarial Networks (CANs), an extension of GANs in which the constraints are embedded into the model during training. This is achieved by penalizing the generator proportionally to the mass it allocates to invalid structures. In contrast to other generative models, CANs support efficient inference of valid structures (with high probability) and allows to turn on and off the learned constraints at inference time. CANs handle arbitrary logical constraints and leverage knowledge compilation techniques to efficiently evaluate the disagreement between the model and the constraints. Our setup is further extended to hybrid logical-neural constraints for capturing very complex constraints, like graph reachability. An extensive empirical analysis shows that CANs efficiently generate valid structures that are both high-quality and novel.


A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation

Jia, Zhigang, Wang, Duan, Wang, Hengkai, Xie, Yajun, Zhao, Meixiang, Zhao, Xiaoyu

arXiv.org Artificial Intelligence

Color image generation has a wide range of applications, but the existing generation models ignore the correlation among color channels, which may lead to chromatic aberration problems. In addition, the data distribution problem of color images has not been systematically elaborated and explained, so that there is still the lack of the theory about measuring different color images datasets. In this paper, we define a new quaternion Wasserstein distance and develop its dual theory. To deal with the quaternion linear programming problem, we derive the strong duality form with helps of quaternion convex set separation theorem and quaternion Farkas lemma. With using quaternion Wasserstein distance, we propose a novel Wasserstein quaternion generative adversarial network. Experiments demonstrate that this novel model surpasses both the (quaternion) generative adversarial networks and the Wasserstein generative adversarial network in terms of generation efficiency and image quality.


TraceTrans: Translation and Spatial Tracing for Surgical Prediction

Luo, Xiyu, Li, Haodong, Cheng, Xinxing, Zhao, He, Hu, Yang, Song, Xuan, Zhang, Tianyang

arXiv.org Artificial Intelligence

Image-to-image translation models have achieved notable success in converting images across visual domains and are increasingly used for medical tasks such as predicting post-operative outcomes and modeling disease progression. However, most existing methods primarily aim to match the target distribution and often neglect spatial correspondences between the source and translated images. This limitation can lead to structural inconsistencies and hallucinations, undermining the reliability and interpretability of the predictions. These challenges are accentuated in clinical applications by the stringent requirement for anatomical accuracy. In this work, we present TraceTrans, a novel deformable image translation model designed for post-operative prediction that generates images aligned with the target distribution while explicitly revealing spatial correspondences with the pre-operative input. The framework employs an encoder for feature extraction and dual decoders for predicting spatial deformations and synthesizing the translated image. The predicted deformation field imposes spatial constraints on the generated output, ensuring anatomical consistency with the source. Extensive experiments on medical cosmetology and brain MRI datasets demonstrate that TraceTrans delivers accurate and interpretable post-operative predictions, highlighting its potential for reliable clinical deployment.


Adversarial Ranking for Language Generation

Neural Information Processing Systems

Generative adversarial networks (GANs) have great successes on synthesizing data. However, the existing GANs restrict the discriminator to be a binary classifier, and thus limit their learning capacity for tasks that need to synthesize output with rich structures such as natural language descriptions. In this paper, we propose a novel generative adversarial network, RankGAN, for generating high-quality language descriptions. Rather than training the discriminator to learn and assign absolute binary predicate for individual data sample, the proposed RankGAN is able to analyze and rank a collection of human-written and machine-written sentences by giving a reference group. By viewing a set of data samples collectively and evaluating their quality through relative ranking scores, the discriminator is able to make better assessment which in turn helps to learn a better generator. The proposed RankGAN is optimized through the policy gradient technique. Experimental results on multiple public datasets clearly demonstrate the effectiveness of the proposed approach.