 resnet-101





3341f6f048384ec73a7ba2e77d2db48b-Paper.pdf

Neural Information Processing Systems

Instance segmentation, which seeks to obtain both class and instance labels for each pixel in the input image, is a challenging task in computer vision. State-of-the-art algorithms often employ a search-based strategy, which first divides the output image with a regular grid and generates proposals at each grid cell; the proposals are then classified and their boundaries refined.


Automated Monitoring of Cultural Heritage Artifacts Using Semantic Segmentation

Ranieri, Andrea, Palmieri, Giorgio, Biasotti, Silvia

arXiv.org Artificial Intelligence

This paper addresses the critical need for automated crack detection in the preservation of cultural heritage through semantic segmentation. We present a comparative study of U-Net architectures, using various convolutional neural network (CNN) encoders, for pixel-level crack identification on statues and monuments. A comparative quantitative evaluation is performed on the test set of the OmniCrack30k dataset [1] using popular segmentation metrics including Mean Intersection over Union (mIoU), Dice coefficient, and Jaccard index. This is complemented by an out-of-distribution qualitative evaluation on an unlabeled test set of real-world cracked statues and monuments. Our findings provide valuable insights into the capabilities of different CNN-based encoders for fine-grained crack segmentation. We show that the models exhibit promising generalization capabilities to unseen cultural heritage contexts, despite never having been explicitly trained on images of statues or monuments.
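The overlap metrics named in the abstract above can be illustrated with a minimal sketch for a single binary crack mask. This is not the paper's evaluation code: the function name `iou_and_dice`, the flat 0/1 mask representation, and the convention for two empty masks are all assumptions made for illustration. For binary masks the Jaccard index is the same quantity as IoU, and Dice follows from it as Dice = 2·IoU / (1 + IoU).

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as crack in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0

    iou = intersection / union                       # Jaccard index
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


# Toy 6-pixel masks (made-up values, not data from the paper).
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f}  Dice={dice:.3f}")
```

A mean IoU would average this per-class (or per-image) value; which averaging the paper uses is not stated in the abstract.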





GeMix: Conditional GAN-Based Mixup for Improved Medical Image Augmentation

Carlesso, Hugo, Patulea, Maria Eliza, Garouani, Moncef, Ionescu, Radu Tudor, Mothe, Josiane

arXiv.org Artificial Intelligence

Mixup has become a popular augmentation strategy for image classification, yet its naive pixel-wise interpolation often produces unrealistic images that can hinder learning, particularly in high-stakes medical applications. We propose GeMix, a two-stage framework that replaces heuristic blending with a learned, label-aware interpolation powered by class-conditional GANs. First, a StyleGAN2-ADA generator is trained on the target dataset. During augmentation, we sample two label vectors from Dirichlet priors biased toward different classes and blend them via a Beta-distributed coefficient. Then, we condition the generator on this soft label to synthesize visually coherent images that lie along a continuous class manifold. When combined with real data, our method increases macro-F1 over traditional mixup for all backbones, reducing the false negative rate for COVID-19 detection. GeMix is thus a drop-in replacement for pixel-space mixup, delivering stronger regularization and greater semantic fidelity, without disrupting existing training pipelines.
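The label-sampling step described in the abstract above (two Dirichlet draws biased toward different classes, blended by a Beta-distributed coefficient) might look roughly like the sketch below. It covers only the soft-label construction, not the StyleGAN2-ADA generator. The helper names, the concentration values `strong=10.0` and `weak=1.0`, and the `Beta(1, 1)` default are assumptions for illustration; the abstract does not specify them.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def biased_alpha(num_classes, favored, strong=10.0, weak=1.0):
    """Concentration vector that favors one class (values are assumed, not from the paper)."""
    return [strong if k == favored else weak for k in range(num_classes)]


def sample_soft_label(num_classes, class_a, class_b, beta_param=1.0, rng=None):
    """Blend two class-biased Dirichlet label vectors with a Beta-distributed weight."""
    rng = rng or random.Random()
    y_a = sample_dirichlet(biased_alpha(num_classes, class_a), rng)
    y_b = sample_dirichlet(biased_alpha(num_classes, class_b), rng)
    lam = rng.betavariate(beta_param, beta_param)  # mixing coefficient in [0, 1]
    # A convex combination of two probability vectors is again a probability vector.
    return [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]


# Example: a 3-class problem, mixing labels biased toward class 0 and class 2.
soft_label = sample_soft_label(3, class_a=0, class_b=2, rng=random.Random(0))
print([round(v, 3) for v in soft_label])
```

In the framework described, a vector like `soft_label` would then be passed to the class-conditional generator as its conditioning input; how that conditioning is implemented is not covered by the abstract.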