Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
Neural Information Processing Systems
Fine-grained visual classification (FGVC) involves classifying closely related subcategories. This task is inherently difficult due to the subtle differences between classes and the high intra-class variance. Moreover, FGVC datasets are typically small and hard to collect, which makes effective data augmentation especially important.

Recent advancements in text-to-image diffusion models have introduced new possibilities for data augmentation in image classification. While these models have been used to generate training data for classification tasks, their effectiveness in full-dataset training of FGVC models remains under-explored. Recent techniques that rely on text-to-image generation or Img2Img methods, such as SDEdit, often struggle to generate images that accurately represent the class while also modifying them enough to significantly increase the dataset's diversity.
May-26-2025, 19:18:10 GMT
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Machine Learning (1.00)