Understanding the Detrimental Class-level Effects of Data Augmentation

Mark Ibrahim, Randall Balestriero, Diane Bouchacourt

May-28-2025, 21:59:04 GMT–Neural Information Processing Systems

Data augmentation (DA) encodes invariance and provides implicit regularization critical to a model's performance in image classification tasks. However, while DA improves average accuracy, recent studies have shown that its impact can be highly class-dependent: achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet. There has been little progress in resolving class-level accuracy drops due to a limited understanding of these effects. In this work, we present a framework for understanding how DA interacts with class-level learning dynamics. Using higher-quality multi-label annotations on ImageNet, we systematically categorize the affected classes and find that the majority are inherently ambiguous, co-occur, or involve fine-grained distinctions, while DA controls the model's bias towards one of the closely related classes. While many of the previously reported performance drops are explained by multi-label annotations, our analysis of class confusions reveals other sources of accuracy degradation. We show that simple class-conditional augmentation strategies informed by our framework improve performance on the negatively affected classes.
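To make the distinction between single-label and multi-label evaluation concrete, the following is a minimal sketch, not the authors' code. It assumes predictions from two hypothetical models (one trained with weaker and one with stronger augmentation) on a toy three-class problem, and a hypothetical set of plausible labels per image. All names and data here are illustrative assumptions.

```python
import numpy as np

def per_class_accuracy(preds, labels, num_classes):
    """Top-1 accuracy per class, grouped by the original single label."""
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            acc[c] = (preds[mask] == c).mean()
    return acc

def per_class_multilabel_accuracy(preds, labels, label_sets, num_classes):
    """Per-class accuracy where a prediction counts as correct if it is
    in the image's set of plausible labels (hypothetical annotations)."""
    correct = np.array([int(p) in s for p, s in zip(preds, label_sets)])
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            acc[c] = correct[mask].mean()
    return acc

# Toy data (assumed): classes 0 and 1 co-occur in two of the images.
labels = np.array([0, 0, 1, 1, 2, 2])
label_sets = [{0}, {0, 1}, {1}, {0, 1}, {2}, {2}]
preds_weak = np.array([0, 0, 1, 1, 2, 2])    # hypothetical weak-DA model
preds_strong = np.array([0, 1, 1, 0, 2, 2])  # hypothetical strong-DA model

# Accuracy drop per class when moving from weak to strong augmentation.
orig_drop = (per_class_accuracy(preds_weak, labels, 3)
             - per_class_accuracy(preds_strong, labels, 3))
multi_drop = (per_class_multilabel_accuracy(preds_weak, labels, label_sets, 3)
              - per_class_multilabel_accuracy(preds_strong, labels, label_sets, 3))
print("drop under original labels:", orig_drop)
print("drop under multi-label sets:", multi_drop)
```

In this toy setup, classes 0 and 1 each lose half their accuracy under the original labels, but the drop vanishes once any plausible label is accepted. That is the kind of apparent degradation the abstract attributes to multi-label annotations. Drops that persist under the multi-label metric would instead point to the other sources of degradation the abstract mentions.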


