Towards Continual Expansion of Data Coverage: Automatic Text-guided Edge-case Synthesis
arXiv.org Artificial Intelligence
The performance of deep neural networks is strongly influenced by the quality of their training data. However, mitigating dataset bias by manually curating challenging edge cases remains a major bottleneck. To address this, we propose an automated pipeline for text-guided edge-case synthesis. Our approach employs a Large Language Model, fine-tuned via preference learning, to rephrase image captions into diverse textual prompts that steer a Text-to-Image model toward generating difficult visual scenarios. Evaluated on the FishEye8K object detection benchmark, our method achieves superior robustness, surpassing both naive augmentation and manually engineered prompts. This work establishes a scalable framework that shifts data curation from manual effort to automated, targeted synthesis, offering a promising direction for developing more reliable and continuously improving AI systems. Code is available at https://github.com/gokyeongryeol/ATES.
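The data flow described in the abstract (caption, then rephrased prompts, then synthesized images) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `SyntheticSample`, `synthesize_edge_cases`, `toy_rephrase`, and `toy_generate` are hypothetical, the two toy functions merely stand in for the fine-tuned Large Language Model and the Text-to-Image model, and the preference-learning step and detector training are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SyntheticSample:
    source_caption: str  # caption of an existing training image
    prompt: str          # rephrased prompt handed to the image generator
    image: object        # whatever the generator returns (e.g. an image object)


def synthesize_edge_cases(
    captions: List[str],
    rephrase: Callable[[str, int], List[str]],  # stand-in for the fine-tuned LLM
    generate: Callable[[str], object],          # stand-in for the Text-to-Image model
    prompts_per_caption: int = 2,
) -> List[SyntheticSample]:
    """Rephrase each caption into several prompts and generate one image per prompt."""
    samples = []
    for caption in captions:
        for prompt in rephrase(caption, prompts_per_caption):
            samples.append(SyntheticSample(caption, prompt, generate(prompt)))
    return samples


# Toy stand-ins (assumptions for illustration only) so the sketch runs without any model.
def toy_rephrase(caption: str, n: int) -> List[str]:
    conditions = ["at night in heavy rain", "in dense fog at dusk", "under strong glare"]
    return [f"{caption}, {condition}" for condition in conditions[:n]]


def toy_generate(prompt: str) -> str:
    return f"<image for: {prompt}>"


if __name__ == "__main__":
    for sample in synthesize_edge_cases(
        ["a bus at a busy intersection"], toy_rephrase, toy_generate
    ):
        print(sample.prompt)
```

Passing the two models in as callables keeps the sketch independent of any particular LLM or image-generation library; the actual code is in the repository linked in the abstract.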
Oct-1-2025
- Country:
- Asia > South Korea > Seoul > Seoul (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Transportation > Ground > Road (0.68)
- Technology: