GRAID: Synthetic Data Generation with Geometric Constraints and Multi-Agentic Reflection for Harmful Content Detection

Rad, Melissa Kazemi; Purpura, Alberto; Kumar, Himanshu; Chen, Emily; Sorower, Mohammad Shahed

Aug-26-2025 – arXiv.org Artificial Intelligence

We address the problem of data scarcity in harmful text classification for guardrailing applications and introduce GRAID (Geometric and Reflective AI-Driven Data Augmentation), a novel pipeline that leverages Large Language Models (LLMs) for dataset augmentation. GRAID consists of two stages: (i) generation of geometrically controlled examples using a constrained LLM, and (ii) augmentation through a multi-agentic reflective process that promotes stylistic diversity and uncovers edge cases. This combination enables both reliable coverage of the input space and nuanced exploration of harmful content. Using two benchmark datasets, we demonstrate that augmenting a harmful text classification dataset with GRAID leads to significant improvements in downstream guardrail model performance.
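The two-stage structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the geometric constraint, the agents, or any prompts, so everything below is an assumption made for illustration: the names `stage_one`, `stage_two`, `augment`, and the toy callables are hypothetical stand-ins, not the authors' implementation. In practice the callables would wrap LLM calls and an embedding-based constraint check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    text: str
    label: str  # e.g. "harmful" or "benign"


def stage_one(seeds: List[Example],
              generate: Callable[[Example], List[str]],
              accept: Callable[[str, Example], bool]) -> List[Example]:
    """Stage (i), sketched: keep generated candidates that pass a constraint check.

    `accept` is a placeholder for the geometric constraint, which the
    abstract does not define.
    """
    kept = []
    for seed in seeds:
        for candidate in generate(seed):
            if accept(candidate, seed):
                kept.append(Example(candidate, seed.label))
    return kept


def stage_two(examples: List[Example],
              rewrite: Callable[[Example], List[str]],
              critics: List[Callable[[str, Example], bool]],
              max_rounds: int = 2) -> List[Example]:
    """Stage (ii), sketched: rewrite each example, keep variants every critic approves."""
    kept = []
    for ex in examples:
        for _ in range(max_rounds):
            approved = [v for v in rewrite(ex) if all(c(v, ex) for c in critics)]
            if approved:
                kept.extend(Example(v, ex.label) for v in approved)
                break  # stop reflecting once some variant is accepted
    return kept


def augment(seeds, generate, accept, rewrite, critics) -> List[Example]:
    """Run both stages and return the seeds plus all accepted additions."""
    controlled = stage_one(seeds, generate, accept)
    reflected = stage_two(controlled, rewrite, critics)
    return seeds + controlled + reflected


# Toy stand-ins (assumptions) so the sketch runs without any LLM.
def toy_generate(seed: Example) -> List[str]:
    return [seed.text + " v1", seed.text + " variant two"]


def toy_accept(candidate: str, seed: Example) -> bool:
    # Placeholder constraint: candidate may be at most 3 characters longer.
    return len(candidate) - len(seed.text) <= 3


def toy_rewrite(ex: Example) -> List[str]:
    return [ex.text.lower(), ex.text.title()]


def toy_critic(variant: str, ex: Example) -> bool:
    # Placeholder critic: reject variants identical to the input.
    return variant != ex.text


seeds = [Example("bad text", "harmful")]
result = augment(seeds, toy_generate, toy_accept, toy_rewrite, [toy_critic])
```

Passing the generator, constraint, and critics in as plain callables keeps the control flow separate from any particular model or embedding choice, which the abstract leaves open.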

artificial intelligence, large language model, natural language

Country:
- North America > United States (0.46)

Genre:
- Research Report
  - New Finding (0.67)
  - Experimental Study (0.46)

Industry:
- Law Enforcement & Public Safety (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Law (0.94)
- Education (0.93)
- Health & Medicine > Therapeutic Area (0.68)
- Media > News (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Constraint-Based Reasoning (1.00)
  - Natural Language > Large Language Model (1.00)