Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias
Yue Yu
–Neural Information Processing Systems
Large language models (LLMs) have recently been leveraged as training data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models using generated data, it generally relies on simple class-conditional prompts, which may limit the diversity of the generated data and inherit the systematic biases of the LLM. Thus, we investigate training data generation with diversely attributed prompts (e.g.,
Oct-9-2025, 04:43:35 GMT
- Country:
  - Africa (0.04)
  - South America (0.04)
  - Oceania > New Zealand (0.04)
  - Europe > Germany (0.04)
  - North America
    - Mexico (0.04)
    - United States
      - District of Columbia > Washington (0.04)
      - Illinois (0.04)
      - Colorado (0.04)
  - Asia > Japan
    - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Genre:
  - Research Report > New Finding (0.92)
  - Personal (0.67)
- Industry:
  - Law (1.00)
  - Banking & Finance > Economy (1.00)
  - Education (1.00)
  - Consumer Products & Services > Restaurants (0.68)
  - Social Sector (0.67)
  - Energy > Renewable (0.67)
  - Leisure & Entertainment
    - Games > Computer Games (1.00)
    - Sports > Baseball (0.67)
  - Information Technology
    - Security & Privacy (1.00)
    - Services (0.67)
  - Health & Medicine
    - Therapeutic Area (1.00)
    - Pharmaceuticals & Biotechnology (1.00)
    - Consumer Health (1.00)
  - Government
  - Media
- Technology: