We Need to Measure Data Diversity in NLP -- Better and Broader
arXiv.org Artificial Intelligence
Although diversity in NLP datasets has received growing attention, the question of how to measure it remains largely underexplored. This opinion paper examines the conceptual and methodological challenges of measuring data diversity and argues that interdisciplinary perspectives are essential for developing more fine-grained and valid measures.
Sep-23-2025
- Country:
- Asia
- Europe
- Denmark > North Jutland
- Aalborg (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Netherlands (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Austria > Vienna (0.14)
- Spain > Galicia
- A Coruña Province > Santiago de Compostela (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- Dominican Republic (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.05)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Washington > King County
- Seattle (0.04)
- Oceania > Australia
- South America > Chile
- Genre:
- Research Report (0.40)
- Industry:
- Education (0.46)
- Technology: