Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness
Hazirbas, Caner, Bang, Yejin, Yu, Tiezheng, Assar, Parisa, Porgali, Bilal, Albiero, Vítor, Hermanek, Stefan, Pan, Jacqueline, McReynolds, Emily, Bogen, Miranda, Fung, Pascale, Ferrer, Cristian Canton
arXiv.org Artificial Intelligence
Several recent studies [8, 41, 55, 67, 75] propose learning strategies that keep AI models well calibrated across all protected subgroups, while others focus on collecting datasets responsibly [57, 82, 124] so that AI models can be evaluated accurately and algorithmic bias can be measured while promoting data privacy. The design choices behind widely used public datasets, such as ImageNet [36, 38, 56, 70], have drawn much criticism, mostly over the collection of sensitive data about people without their consent. Casual Conversations v1 [57] was one of the first benchmarks designed with the permission of its participants. However, that dataset has several limitations: samples were collected only in the US, the gender label is limited to three options, and only the age and gender labels are self-provided with the permission of the participants.
Nov-10-2022
- Country:
  - South America
  - Oceania
    - New Zealand (0.04)
    - Australia > Victoria
      - Melbourne (0.04)
  - North America
    - Canada (0.04)
    - United States
      - Utah (0.04)
      - California (0.04)
      - Ohio (0.04)
      - New York > New York County
        - New York City (0.06)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
  - Europe
    - United Kingdom (0.14)
    - Netherlands (0.04)
    - Middle East > Malta (0.04)
    - Ireland (0.04)
    - Iceland (0.04)
    - Germany (0.04)
    - Denmark (0.04)
    - Austria (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Pakistan (0.04)
    - Nepal (0.04)
    - Middle East > Jordan (0.04)
    - India (0.04)
    - Singapore > Central Region
      - Singapore (0.04)
- Genre:
  - Overview (0.93)
  - Research Report (0.70)
- Industry:
  - Media (1.00)
  - Information Technology > Security & Privacy (1.00)
  - Law (0.93)
  - Health & Medicine > Therapeutic Area (0.68)
  - Government > Regional Government
- Technology:
  - Information Technology
    - Sensing and Signal Processing > Image Processing (1.00)
    - Communications > Social Media (1.00)
    - Artificial Intelligence
      - Vision > Face Recognition (1.00)
      - Natural Language (1.00)
      - Speech (0.69)
      - Machine Learning > Neural Networks
        - Deep Learning (0.93)
  - Information Technology