Datasheets for Datasets
Data plays a critical role in machine learning. Every machine learning model is trained and evaluated using data, quite often in the form of static datasets. The characteristics of these datasets fundamentally influence a model's behavior: a model is unlikely to perform well in the wild if its deployment context does not match its training or evaluation datasets, or if these datasets reflect unwanted societal biases. Mismatches like this can have especially severe consequences when machine learning models are used in high-stakes domains, such as criminal justice [1, 13, 24], hiring [19], critical infrastructure [11, 21], and finance [18]. Even in other domains, mismatches may lead to loss of revenue or public relations setbacks.
Nov-20-2021, 06:55:26 GMT
- AI-Alerts:
- 2021 > 2021-11 > AAAI AI-Alert for Nov 23, 2021 (1.00)