A Survey and Datasheet Repository of Publicly Available US Criminal Justice Datasets
Neural Information Processing Systems
Predictive tools are becoming widely used in police, court, and prison systems worldwide. Criminal justice is thus an increasingly important application domain for machine learning and algorithmic fairness. A few benchmark datasets, such as COMPAS [1], have received significant attention, but often without proper consideration of the domain context [2]. We conduct a survey of publicly available criminal justice datasets, highlight their potential uses, discuss their context, and identify limitations and gaps in the current landscape. We provide datasheets [3] for 15 datasets and make them available via a public repository. We compare the surveyed datasets across several dimensions, including size, population coverage, and potential use, and highlight possible concerns. We hope this work provides a useful starting point for researchers looking for appropriate datasets related to criminal justice, and we hope to grow the repository further through a broader community effort.
Mar-27-2025, 13:01:51 GMT