Data Governance in the Age of Large-Scale Data-Driven Language Technology
Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Isaac Johnson, Dragomir Radev, Somaieh Nikpoor, Jörg Frohberg, Aaron Gokaslan, Peter Henderson, Rishi Bommasani, Margaret Mitchell
arXiv.org Artificial Intelligence
The recent emergence and adoption of Machine Learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data. This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights. Our proposal is informed by prior work on distributed governance that accounts for human values and is grounded in an international research collaboration that brings together researchers and practitioners from 60 countries. The framework we present is a multi-party international governance structure focused on language data that incorporates the technical and organizational tools needed to support its work.
Nov-2-2022
- Country:
- Africa (0.04)
- South America > Uruguay
- North America
- Dominican Republic (0.04)
- United States
- Tennessee (0.04)
- District of Columbia > Washington (0.04)
- Washington > King County
- Seattle (0.04)
- New York
- New York County > New York City (0.04)
- Tompkins County > Ithaca (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Connecticut > New Haven County
- New Haven (0.04)
- California
- San Francisco County > San Francisco (0.14)
- San Diego County > San Diego (0.04)
- Santa Clara County
- Canada
- Quebec > Montreal (0.04)
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Burnaby (0.04)
- Europe
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- Essex > Colchester (0.04)
- Cambridgeshire > Cambridge (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Germany > Saxony
- Leipzig (0.04)
- France > Île-de-France
- Denmark > Capital Region
- Copenhagen (0.04)
- Asia
- Japan (0.04)
- China (0.04)
- Singapore (0.04)
- East Asia (0.04)
- South Korea > Seoul
- Seoul (0.05)
- Middle East
- Genre:
- Overview (0.67)
- Research Report (0.64)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Media (0.92)
- Law
- Statutes (1.00)
- Civil Rights & Constitutional Law (1.00)
- Intellectual Property & Technology Law (0.93)
- Government > Regional Government
- Technology: