Can Large Language Models Transform Computational Social Science?
Caleb Ziems, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, Diyi Yang
arXiv.org Artificial Intelligence
Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers' gold references. We conclude that the performance of today's LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
Dec-7-2023
- Country:
- South America > Colombia
- Meta Department > Villavicencio (0.04)
- Oceania > Australia
- Queensland (0.04)
- North America
- Dominican Republic (0.04)
- United States
- Maryland > Baltimore (0.04)
- Ohio (0.04)
- Texas > Travis County
- Austin (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Arizona > Maricopa County
- Phoenix (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Oregon > Multnomah County
- Portland (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Illinois > Cook County
- Chicago (0.04)
- California
- San Francisco County > San Francisco (0.14)
- Santa Clara County > Palo Alto (0.04)
- San Diego County > San Diego (0.04)
- Los Angeles County > Long Beach (0.04)
- New York > New York County
- New York City (0.14)
- Canada
- Quebec > Montreal (0.04)
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Europe
- Bulgaria (0.04)
- Germany > Berlin (0.04)
- Middle East > Malta
- Port Region > Southern Harbour District > Valletta (0.04)
- Italy > Tuscany
- Florence (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- France
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Auvergne-Rhône-Alpes > Lyon
- Lyon (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Ukraine > Kyiv Oblast
- Kyiv (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- United Kingdom
- Scotland > City of Glasgow
- Glasgow (0.04)
- England > Cambridgeshire
- Cambridge (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Afghanistan > Wardak Province (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- South Korea > Seoul
- Seoul (0.04)
- Middle East
- Japan
- Kyūshū & Okinawa > Kyūshū
- Miyazaki Prefecture > Miyazaki (0.04)
- Honshū > Kantō
- Tokyo Metropolis Prefecture > Tokyo (0.14)
- China
- Africa > Ethiopia
- Addis Ababa > Addis Ababa (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Overview (0.88)
- Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Information Technology (1.00)
- Law (1.00)
- Media > News (1.00)
- Education (0.92)
- Banking & Finance (0.92)
- Leisure & Entertainment (0.67)
- Government > Regional Government
- Health & Medicine > Therapeutic Area
- Immunology (0.92)
- Infections and Infectious Diseases (0.67)
- Psychiatry/Psychology > Mental Health (0.67)
- Technology: