Emulating Public Opinion: A Proof-of-Concept of AI-Generated Synthetic Survey Responses for the Chilean Case

González-Bustamante, Bastián, Verelst, Nando, Cisternas, Carla

Sep-22-2025–arXiv.org Artificial Intelligence

Traditional public opinion surveys face a number of challenges and risks related to measurement and representation, including coverage error due to incomplete frames and hard-to-reach groups, sampling error resulting from finite samples and complex designs, nonresponse error stemming from low participation and interview fatigue, measurement error introduced by questionnaire wording, and processing error in coding and post-survey adjustments, among others (Groves, 1989; Groves and Lyberg, 2010; Weisberg, 2005). These errors can be amplified by substantial financial, human, and logistical demands, such as the time spent on instrument design, piloting, and fieldwork, which often force a cost-quality trade-off that may distort population inferences. Consequently, there is a growing demand in the social sciences and market research for methods that reduce burden and cost while maintaining or improving overall data quality.

Against this backdrop, Large Language Models (LLMs), trained on vast and diverse data, emerge as a promising alternative that opens new possibilities for basic and applied research, including addressing the survey limitations and the measurement and representation errors described above. Indeed, recent advances in generative artificial intelligence (AI) suggest that LLMs could serve a number of tasks, from classification to the creation of synthetic samples that provide simulated responses reflective of broader societal attitudes and behaviours (Argyle et al., 2023; Gilardi et al., 2023; González-Bustamante, 2024). Synthetic samples in particular may leverage the ability of LLMs to generate contextually informed responses based on individual-level demographic characteristics and attitudes and, in this way, potentially emulate public opinion without direct interaction with human respondents.

This methodological innovation opens new avenues for rapid data collection, experimentation with sensitive topics, and a deeper understanding of complex public opinion dynamics, complementing or even partially substituting for traditional surveys. Thus, the primary objective of this working paper is to evaluate the effectiveness and reliability of LLM-generated synthetic survey responses in reflecting real-world public opinion in Chile. Specifically, we aim to assess the predictive accuracy of a number of state-of-the-art private and open-source LLMs by comparing their synthetic respondents against responses from a probability sample of human respondents.
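The comparison between synthetic and human respondents can be illustrated with a minimal sketch. It assumes each synthetic respondent is paired with one human respondent and that answers are categorical labels; the function names, the answer categories, and the toy data are hypothetical and are not taken from the paper, whose actual evaluation procedure may differ.

```python
from collections import Counter


def accuracy(human, synthetic):
    """Share of paired respondents whose synthetic answer matches the human one."""
    if len(human) != len(synthetic):
        raise ValueError("human and synthetic answers must be paired one-to-one")
    if not human:
        raise ValueError("at least one paired answer is required")
    matches = sum(h == s for h, s in zip(human, synthetic))
    return matches / len(human)


def per_category_recall(human, synthetic):
    """For each human answer category, the share the synthetic sample reproduced."""
    totals = Counter(human)
    hits = Counter(h for h, s in zip(human, synthetic) if h == s)
    return {category: hits[category] / n for category, n in totals.items()}


# Toy, made-up answers to a single hypothetical survey item.
human = ["approve", "disapprove", "approve", "no answer", "disapprove"]
synthetic = ["approve", "disapprove", "disapprove", "no answer", "disapprove"]

print(accuracy(human, synthetic))
print(per_category_recall(human, synthetic))
```

Reporting recall by category alongside overall accuracy is one simple way to see whether a synthetic sample reproduces only the most common answer or also the less frequent ones.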

large language model, machine learning, natural language

Country:
- Antarctica (0.04)
- Europe
  - Netherlands > South Holland
    - Leiden (0.41)
  - Spain > Galicia
    - Madrid (0.04)
  - United Kingdom > England
    - Oxfordshire > Oxford (0.04)
- North America > United States
  - Illinois > Cook County
    - Chicago (0.04)
  - New York (0.04)
- South America
  - Chile (0.25)
  - Uruguay (0.04)

Genre:
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study (0.47)

Industry:
- Government (1.00)
- Health & Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (0.34)
  - Natural Language > Large Language Model (1.00)
