AITopics | survey response

survey response

Questioning the Survey Responses of Large Language Models

Neural Information Processing Systems | Mar-20-2026, 13:54:54 GMT

Surveys have recently gained popularity as a tool to study large language models. By comparing models' survey responses to those of different human reference populations, researchers aim to infer the demographics, political opinions, or values best represented by current language models. In this work, we critically examine language models' survey responses on the basis of the well-established American Community Survey by the U.S. Census Bureau. Evaluating 43 different language models using de facto standard prompting methodologies, we establish two dominant patterns. First, models' responses are governed by ordering and labeling biases, for example, towards survey responses labeled with the letter "A".

artificial intelligence, large language model, proceedings, (4 more...)

Neural Information Processing Systems

Country: North America > United States (0.84)

Industry: Government > Regional Government > North America Government > United States Government (0.84)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.31)

515c62809e0a29729d7eec26e2916fc0-Paper-Conference.pdf

Neural Information Processing Systems | Feb-13-2026, 04:01:24 GMT

entropy, language model, survey question, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.94)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QSTN: A Modular Framework for Robust Questionnaire Inference with Large Language Models

Kreutner, Maximilian, Rupprecht, Jens, Ahnert, Georg, Salem, Ahmed, Strohmaier, Markus

arXiv.org Artificial Intelligence | Dec-10-2025

We introduce QSTN, an open-source Python framework for systematically generating responses from questionnaire-style prompts to support in-silico surveys and annotation tasks with large language models (LLMs). QSTN enables robust evaluation of questionnaire presentation, prompt perturbations, and response generation methods. Our extensive evaluation (more than 40 million survey responses) shows that question structure and response generation methods have a significant impact on the alignment of generated survey responses with human answers, and can be obtained for a fraction of the compute cost. In addition, we offer a no-code user interface that allows researchers to set up robust experiments with LLMs without coding knowledge. We hope that QSTN will support the reproducibility and reliability of LLM-based research in the future.

computational linguistic, large language model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2512.08646

Country:

Asia (1.00)
North America > United States (0.94)
North America > Mexico > Mexico City (0.14)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Prompt Perturbations Reveal Human-Like Biases in Large Language Model Survey Responses

Rupprecht, Jens, Ahnert, Georg, Strohmaier, Markus

arXiv.org Artificial Intelligence | Oct-17-2025

Large Language Models (LLMs) are increasingly used as proxies for human subjects in social science surveys, but their reliability and susceptibility to known human-like response biases, such as central tendency, opinion floating, and primacy bias, are poorly understood. This work investigates the response robustness of LLMs in normative survey contexts. We test nine LLMs on questions from the World Values Survey (WVS), applying a comprehensive set of ten perturbations to both question phrasing and answer option structure, resulting in over 167,000 simulated survey interviews. In doing so, we not only reveal LLMs' vulnerabilities to perturbations but also show that all tested models exhibit a consistent recency bias, disproportionately favoring the last-presented answer option. While larger models are generally more robust, all models remain sensitive to semantic variations like paraphrasing and to combined perturbations. This underscores the critical importance of prompt design and robustness testing when using LLMs to generate synthetic survey data.
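
One way to picture the answer-order perturbation mentioned in this abstract is the minimal sketch below. It is not the authors' code: the function names, the prompt layout, and the stand-in model `always_last` are all hypothetical and chosen only for illustration. The idea is to shuffle the answer options on every trial and count which position the model picks; a model with a recency bias would concentrate its picks on the last position regardless of content.

```python
import random
from collections import Counter


def shuffled_prompt(question, options, rng):
    """Build a prompt with options in random order; return it with the label-to-option map."""
    order = list(options)
    rng.shuffle(order)
    labels = [chr(ord("A") + i) for i in range(len(order))]
    lines = [question] + [f"{label}. {text}" for label, text in zip(labels, order)]
    return "\n".join(lines), dict(zip(labels, order))


def position_counts(ask, question, options, trials=200, seed=0):
    """Query `ask` repeatedly and tally the chosen position and the chosen option text.

    `ask` is a hypothetical callable that takes a prompt string and returns
    a single answer label such as "A".
    """
    rng = random.Random(seed)
    by_position, by_option = Counter(), Counter()
    for _ in range(trials):
        prompt, mapping = shuffled_prompt(question, options, rng)
        label = ask(prompt)
        by_position[ord(label) - ord("A")] += 1  # 0 = first option shown
        by_option[mapping[label]] += 1           # which content was chosen
    return by_position, by_option


def always_last(prompt):
    """Stand-in 'model' (an assumption for this demo) that always picks the last option."""
    return prompt.strip().splitlines()[-1][0]


by_position, by_option = position_counts(
    always_last, "How important is family?", ["Very", "Somewhat", "Not at all"], trials=50
)
```

With the stand-in model, every pick lands on the last position while the chosen content varies with the shuffle, which is the signature a position-driven response pattern would leave.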

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2507.07188

Country:

North America (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Simulating Persuasive Dialogues on Meat Reduction with Generative Agents

Ahnert, Georg, Wurth, Elena, Strohmaier, Markus, Mata, Jutta

arXiv.org Artificial Intelligence | Oct-14-2025

Meat reduction benefits human and planetary health, but social norms keep meat central in shared meals. To date, the development of communication strategies that promote meat reduction while minimizing social costs has required the costly involvement of human participants at each stage of the process. We present work in progress on simulating multi-round dialogues on meat reduction between Generative Agents based on large language models (LLMs). We measure our main outcome using established psychological questionnaires based on the Theory of Planned Behavior and additionally investigate Social Costs. We find evidence that our preliminary simulations produce outcomes that are (i) consistent with theoretical expectations; and (ii) valid when compared to data from previous studies with human participants. Generative agent-based models are a promising tool for identifying novel communication strategies on meat reduction -- tailored to highly specific participant groups -- to then be tested in subsequent studies with human participants.

large language model, meat consumption, natural language, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.36190/2025.30

2504.04872

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Health & Medicine > Consumer Health (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)

Questioning the Survey Responses of Large Language Models

Dominguez-Olmedo, Ricardo

Neural Information Processing Systems | Oct-10-2025, 02:26:44 GMT

Surveys have recently gained popularity as a tool to study large language models.

entropy, language model, survey question, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.94)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry: Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Retentive Relevance: Capturing Long-Term User Value in Recommendation Systems

Bakhshi, Saeideh, Nguyen, Phuong Mai, Schiller, Robert, Xu, Tiantian, Kodandapani, Pawan, Levine, Andrew, Simpson, Cayman, Wang, Qifan

arXiv.org Artificial Intelligence | Oct-10-2025

Recommendation systems have traditionally relied on short-term engagement signals, such as clicks and likes, to personalize content. However, these signals are often noisy, sparse, and insufficient for capturing long-term user satisfaction and retention. We introduce Retentive Relevance, a novel content-level survey-based feedback measure that directly assesses users' intent to return to the platform for similar content. Unlike other survey measures that focus on immediate satisfaction, Retentive Relevance targets forward-looking behavioral intentions, capturing longer-term user intentions and providing a stronger predictor of retention. We validate Retentive Relevance using psychometric methods, establishing its convergent, discriminant, and behavioral validity. Through large-scale offline modeling, we show that Retentive Relevance significantly outperforms both engagement signals and other survey measures in predicting next-day retention, especially for users with limited historical engagement. We develop a production-ready proxy model that integrates Retentive Relevance into the final stage of a multi-stage ranking system on a social media platform. Calibrated score adjustments based on this model yield substantial improvements in engagement and retention while reducing exposure to low-quality content, as demonstrated by large-scale A/B experiments. This work provides the first empirically validated framework linking content-level user perceptions to retention outcomes in production systems. We offer a scalable, user-centered solution that advances both platform growth and user experience. Our work has broad implications for responsible AI development.

artificial intelligence, proceedings, retentive relevance, (13 more...)

arXiv.org Artificial Intelligence

2510.07621

Country:

North America > United States > California (0.47)
North America > United States > New York > New York County > New York City (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Scalable and consistent few-shot classification of survey responses using text embeddings

Mjaaland, Jonas Timmann, Kreutzer, Markus Fleten, Tyseng, Halvor, Fussell, Rebeckah K., Passante, Gina, Holmes, N. G., Malthe-Sørenssen, Anders, Odden, Tor Ole B.

arXiv.org Artificial Intelligence | Aug-28-2025

Qualitative analysis of open-ended survey responses is a commonly used research method in the social sciences, but traditional coding approaches are often time-consuming and prone to inconsistency. Existing solutions from Natural Language Processing such as supervised classifiers, topic modeling techniques, and generative large language models have limited applicability in qualitative analysis, since they demand extensive labeled data, disrupt established qualitative workflows, and/or yield variable results. In this paper, we introduce a text embedding-based classification framework that requires only a handful of examples per category and fits well with standard qualitative workflows. When benchmarked against human analysis of a conceptual physics survey consisting of 2899 open-ended responses, our framework achieves a Cohen's Kappa ranging from 0.74 to 0.83 as compared to expert human coders in an exhaustive coding scheme. We further show how performance of this framework improves with fine-tuning of the text embedding model, and how the method can be used to audit previously analyzed datasets. These findings demonstrate that text embedding-assisted coding can flexibly scale to thousands of responses without sacrificing interpretability, opening avenues for deductive qualitative analysis at scale.
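
For readers unfamiliar with the agreement statistic reported in this abstract, the sketch below computes Cohen's kappa for two coders from its standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each coder's label frequencies. The function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal frequency, summed over categories.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined (0/0).
        # Returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example (made-up labels): human coder vs. automated classifier.
human = ["x", "x", "y", "y"]
model = ["x", "x", "y", "x"]
kappa = cohens_kappa(human, model)
```

In the toy example the coders agree on 3 of 4 items (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5.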

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.19836

Country:

North America > United States (0.46)
Europe (0.28)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Modeling User Behavior from Adaptive Surveys with Supplemental Context

Shukla, Aman, Scantlebury, Daniel Patrick, Kumar, Rishabh

arXiv.org Artificial Intelligence | Jul-29-2025

Modeling user behavior is critical across many industries where understanding preferences, intent, or decisions informs personalization, targeting, and strategic outcomes. Surveys have long served as a classical mechanism for collecting such behavioral data due to their interpretability, structure, and ease of deployment. However, surveys alone are inherently limited by user fatigue, incomplete responses, and practical constraints on their length, making them insufficient for capturing user behavior. In this work, we present LANTERN (Late-Attentive Network for Enriched Response Modeling), a modular architecture for modeling user behavior by fusing adaptive survey responses with supplemental contextual signals. We demonstrate the architectural value of maintaining survey primacy through selective gating, residual connections, and late fusion via cross-attention, treating survey data as the primary signal while incorporating external modalities only when relevant. LANTERN outperforms strong survey-only baselines in multi-label prediction of survey responses. We further investigate threshold sensitivity and the benefits of selective modality reliance through ablation and rare/frequent attribute analysis. LANTERN's modularity supports scalable integration of new encoders and evolving datasets. This work provides a practical and extensible blueprint for behavior modeling in survey-centric applications.

artificial intelligence, lantern, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.20919

Country: North America > United States (0.15)

Genre: Questionnaire & Opinion Survey (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.68)

Leveraging Interview-Informed LLMs to Model Survey Responses: Comparative Insights from AI-Generated and Human Data

Zhang, Jihong, Liang, Xinya, Deng, Anqi, Bonge, Nicole, Tan, Lin, Zhang, Ling, Zarrett, Nicole

arXiv.org Artificial Intelligence | May-29-2025

Mixed methods research integrates quantitative and qualitative data but faces challenges in aligning their distinct structures, particularly in examining measurement characteristics and individual response patterns. Advances in large language models (LLMs) offer promising solutions by generating synthetic survey responses informed by qualitative data. This study investigates whether LLMs, guided by personal interviews, can reliably predict human survey responses, using the Behavioral Regulations in Exercise Questionnaire (BREQ) and interviews from after-school program staff as a case study. Results indicate that LLMs capture overall response patterns but exhibit lower variability than humans. Incorporating interview data improves response diversity for some models (e.g., Claude, GPT), while well-crafted prompts and low-temperature settings enhance alignment between LLM and human responses. Demographic information had less impact than interview content on alignment accuracy. These findings underscore the potential of interview-informed LLMs to bridge qualitative and quantitative methodologies while revealing limitations in response variability, emotional interpretation, and psychometric fidelity. Future research should refine prompt design, explore bias mitigation, and optimize model settings to enhance the validity of LLM-generated survey data in social science research.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.21997

Country: North America > United States > Arkansas (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Personal > Interview (0.89)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
