social science
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Portugal > Porto > Porto (0.04)
- (3 more...)
Knowing Your Uncertainty -- On the application of LLM in social sciences
Zhang, Bolun, Li, Linzhuo, Chen, Yunqi, Zhao, Qinlin, Zhu, Zihan, Yi, Xiaoyuan, Xie, Xing
Large language models (LLMs) are rapidly being integrated into computational social science research, yet their blackboxed training and designed stochastic elements in inference pose unique challenges for scientific inquiry. This article argues that applying LLMs to social scientific tasks requires explicit assessment of uncertainty-an expectation long established in both quantitative methodology in the social sciences and machine learning. We introduce a unified framework for evaluating LLM uncertainty along two dimensions: the task type (T), which distinguishes between classification, short-form, and long-form generation, and the validation type (V), which captures the availability of reference data or evaluative criteria. Drawing from both computer science and social science literature, we map existing uncertainty quantification (UQ) methods to this T-V typology and offer practical recommendations for researchers. Our framework provides both a methodological safeguard and a practical guide for integrating LLMs into rigorous social science research.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > France (0.04)
- Asia > Singapore (0.04)
- (3 more...)
Interview with Frida Hartman: Studying bias in AI-based recruitment tools
In a new series of interviews, we're meeting some of the PhD students that were selected to take part in the Doctoral Consortium at the European Conference on Artificial Intelligence (ECAI-2025) . In the second interview of the series, we caught up with Frida Hartman to find out how her PhD is going so far, and plans for the next steps in her investigations. Frida, along with co-authors Mario Mirabile and Michele Dusi, was also the winner of the ECAI-2025 Diversity & Inclusion Competition, for work entitled . This award was presented at the closing ceremony of the conference. Could start by giving us a quick introduction to yourself and the topic that you're working on?
Understanding nature and nurture: Statistical and AI innovations uncover how genes and environment shape human health Science
What makes us who we are? Is it our DNA, passed down through generations, or the environment that shapes our lives? This question--how nature and nurture combine to influence health and behavior--has long captured my curiosity. As I grew up in a multigenerational household, I was struck by the story of my two uncles, identical twins who were genetically indistinguishable but who lived out very different health journeys. One developed severe cardiovascular disease by his early forties; the other stayed healthy into his sixties. What separated them was not biology--it was environment.
Recommendations and Reporting Checklist for Rigorous & Transparent Human Baselines in Model Evaluations
Wei, Kevin L., Paskov, Patricia, Dev, Sunishchal, Byun, Michael J., Reuel, Anka, Roberts-Gaal, Xavier, Calcott, Rachel, Coxon, Evie, Deshpande, Chinmay
In this position paper, we argue that human baselines in foundation model evaluations must be more rigorous and more transparent to enable meaningful comparisons of human vs. AI performance, and we provide recommendations and a reporting checklist towards this end. Human performance baselines are vital for the machine learning community, downstream users, and policymakers to interpret AI evaluations. Models are often claimed to achieve "super-human" performance, but existing baselining methods are neither sufficiently rigorous nor sufficiently well-documented to robustly measure and assess performance differences. Based on a meta-review of the measurement theory and AI evaluation literatures, we derive a framework with recommendations for designing, executing, and reporting human baselines. We synthesize our recommendations into a checklist that we use to systematically review 115 human baselines (studies) in foundation model evaluations and thus identify shortcomings in existing baselining methods; our checklist can also assist researchers in conducting human baselines and reporting results. We hope our work can advance more rigorous AI evaluation practices that can better serve both the research community and policymakers. Data is available at: https://github.com/kevinlwei/human-baselines
- Europe > Austria > Vienna (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- (29 more...)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (0.93)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- (6 more...)
Artificially intelligent agents in the social and behavioral sciences: A history and outlook
Holme, Petter, Tsvetkova, Milena
We review the historical development and current trends of artificially intelligent agents (agentic AI) in the social and behavioral sciences: from the first programmable computers, and social simulations soon thereafter, to today's experiments with large language models. This overview emphasizes the role of AI in the scientific process and the changes brought about, both through technological advancements and the broader evolution of science from around 1950 to the present. Some of the specific points we cover include: the challenges of presenting the first social simulation studies to a world unaware of computers, the rise of social systems science, intelligent game theoretic agents, the age of big data and the epistemic upheaval in its wake, and the current enthusiasm around applications of generative AI, and many other topics. A pervasive theme is how deeply entwined we are with the technologies we use to understand ourselves.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (20 more...)
- Overview (1.00)
- Research Report > Experimental Study (0.67)
- Information Technology (1.00)
- Government (1.00)
- Health & Medicine (0.93)
- (2 more...)
Evaluating the Use of Large Language Models as Synthetic Social Agents in Social Science Research
Large Language Models (LLMs) are being increasingly used as synthetic agents in social science, in applications ranging from augmenting survey responses to powering multi-agent simulations. This paper outlines cautions that should be taken when interpreting LLM outputs and proposes a pragmatic reframing for the social sciences in which LLMs are used as high-capacity pattern matchers for quasi-predictive interpolation under explicit scope conditions and not as substitutes for probabilistic inference. Practical guardrails such as independent draws, preregistered human baselines, reliability-aware validation, and subgroup calibration, are introduced so that researchers may engage in useful prototyping and forecasting while avoiding category errors.
- North America > United States (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
The Emergence of Social Science of Large Language Models
The social science of large language models (LLMs) examines how these systems evoke mind attributions, interact with one another, and transform human activity and institutions. We conducted a systematic review of 270 studies, combining text embeddings, unsupervised clustering and topic modeling to build a computational taxonomy. Three domains emerge organically across the reviewed literature. LLM as Social Minds examines whether and when models display behaviors that elicit attributions of cognition, morality and bias, while addressing challenges such as test leakage and surface cues. LLM Societies examines multi-agent settings where interaction protocols, architectures and mechanism design shape coordination, norms, institutions and collective epistemic processes. LLM-Human Interactions examines how LLMs reshape tasks, learning, trust, work and governance, and how risks arise at the human-AI interface. This taxonomy provides a reproducible map of a fragmented field, clarifies evidentiary standards across levels of analysis, and highlights opportunities for cumulative progress in the social science of artificial intelligence.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
- Health & Medicine > Therapeutic Area (0.46)
- Education > Curriculum (0.46)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)
Beyond Static Responses: Multi-Agent LLM Systems as a New Paradigm for Social Science Research
Haase, Jennifer, Pokutta, Sebastian
As large language models (LLMs) transition from static tools to fully agentic systems, their potential for transforming social science research has become increasingly evident. This paper introduces a structured framework for understanding the diverse applications of LLM-based agents, ranging from simple data processors to complex, multi-agent systems capable of simulating emergent social dynamics. By mapping this developmental continuum across six levels, the paper clarifies the technical and methodological boundaries between different agentic architectures, providing a comprehensive overview of current capabilities and future potential. It highlights how lower-tier systems streamline conventional tasks like text classification and data annotation, while higher-tier systems enable novel forms of inquiry, including the study of group dynamics, norm formation, and large-scale social processes. However, these advancements also introduce significant challenges, including issues of reproducibility, ethical oversight, and the risk of emergent biases. The paper critically examines these concerns, emphasizing the need for robust validation protocols, interdisciplinary collaboration, and standardized evaluation metrics. It argues that while LLM-based agents hold transformative potential for the social sciences, realizing this promise will require careful, context-sensitive deployment and ongoing methodological refinement. The paper concludes with a call for future research that balances technical innovation with ethical responsibility, encouraging the development of agentic systems that not only replicate but also extend the frontiers of social science, offering new insights into the complexities of human behavior.
- Europe > Germany > Berlin (0.40)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Research Report > New Finding (0.92)
- Government (1.00)
- Education (1.00)
- Leisure & Entertainment > Games > Computer Games (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Too Open for Opinion? Embracing Open-Endedness in Large Language Models for Social Simulation
Ma, Bolei, Cao, Yong, Sen, Indira, Haensch, Anna-Carolina, Kreuter, Frauke, Plank, Barbara, Hershcovich, Daniel
Large Language Models (LLMs) are increasingly used to simulate public opinion and other social phenomena. Most current studies constrain these simulations to multiple-choice or short-answer formats for ease of scoring and comparison, but such closed designs overlook the inherently generative nature of LLMs. In this position paper, we argue that open-endedness, using free-form text that captures topics, viewpoints, and reasoning processes "in" LLMs, is essential for realistic social simulation. Drawing on decades of survey-methodology research and recent advances in NLP, we argue why this open-endedness is valuable in LLM social simulations, showing how it can improve measurement and design, support exploration of unanticipated views, and reduce researcher-imposed directive bias. It also captures expressiveness and individuality, aids in pretesting, and ultimately enhances methodological utility. We call for novel practices and evaluation frameworks that leverage rather than constrain the open-ended generative diversity of LLMs, creating synergies between NLP and social science.
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- North America > United States > California > Ventura County > Thousand Oaks (0.14)
- (20 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Health & Medicine (0.88)
- Government (0.67)
- Education (0.66)