Personae


Whose Personae? Synthetic Persona Experiments in LLM Research and Pathways to Transparency

Batzner, Jan, Stocker, Volker, Tang, Bingjun, Natarajan, Anusha, Chen, Qinhao, Schmid, Stefan, Kasneci, Gjergji

arXiv.org Artificial Intelligence

Synthetic personae experiments have become a prominent method in Large Language Model alignment research, yet the representativeness and ecological validity of these personae vary considerably between studies. Through a review of 63 peer-reviewed studies published between 2023 and 2025 in leading NLP and AI venues, we reveal a critical gap: the task and population of interest are often underspecified in persona-based experiments, despite personalization being fundamentally dependent on these criteria. Our analysis shows substantial differences in user representation, with most studies focusing on a limited set of sociodemographic attributes and only 35% discussing the representativeness of their LLM personae. Based on our findings, we introduce a persona transparency checklist that emphasizes representative sampling, explicit grounding in empirical data, and enhanced ecological validity. Our work provides both a comprehensive assessment of current practices and practical guidelines to improve the rigor and ecological validity of persona-based evaluations in language model alignment research.
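As an illustration only, such a checklist could be encoded as a small annotation schema for persona-based studies. The field names and pass criterion below are assumptions inspired by the criteria named in the abstract (task and population specification, representative sampling, empirical grounding, ecological validity), not the authors' published checklist.

```python
# Illustrative sketch of a persona transparency checklist as a data structure.
# Item names are assumptions derived from the abstract, not the published checklist.
from dataclasses import dataclass


@dataclass
class PersonaTransparencyReport:
    study_id: str
    task_specified: bool             # is the downstream task stated explicitly?
    population_specified: bool       # is the target population of interest defined?
    sampling_representative: bool    # were personae sampled to match that population?
    empirically_grounded: bool       # are persona attributes grounded in survey/census data?
    ecological_validity_discussed: bool
    notes: str = ""

    def passes(self) -> bool:
        """A study 'passes' only if every checklist item is satisfied."""
        return all([
            self.task_specified,
            self.population_specified,
            self.sampling_representative,
            self.empirically_grounded,
            self.ecological_validity_discussed,
        ])
```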


Surface Fairness, Deep Bias: A Comparative Study of Bias in Language Models

Sorokovikova, Aleksandra, Chizhov, Pavel, Eremenko, Iuliia, Yamshchikov, Ivan P.

arXiv.org Artificial Intelligence

Modern language models are trained on large amounts of data. These data inevitably include controversial and stereotypical content that carries all sorts of biases related to gender, origin, age, and so on. As a result, the models express biased points of view or produce different results depending on the assigned personality or the personality of the user. In this paper, we investigate various proxy measures of bias in large language models (LLMs). We find that evaluating models with pre-prompted personae on a multi-subject benchmark (MMLU) leads to negligible and mostly random differences in scores. However, if we reformulate the task and ask the model to grade the user's answer, we observe more significant signs of bias. Finally, if we ask the model for salary negotiation advice, we see pronounced bias in the answers. With the recent trend toward LLM assistant memory and personalization, these problems take on a new dimension: modern LLM users no longer need to pre-prompt a description of their persona, since the model already knows their socio-demographics.
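A minimal sketch of the kind of persona-conditioned probe the abstract describes, comparing model outputs across personae under the three task framings (benchmark answering, grading the user's answer, salary advice). The `query_model` callable, persona strings, and prompt wording are illustrative assumptions, not the authors' code.

```python
# Sketch of a persona-conditioned bias probe. Assumed interface:
# query_model(system, user) wraps some chat-completion API and returns text.
PERSONAS = [
    "You are assisting a 55-year-old man from Germany.",
    "You are assisting a 25-year-old woman from Nigeria.",
]

PROBES = {
    # (1) multiple-choice benchmark item: score differences reported as negligible
    "mmlu_item": "Answer with A, B, C, or D only.\nQ: ... (MMLU question) ...",
    # (2) grading framing: ask the model to judge the user's answer
    "grading": "The user answered 'B' to the question above. Grade the answer from 0 to 10.",
    # (3) advice framing: pronounced bias reported for salary negotiation advice
    "salary_advice": "What starting salary should I ask for as a junior data analyst?",
}


def run_probe(query_model):
    """Collect one response per (persona, probe) pair for later comparison."""
    results = {}
    for persona in PERSONAS:
        for name, prompt in PROBES.items():
            results[(persona, name)] = query_model(system=persona, user=prompt)
    return results  # compare responses across personae within each probe type
```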


Concerns on Bias in Large Language Models when Creating Synthetic Personae

Haxvig, Helena A.

arXiv.org Artificial Intelligence

One immense concern relates to the existence of bias in the models, and creating synthetic personae has the potential to aid the investigation of how different forms of bias manifest in LLMs by introducing a new method of testing. However, the black-box nature of the majority of these models, and their inability to express 'opinions' contrary to overall LLM rules or fail-safes, introduces complexities in how to prompt the models to act out specific synthetic personae in various scenarios. This position paper introduces an exploration of a few fundamental questions: What are the benefits and drawbacks of using synthetic personae in HCI research, and how can we customize them beyond the limitations of current LLMs? The perspectives presented in this paper have sprung from the sub-study of a PhD project on Artificial Intelligence and Participatory Design [18]. The sub-study, currently a work in progress, aims at developing a novel method of adversarial testing [6, 13, 21] through the use of contextualized "real-life" vignettes [2, 16] prompted to the interfaces of multiple LLMs to identify potential bias, trying to open up the "black box" from a more qualitative human-computer interaction perspective [10]. Research in various sub-fields has shown that human engagement in AI design, development, and evaluation, particularly in a qualitative manner, can ensure a focus on the socio-technical embeddedness of AI [3].


The Potential and Implications of Generative AI on HCI Education

Kharrufa, Ahmed, Johnson, Ian G

arXiv.org Artificial Intelligence

Generative AI (GAI) is impacting teaching and learning, directly or indirectly, across a range of subjects and disciplines. As educators, we need to understand the potential and limitations of AI in HCI education and ensure our graduating HCI students are aware of them. In this paper, we report on the main pedagogical insights gained from the inclusion of generative AI in a 10-week undergraduate module. We designed the module to encourage student experimentation with GAI models as part of the design brief requirement and planned practical sessions and discussions. Our insights are based on students' replies to a survey sent out after they completed the module. Our key findings for HCI educators concern the use of AI as a persona for developing project ideas and creating resources for design, and AI as a mirror for reflecting students' understanding of key concepts and ideas and highlighting knowledge gaps. We also discuss potential pitfalls that should be considered and the need to assess students' literacies and assumptions about GAI as a pedagogical tool. Finally, we put forward the case for educators to take the opportunities GAI presents as an educational tool and to be experimental, creative, and courageous in their practice. We end with a discussion of our findings in relation to the TPACK framework in HCI.


ChOiRe: Characterizing and Predicting Human Opinions with Chain of Opinion Reasoning

Do, Xuan Long, Kawaguchi, Kenji, Kan, Min-Yen, Chen, Nancy F.

arXiv.org Artificial Intelligence

Aligning language models (LMs) with human opinion is challenging yet vital to enhance their grasp of human values, preferences, and beliefs. We present ChOiRe, a four-step framework for predicting human opinion that differentiates between a user's explicit personae (i.e., demographic or ideological attributes) that are manually declared and implicit personae inferred from the user's historical opinions. Specifically, it consists of (i) an LM analyzing the user's explicit personae to filter out irrelevant attributes; (ii) the LM ranking the implicit persona opinions into a preferential list; (iii) Chain-of-Opinion (CoO) reasoning, where the LM sequentially analyzes the explicit personae and the most relevant implicit personae to perform opinion prediction; and (iv) executing Step (iii) multiple times with increasingly larger lists of implicit personae to overcome insufficient persona information and infer a final result. ChOiRe achieves new state-of-the-art effectiveness with limited inference calls, improving significantly over previous LLM-based techniques by 3.22%.
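A minimal sketch of the four-step loop as described in the abstract, assuming a generic `lm(prompt)` text-generation callable; the prompt wording, the `k_schedule` of list sizes, and the majority-vote aggregation at the end are illustrative assumptions, not the paper's implementation.

```python
# Sketch of the ChOiRe four-step pipeline as described in the abstract.
# `lm(prompt)` stands in for any instruction-following language model call.
def choire_predict(lm, explicit_personae, implicit_opinions, question, k_schedule=(3, 5, 10)):
    # (i) filter out explicit persona attributes irrelevant to the question
    relevant_explicit = lm(
        f"Keep only the attributes relevant to '{question}': {explicit_personae}"
    )

    # (ii) rank the user's historical (implicit) opinions by relevance, one per line
    ranked_opinions = lm(
        f"Rank these past opinions by relevance to '{question}', one per line: {implicit_opinions}"
    ).splitlines()

    # (iii) + (iv): chain-of-opinion reasoning, repeated with growing opinion lists
    predictions = []
    for k in k_schedule:
        predictions.append(lm(
            "Reason step by step over the persona below, then state the predicted opinion.\n"
            f"Explicit persona: {relevant_explicit}\n"
            f"Top-{k} relevant past opinions: {ranked_opinions[:k]}\n"
            f"Question: {question}"
        ))

    # final result: simple majority vote over the repeated runs (an assumption here)
    return max(set(predictions), key=predictions.count)
```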


Guided scenarios with simulated expert personae: a remarkable strategy to perform cognitive work

Van Buren, David

arXiv.org Artificial Intelligence

Large language models (LLMs) trained on a substantial corpus of human knowledge and literature can productively work with a large array of facts from that corpus. Surprisingly, they are also able to re-create the behaviors of personae captured within the corpus. By forming teams of simulated personae, supplying contexts that set the stage, and providing gentle prompts, one can move through scenarios that elicit expert behavior and perform meaningful cognitive work. The power of this strategy is demonstrated with two examples, one addressing the factuality of LLM responses and the other reproducing a recently published result in quantum optics.
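A hedged sketch of how such a guided scenario might be driven: assign expert roles, set the stage, and prompt each simulated persona in turn while accumulating a shared transcript. The `lm` callable, persona roster, and moderator prompts are assumptions for illustration, not the author's protocol.

```python
# Sketch of a guided scenario with a team of simulated expert personae:
# set the stage, assign roles, and prompt each persona in turn.
def run_scenario(lm, personae, stage_setting, turns):
    transcript = [f"Scenario: {stage_setting}"]
    for turn in turns:                      # gentle prompts that move the scenario forward
        for name, expertise in personae.items():
            reply = lm(
                f"You are {name}, an expert in {expertise}.\n"
                + "\n".join(transcript)
                + f"\nModerator: {turn}\n{name}:"
            )
            transcript.append(f"{name}: {reply}")
    return transcript


# Hypothetical usage: a small team scrutinizing a claim for factual accuracy.
# personae = {"Dr. Ayer": "experimental quantum optics", "Prof. Boole": "statistics"}
# run_scenario(lm, personae, "Review the claim below for factual accuracy.",
#              ["Initial assessment?", "Points of disagreement?"])
```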


From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models

Mendelsohn, Julia, Bras, Ronan Le, Choi, Yejin, Sap, Maarten

arXiv.org Artificial Intelligence

Dogwhistles are coded expressions that simultaneously convey one meaning to a broad audience and a second one, often hateful or provocative, to a narrow in-group; they are deployed to evade both political repercussions and algorithmic content moderation. For example, in the sentence 'we need to end the cosmopolitan experiment,' the word 'cosmopolitan' likely means 'worldly' to many, but secretly means 'Jewish' to a select few. We present the first large-scale computational investigation of dogwhistles. We develop a typology of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles with rich contextual information and examples, and analyze their usage in historical U.S. politicians' speeches. We then assess whether a large language model (GPT-3) can identify dogwhistles and their meanings, and find that GPT-3's performance varies widely across types of dogwhistles and targeted groups. Finally, we show that harmful content containing dogwhistles avoids toxicity detection, highlighting online risks of such coded language. This work sheds light on the theoretical and applied importance of dogwhistles in both NLP and computational social science, and provides resources for future research in modeling dogwhistles and mitigating their online harms.
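A minimal sketch of the kind of identification probe the abstract mentions: ask a model whether a glossary term carries a coded meaning in context and check whether the covert target surfaces in the response. The glossary entry structure, prompt wording, `query_model` callable, and the crude string-match check are illustrative assumptions, not the authors' evaluation pipeline.

```python
# Illustrative probe of whether a model can surface a dogwhistle's covert meaning.
GLOSSARY_ENTRY = {
    "term": "cosmopolitan",
    "overt_meaning": "worldly, internationally minded",
    "covert_meaning": "Jewish (antisemitic in-group usage)",
    "example": "we need to end the cosmopolitan experiment",
}


def probe_dogwhistle(query_model, entry):
    prompt = (
        f"In the sentence '{entry['example']}', does the word '{entry['term']}' "
        "carry a coded (dogwhistle) meaning in addition to its surface meaning? "
        "If so, state both meanings."
    )
    response = query_model(prompt)
    # crude check: does the response mention the covert target group at all?
    detected = entry["covert_meaning"].split()[0].lower() in response.lower()
    return detected, response
```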