
Exploring Self-Identified Counseling Expertise in Online Support Forums

Lahnala, Allison, Zhao, Yuntian, Welch, Charles, Kummerfeld, Jonathan K., An, Lawrence, Resnicow, Kenneth, Mihalcea, Rada, Pérez-Rosas, Verónica

arXiv.org Artificial Intelligence

A growing number of people engage in online health forums, making it important to understand the quality of the advice they receive. In this paper, we explore the role of expertise in responses provided to help-seeking posts regarding mental health. We study the differences between (1) interactions with peers; and (2) interactions with self-identified mental health professionals. First, we show that a classifier can distinguish between these two groups, indicating that their language use does in fact differ. To understand this difference, we perform several analyses addressing engagement aspects, including whether their comments engage the support-seeker further as well as linguistic aspects, such as dominant language and linguistic style matching. Our work contributes toward the developing efforts of understanding how health experts engage with health information- and support-seekers in social networks. More broadly, it is a step toward a deeper understanding of the styles of interactions that cultivate supportive engagement in online communities.


How good is GPT at writing political speeches for the White House?

Savoy, Jacques

arXiv.org Artificial Intelligence

Using large language models (LLMs), computers are able to generate a written text in response to a user request. As this pervasive technology can be applied in numerous contexts, this study analyses the written style of one LLM called GPT by comparing its generated speeches with those of the recent US presidents. To achieve this objective, the State of the Union (SOTU) addresses written by presidents from Reagan to Biden are contrasted to those produced by both GPT-3.5 and GPT-4o versions. Compared to US presidents, GPT tends to overuse the lemma "we" and produce shorter messages with, on average, longer sentences. Moreover, GPT opts for an optimistic tone, more often choosing political (e.g., president, Congress), symbolic (e.g., freedom), and abstract terms (e.g., freedom). Even when imposing an author's style on GPT, the resulting speech remains distinct from addresses written by the target author. Finally, the two GPT versions present distinct characteristics, but both appear overall dissimilar to true presidential messages.


Journalists, Emotions, and the Introduction of Generative AI Chatbots: A Large-Scale Analysis of Tweets Before and After the Launch of ChatGPT

Lewis, Seth C., Markowitz, David M., Bunquin, Jon Benedik

arXiv.org Artificial Intelligence

As part of a broader look at the impact of generative AI, this study investigated the emotional responses of journalists to the release of ChatGPT at the time of its launch. By analyzing nearly 1 million Tweets from journalists at major U.S. news outlets, we tracked changes in emotional tone and sentiment before and after the introduction of ChatGPT in November 2022. Using various computational and natural language processing techniques to measure emotional shifts in response to ChatGPT's release, we found an increase in positive emotion and a more favorable tone post-launch, suggesting initial optimism toward AI's potential. This research underscores the pivotal role of journalists as interpreters of technological innovation and disruption, highlighting how their emotional reactions may shape public narratives around emerging technologies. The study contributes to understanding the intersection of journalism, emotion, and AI, offering insights into the broader societal impact of generative AI tools.


Context is Important in Depressive Language: A Study of the Interaction Between the Sentiments and Linguistic Markers in Reddit Discussions

Sharma, Neha, Sirts, Kairit

arXiv.org Artificial Intelligence

Research exploring linguistic markers in individuals with depression has demonstrated that language usage can serve as an indicator of mental health. This study investigates the impact of discussion topic as context on linguistic markers and emotional expression in depression, using a Reddit dataset to explore interaction effects. Contrary to common findings, our sentiment analysis revealed a broader range of emotional intensity in depressed individuals, with both higher negative and positive sentiments than controls. This pattern was driven by posts containing no emotion words, revealing the limitations of lexicon-based approaches in capturing the full emotional context. We observed several interesting results demonstrating the importance of contextual analyses. For instance, the use of 1st person singular pronouns and words related to anger and sadness correlated with increased positive sentiments, whereas a higher rate of present-focused words was associated with more negative sentiments. Our findings highlight the importance of discussion contexts when interpreting the language used in depression, revealing that the emotional intensity and meaning of linguistic markers can vary based on the topic of discussion.


Can Social Ontological Knowledge Representations be Measured Using Machine Learning?

Izzidien, Ahmed

arXiv.org Artificial Intelligence

Personal Social Ontology (PSO), it is proposed, is how an individual perceives the ontological properties of terms. For example, an absolute fatalist would arguably use terms that remove any form of agency from a person. Such fatalism has the impact of ontologically defining acts such as winning, victory and success in a manner that is contrary to how a non-fatalist would ontologically define them. While both the said fatalist and non-fatalist would agree on the dictionary definition of these terms, they would differ on specifically how they can be brought about. This difference between the two individuals can be induced from their usage of these terms, i.e., the co-occurrence of these terms with other terms. As such, a quantification of this co-occurrence offers an avenue to characterise the social ontological views of the speaker. In this paper we ask, what specific term co-occurrence should be measured in order to obtain a valid and reliable psychometric measure of a person's social ontology? We consider the social psychology and social neuroscience literature to arrive at a list of social concepts that can be considered principal features of personal social ontology, and then propose an NLP pipeline to capture the articulation of these terms in language.


From Complexity to Clarity: How AI Enhances Perceptions of Scientists and the Public's Understanding of Science

Markowitz, David M.

arXiv.org Artificial Intelligence

This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public's understanding of science. By comparing lay summaries of journal articles from PNAS, yoked to those generated by AI, this work first assessed linguistic simplicity across such summaries and public perceptions. Study 1a analyzed simplicity features of PNAS abstracts (scientific summaries) and significance statements (lay summaries), observing that lay summaries were indeed linguistically simpler, but effect size differences were small. Study 1b used a large language model, GPT-4, to create significance statements based on paper abstracts and this more than doubled the average effect size without fine-tuning. Study 2 experimentally demonstrated that simply-written GPT summaries facilitated more favorable perceptions of scientists (they were perceived as more credible and trustworthy, but less intelligent) than more complexly-written human PNAS summaries. Crucially, Study 3 experimentally demonstrated that participants comprehended scientific writing better after reading simple GPT summaries compared to complex PNAS summaries. In their own words, participants also summarized scientific papers in a more detailed and concrete manner after reading GPT summaries compared to PNAS summaries of the same article. AI has the potential to engage scientific communities and the public via a simple language heuristic, advocating for its integration into scientific dissemination for a more informed society.


A Linguistic Comparison between Human and ChatGPT-Generated Conversations

Sandler, Morgan, Choung, Hyesun, Ross, Arun, David, Prabu

arXiv.org Artificial Intelligence

This study explores linguistic differences between human and LLM-generated dialogues, using 19.5K dialogues generated by ChatGPT-3.5 as a companion to the EmpathicDialogues dataset. The research employs Linguistic Inquiry and Word Count (LIWC) analysis, comparing ChatGPT-generated conversations with human conversations across 118 linguistic categories. Results show greater variability and authenticity in human dialogues, but ChatGPT excels in categories such as social processes, analytical style, cognition, attentional focus, and positive emotional tone, reinforcing recent findings of LLMs being "more human than human." However, no significant difference was found in positive or negative affect between ChatGPT and human dialogues. Classifier analysis of dialogue embeddings indicates implicit coding of the valence of affect despite no explicit mention of affect in the conversations. The research also contributes a novel, companion ChatGPT-generated dataset of conversations between two independent chatbots, which were designed to replicate a corpus of human conversations available for open access and used widely in AI research on language modeling. Our findings increase understanding of ChatGPT's linguistic capabilities and inform ongoing efforts to distinguish between human and LLM-generated text, which is critical in detecting AI-generated fakes, misinformation, and disinformation.


PsyAttention: Psychological Attention Model for Personality Detection

Zhang, Baohua, Huang, Yongyi, Cui, Wenyao, Zhang, Huaping, Shang, Jianyun

arXiv.org Artificial Intelligence

Work on personality detection has tended to incorporate psychological features from different personality models, such as BigFive and MBTI. There are more than 900 psychological features, each of which is helpful for personality detection. However, when used in combination, the application of different calculation standards among these features may result in interference between features calculated using distinct systems, thereby introducing noise and reducing performance. This paper adapts different psychological models in the proposed PsyAttention for personality detection, which can effectively encode psychological features, reducing their number by 85%. In experiments on the BigFive and MBTI models, PsyAttention achieved average accuracies of 65.66% and 86.30%, respectively, outperforming state-of-the-art methods, indicating that it is effective at encoding psychological features.


An Attention-Based Denoising Framework for Personality Detection in Social Media Texts

Tang, Qirui, Jiang, Wenkang, Du, Yihua, Lin, Lei

arXiv.org Artificial Intelligence

In social media networks, users produce a large amount of text content anytime, providing researchers with a valuable approach to digging for personality-related information. Personality detection based on user-generated texts is a universal method that can be used to build user portraits. The presence of noise in social media texts hinders personality detection. However, previous studies have not fully addressed this challenge. Inspired by the scanning reading technique, we propose an attention-based information extraction mechanism (AIEM) for long texts, which is applied to quickly locate valuable pieces of information, and focus more attention on the deep semantics of key pieces. Then, we provide a novel attention-based denoising framework (ADF) for personality detection tasks and achieve state-of-the-art performance on two commonly used datasets. Notably, we obtain an average accuracy improvement of 10.2% on the gold standard Twitter-Myers-Briggs Type Indicator (Twitter-MBTI) dataset. We made our code publicly available on GitHub. We shed light on how AIEM works to magnify personality-related signals.


Exploring Linguistic Style Matching in Online Communities: The Role of Social Context and Conversation Dynamics

Ananthasubramaniam, Aparna, Chen, Hong, Yan, Jason, Alkiek, Kenan, Pei, Jiaxin, Seth, Agrima, Dunagan, Lavinia, Choi, Minje, Litterer, Benjamin, Jurgens, David

arXiv.org Artificial Intelligence

Linguistic style matching (LSM) in conversations can be reflective of several aspects of social influence such as power or persuasion. However, how LSM relates to the outcomes of online communication on platforms such as Reddit remains an open question. In this study, we analyze a large corpus of two-party conversation threads in Reddit where we identify all occurrences of LSM using two types of style: the use of function words and formality. Using this framework, we examine how levels of LSM differ in conversations depending on several social factors within Reddit: post and subreddit features, conversation depth, user tenure, and the controversiality of a comment. Finally, we measure the change of LSM following loss of status after community banning. Our findings reveal the interplay of LSM in Reddit conversations with several community metrics, suggesting the importance of understanding conversation engagement when studying community dynamics.