Collaborating Authors

Burghardt, Keith


Trust and Terror: Hazards in Text Reveal Negatively Biased Credulity and Partisan Negativity Bias

arXiv.org Artificial Intelligence

Socio-linguistic indicators of text, such as emotion or sentiment, are often extracted using neural networks in order to better understand features of social media. One indicator that is often overlooked, however, is the presence of hazards within text. Recent psychological research suggests that statements about hazards are more believable than statements about benefits (a property known as negatively biased credulity), and that political liberals and conservatives differ in how often they share hazards. Here, we develop a new model to detect information concerning hazards, trained on a new collection of annotated X posts, as well as urban legends annotated in previous work. We show that not only does this model perform well (outperforming, e.g., zero-shot human annotator proxies, such as GPT-4) but that the hazard information it extracts is not strongly correlated with other indicators, namely moral outrage, sentiment, emotions, and threat words. (That said, consonant with expectations, hazard information does correlate positively with such emotions as fear, and negatively with emotions like joy.) We then apply this model to three datasets: X posts about COVID-19, X posts about the 2023 Hamas-Israel war, and a new expanded collection of urban legends. From these data, we uncover words associated with hazards unique to each dataset as well as differences in this language between groups of users, such as conservatives and liberals, which informs what these groups perceive as hazards. We further show that information about hazards peaks in frequency after major hazard events, and therefore acts as an automated indicator of such events. Finally, we find that information about hazards is especially prevalent in urban legends, which is consistent with previous work that finds that reports of hazards are more likely to be both believed and transmitted.
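
As a concrete illustration of the detection step, the sketch below fine-tunes a standard transformer classifier on binary hazard labels. The base model (bert-base-uncased), the toy examples, and the training settings are assumptions for illustration, not the paper's actual architecture or data.

```python
# Minimal sketch: fine-tune a binary "hazard" classifier on annotated posts.
# Model choice, examples, and label scheme are assumptions, not the paper's setup.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

texts = ["This chemical in tap water causes illness.", "Great weather today!"]
labels = [1, 0]  # 1 = post contains hazard information

enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")

class HazardDataset(torch.utils.data.Dataset):
    def __init__(self, enc, labels):
        self.enc, self.labels = enc, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="hazard-clf", num_train_epochs=1),
    train_dataset=HazardDataset(enc, labels),
)
trainer.train()
```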


Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames

arXiv.org Artificial Intelligence

Adversarial information operations can destabilize societies by undermining fair elections, manipulating public opinion on policies, and promoting scams. Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these limitations can be mitigated with large language models (LLMs), using GPT-3.5 as a case study for coordinated campaign annotation. We first use GPT-3.5 to scrutinize 126 identified information operations spanning over a decade. We utilize a number of metrics to quantify the close (if imperfect) agreement between LLM and ground truth descriptions. We next extract coordinated campaigns from two large multilingual datasets from X (formerly Twitter) that respectively discuss the 2022 French election and the 2023 Balikatan Philippine-U.S. military exercise. For each coordinated campaign, we use GPT-3.5 to analyze posts related to a specific concern and extract goals, tactics, and narrative frames, both before and after critical events (such as the date of an election). While GPT-3.5 sometimes disagrees with subjective interpretation, its ability to summarize and interpret demonstrates LLMs' potential to extract higher-order indicators from text, providing a more complete picture of information campaigns than previous methods.
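
The annotation step could look like the following sketch, which prompts GPT-3.5 through the OpenAI API to summarize a campaign's goals, tactics, and narrative frames. The prompt wording and example posts are assumptions, not the paper's exact protocol.

```python
# Minimal sketch: prompt GPT-3.5 to describe a coordinated campaign's goals,
# tactics, and narrative frames. Prompt and posts are invented for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

posts = [
    "Candidate X rigged the last election, share before it's deleted!",
    "Patriots, flood this hashtag: #StopCandidateX",
]

prompt = (
    "The following posts come from one coordinated campaign.\n"
    "Describe (1) its goals, (2) its tactics, (3) its narrative frames.\n\n"
    + "\n".join(f"- {p}" for p in posts)
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```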


SoMeR: Multi-View User Representation Learning for Social Media

arXiv.org Artificial Intelligence

User representation learning aims to capture user preferences, interests, and behaviors in low-dimensional vector representations. These representations have widespread applications in recommendation systems and advertising; however, existing methods typically rely on specific features like text content, activity patterns, or platform metadata, failing to holistically model user behavior across different modalities. To address this limitation, we propose SoMeR, a Social Media user Representation learning framework that incorporates temporal activities, text content, profile information, and network interactions to learn comprehensive user portraits. SoMeR encodes user post streams as sequences of timestamped textual features, uses transformers to embed these streams along with profile data, and jointly trains with link prediction and contrastive learning objectives to capture user similarity. We demonstrate SoMeR's versatility through two applications: 1) identifying inauthentic accounts involved in coordinated influence operations by detecting users posting similar content simultaneously, and 2) measuring increased polarization in online discussions after major events by quantifying how users with different beliefs moved farther apart in the embedding space. SoMeR's ability to holistically model users enables new solutions to important problems around disinformation, societal tensions, and online behavior understanding.
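
A minimal sketch of the joint training objective is below: a transformer encoder pools a user's post stream into an embedding, trained with a contrastive (InfoNCE) loss plus a link prediction loss. Dimensions, losses, and dummy inputs are assumptions; the real framework also encodes timestamps and profile features.

```python
# Minimal sketch of a SoMeR-style user encoder with contrastive + link
# prediction objectives. Shapes and data are invented for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UserEncoder(nn.Module):
    def __init__(self, feat_dim=64, emb_dim=32):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.proj = nn.Linear(feat_dim, emb_dim)

    def forward(self, post_seq):                 # (batch, seq_len, feat_dim)
        h = self.encoder(post_seq).mean(dim=1)   # pool over the post stream
        return F.normalize(self.proj(h), dim=-1)

enc = UserEncoder()
a, b = torch.randn(8, 10, 64), torch.randn(8, 10, 64)  # two views per user
za, zb = enc(a), enc(b)

# Contrastive (InfoNCE) loss: each user's two views should match each other.
logits = za @ zb.t() / 0.1
contrastive = F.cross_entropy(logits, torch.arange(8))

# Link prediction loss on observed edges (dummy labels here).
edge_labels = torch.randint(0, 2, (8,)).float()
link_scores = (za * zb).sum(dim=-1)
link = F.binary_cross_entropy_with_logits(link_scores, edge_labels)

(contrastive + link).backward()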


Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion

arXiv.org Artificial Intelligence

Detecting emotions expressed in text has become critical to a range of fields. In this work, we investigate ways to exploit label correlations in multi-label emotion recognition models to improve emotion detection. First, we develop two modeling approaches to the problem in order to capture word associations of the emotion words themselves, by either including the emotions in the input, or by leveraging Masked Language Modeling (MLM). Second, we integrate pairwise constraints of emotion representations as regularization terms alongside the classification loss of the models. We split these terms into two categories, local and global. The former dynamically change based on the gold labels, while the latter remain static during training. We demonstrate state-of-the-art performance across Spanish, English, and Arabic in SemEval 2018 Task 1 E-c using monolingual BERT-based models. On top of better performance, we also demonstrate improved robustness. Code is available at https://github.com/gchochla/Demux-MEmo.
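
A minimal sketch of one such "local" pairwise regularizer appears below: embeddings of emotions that co-occur in an example's gold labels are pulled together, and present/absent pairs are pushed apart. The specific constraint terms are assumptions standing in for the paper's exact formulation.

```python
# Minimal sketch of a local pairwise regularizer on emotion representations.
# The exact terms in the paper differ; this only illustrates the idea.
import torch
import torch.nn.functional as F

emotion_embs = torch.randn(11, 768, requires_grad=True)  # one vector per emotion
gold = torch.tensor([1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0])   # multi-hot gold labels

pos = gold.nonzero().squeeze(-1)      # emotions present in this example
neg = (gold == 0).nonzero().squeeze(-1)

# Local terms change with each example's gold labels.
sim = F.cosine_similarity
pull = sum(1 - sim(emotion_embs[i], emotion_embs[j], dim=0)
           for i in pos for j in pos if i < j)
push = sum(F.relu(sim(emotion_embs[i], emotion_embs[j], dim=0))
           for i in pos for j in neg)

reg_loss = pull + push  # added to the classification loss during training
reg_loss.backward()
```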


Using Emotion Embeddings to Transfer Knowledge Between Emotions, Languages, and Annotation Formats

arXiv.org Artificial Intelligence

The need for emotional inference from text continues to diversify as more and more disciplines integrate emotions into their theories and applications. These needs include inferring different emotion types, handling multiple languages, and handling different annotation formats. A shared model between different configurations would enable the sharing of knowledge, decrease training costs, and simplify the process of deploying emotion recognition models in novel environments. In this work, we study how we can build a single model that can transition between these different configurations by leveraging multilingual models and Demux, a transformer-based model whose input includes the emotions of interest, enabling us to dynamically change the emotions predicted by the model. Demux also produces emotion embeddings, and performing operations on them allows us to transition to clusters of emotions by pooling the embeddings of each cluster. We show that Demux can simultaneously transfer knowledge in a zero-shot manner to a new language, to a novel annotation format, and to unseen emotions. Code is available at https://github.com/gchochla/Demux-MEmo.
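
The cluster-pooling operation could look like the sketch below, which averages per-emotion embeddings into cluster embeddings that a shared head can then score. The cluster assignments and dimensions are invented for illustration.

```python
# Minimal sketch of pooling Demux-style emotion embeddings into clusters.
# Cluster membership and sizes here are assumptions, not the paper's setup.
import torch

emotions = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
emotion_embs = torch.randn(len(emotions), 768)  # from the model's emotion tokens

clusters = {"negative": ["anger", "disgust", "fear", "sadness"],
            "positive": ["joy"],
            "ambiguous": ["surprise"]}

idx = {e: i for i, e in enumerate(emotions)}
cluster_embs = {name: emotion_embs[[idx[e] for e in members]].mean(dim=0)
                for name, members in clusters.items()}

# A shared classification head can now score clusters instead of emotions.
head = torch.nn.Linear(768, 1)
scores = {name: head(emb).item() for name, emb in cluster_embs.items()}
print(scores)
```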


Data-Driven Estimation of Heterogeneous Treatment Effects

arXiv.org Artificial Intelligence

Estimating the effect of a treatment on an outcome is a fundamental problem in many fields, such as medicine [33, 34, 61], public policy [20], and others [2, 37]. For example, doctors might be interested in how a treatment, such as a drug, affects the recovery of patients [18], economists may be interested in how a job training program affects employment prospects [35], and advertisers may want to model the average effect an advertisement has on sales [36]. However, individuals may react differently to the treatment of interest, and knowing only the average treatment effect in the population is insufficient. For example, a drug may have adverse effects on some individuals but not others [61], or a person's education and background may affect how much they benefit from job training [35, 50]. Measuring the extent to which different individuals react differently to treatment is known as heterogeneous treatment effect (HTE) estimation. Traditionally, HTE estimation has been done through subgroup analysis [9, 19]. However, this can lead to cherry-picking, since the practitioner is the one who identifies the subgroups for estimating effects. Recently, there has been more focus on data-driven estimation of heterogeneous treatment effects, using machine learning techniques to let the data identify which features are important for treatment effect estimation [28, 39, 61, 69]. A straightforward approach is to create interaction terms between all covariates and use them in a regression [6].
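
One common instantiation of that interaction-term approach interacts the treatment indicator with each covariate, so the interaction coefficients capture how the effect varies across individuals. The sketch below runs this on synthetic data; all values are invented for illustration.

```python
# Minimal sketch of interaction-term regression for HTE estimation:
# regress the outcome on covariates, treatment, and treatment-covariate
# interactions. Data are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 2))          # covariates, e.g. age and education
t = rng.integers(0, 2, size=n)       # binary treatment
# True treatment effect grows with the first covariate.
y = X @ np.array([0.5, -0.3]) + t * (1.0 + 2.0 * X[:, 0]) + rng.normal(size=n)

design = np.column_stack([X, t, t[:, None] * X])  # covariates, t, interactions
model = LinearRegression().fit(design, y)

# Per-individual effect estimate: coefficient on t plus interaction terms.
tau_hat = model.coef_[2] + X @ model.coef_[3:]
print(tau_hat[:5])
```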


Inherent Trade-offs in the Fair Allocation of Treatments

arXiv.org Artificial Intelligence

Explicit and implicit biases cloud human judgement, leading to discriminatory treatment of minority groups. A fundamental goal of algorithmic fairness is to avoid the pitfalls in human judgement by learning policies that improve overall outcomes while providing fair treatment to protected classes. In this paper, we propose a causal framework that learns optimal intervention policies from data subject to fairness constraints. We define two measures of treatment bias and infer the best treatment assignment that minimizes the bias while optimizing the overall outcome. We demonstrate that there is a dilemma in balancing fairness and overall benefit; however, allowing preferential treatment of protected classes in certain circumstances (affirmative action) can dramatically improve the overall benefit while also preserving fairness. We apply our framework to data containing student outcomes on standardized tests and show how it can be used to design real-world policies that fairly improve student test scores. Our framework provides a principled way to learn fair treatment policies in real-world settings.
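
To make the fairness-benefit trade-off concrete, the sketch below greedily allocates a fixed treatment budget to maximize estimated benefit while capping the gap in treatment rates between two groups. The bias measure, the greedy heuristic, and the data are all assumptions, not the paper's definitions.

```python
# Minimal sketch of one fairness-benefit trade-off in treatment allocation.
# Bias measure (treatment-rate gap) and data are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n, budget, max_gap = 200, 50, 0.05
benefit = rng.normal(1.0, 0.5, size=n)   # estimated individual treatment effects
group = rng.integers(0, 2, size=n)       # protected attribute

# Greedy: treat highest-benefit individuals, skipping any pick that would
# push the between-group treatment-rate gap past max_gap.
order = np.argsort(-benefit)
treated = np.zeros(n, dtype=bool)
for i in order:
    if treated.sum() == budget:
        break
    treated[i] = True
    rates = [treated[group == g].mean() for g in (0, 1)]
    if abs(rates[0] - rates[1]) > max_gap:
        treated[i] = False  # undo: would violate the fairness constraint

print("total benefit:", benefit[treated].sum())
print("rate gap:", abs(treated[group == 0].mean() - treated[group == 1].mean()))
```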


Learning Fair and Interpretable Representations via Linear Orthogonalization

arXiv.org Machine Learning

To reduce human error and prejudice, many high-stakes decisions have been turned over to machine algorithms. However, recent research suggests that this does not remove discrimination, and can perpetuate harmful stereotypes. While algorithms have been developed to improve fairness, they typically face at least one of three shortcomings: they are not interpretable, they lose significant accuracy compared to unbiased equivalents, or they are not transferable across models. To address these issues, we propose a geometric method that removes correlations between data and any number of protected variables. Further, we can control the strength of debiasing through an adjustable parameter to address the trade-off between model accuracy and fairness. The resulting features are interpretable and can be used with many popular models, such as linear regression, random forests, and multilayer perceptrons. The resulting predictions are found to be more accurate and fair than those of several comparable fair AI algorithms across a variety of benchmark datasets. Our work shows that debiasing data is a simple and effective solution toward improving fairness.
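
A minimal sketch of the geometric idea appears below: regress the features on the protected variables and subtract the explained component, with a parameter lam controlling debiasing strength. The exact procedure and parameterization are assumptions matching the description above, not the paper's implementation.

```python
# Minimal sketch of debiasing by linear orthogonalization: remove the component
# of the features explained by the protected variables, scaled by lam.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))               # features
Z = X[:, :1] + rng.normal(size=(500, 1))    # protected variable, correlated with X

Xc = X - X.mean(axis=0)                     # center both matrices
Zc = Z - Z.mean(axis=0)

# Least-squares component of X explained by Z, removed in proportion lam.
beta = np.linalg.lstsq(Zc, Xc, rcond=None)[0]
lam = 1.0                                   # 1.0 = full debiasing, 0.0 = none
X_fair = Xc - lam * Zc @ beta

print(np.corrcoef(X_fair[:, 0], Zc[:, 0])[0, 1])  # ~0 when lam = 1
```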


Quantifying the Impact of Cognitive Biases in Question-Answering Systems

AAAI Conferences

Crowdsourcing can identify high-quality solutions to problems; however, individual decisions are constrained by cognitive biases. We investigate some of these biases in an experimental model of a question-answering system. We observe a strong position bias in favor of answers appearing earlier in a list of choices. This effect is enhanced by three cognitive factors: the attention an answer receives, its perceived popularity, and cognitive load, measured by the number of choices a user has to process. While separately weak, these effects synergistically amplify position bias and decouple users' choices of the best answers from their intrinsic quality. We conclude by discussing novel ways these findings can be applied to substantially improve how high-quality answers are found in question-answering systems.
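
Position bias of this kind can be read off choice logs directly: tabulate how often the answer shown in each list position is selected, regardless of quality. The sketch below does this on simulated logs; the decay model of the bias is an assumption, not the experiment's fitted form.

```python
# Minimal sketch of measuring position bias from choice logs.
# Data are simulated under an assumed bias toward early positions.
import numpy as np

rng = np.random.default_rng(2)
n_questions, n_choices = 5000, 5

# Assumed model: selection probability decays with position, independent of quality.
position_weights = 1.0 / np.arange(1, n_choices + 1)
p = position_weights / position_weights.sum()
picks = rng.choice(n_choices, size=n_questions, p=p)

observed = np.bincount(picks, minlength=n_choices) / n_questions
for pos, rate in enumerate(observed, start=1):
    print(f"position {pos}: chosen {rate:.1%} of the time")
```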


On Quitting: Performance and Practice in Online Game Play

AAAI Conferences

We study the relationship between performance and practice by analyzing the activity of many players of a casual online game. We find significant heterogeneity in the improvement of player performance, measured by score, and address this by dividing players into similar skill levels and segmenting each player's activity into sessions, that is, sequences of game rounds without an extended break. After disaggregating the data, we find that performance improves with practice across all skill levels. More interestingly, players are more likely to end their session after an especially large improvement, leading to a peak score in the very last game of a session. In addition, success is strongly correlated with a lower quitting rate when the score drops, and only weakly correlated with skill, in line with psychological findings about the value of persistence and "grit": successful players are those who persist in their practice despite lower scores. Finally, we train an ε-machine, a type of hidden Markov model, and find a plausible mechanism of game play that can predict both player performance and when players quit the game. Our work raises the possibility of real-time assessment and behavior prediction that can be used to optimize human performance.
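
The core quitting pattern can be checked with a simple conditional estimate: compare the quit rate right after a score improvement with the rate right after a decline. The sketch below does this on simulated sessions; the paper instead fits an ε-machine to real play sequences, so the quit probabilities here are assumptions.

```python
# Minimal sketch: estimate how the probability of ending a session depends on
# whether the last game improved the player's score. Data are simulated.
import numpy as np

rng = np.random.default_rng(3)
scores = rng.normal(size=2000)
improved = np.diff(scores) > 0
# Assumed behavior: quitting is more likely right after an improvement.
quit_prob = np.where(improved, 0.3, 0.1)
quits = rng.random(len(improved)) < quit_prob

for label, mask in [("after improvement", improved), ("after decline", ~improved)]:
    print(f"quit rate {label}: {quits[mask].mean():.2f}")
```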