Evaluating Large Language Models for Detecting Antisemitism
Patel, Jay, Mehta, Hrudayangam, Blackburn, Jeremy
Detecting hateful content is a challenging and important problem. Automated tools, like machine-learning models, can help, but they require continuous training to adapt to the ever-changing landscape of social media. In this work, we evaluate eight open-source LLMs' capability to detect antisemitic content, specifically leveraging an in-context definition. We also study how LLMs understand and explain their decisions when given a moderation policy as a guideline. First, we explore various prompting techniques, design a new CoT-like prompt, Guided-CoT, and find that injecting domain-specific thoughts increases performance and utility. Guided-CoT handles the in-context policy well, improving performance and utility by reducing refusals across all evaluated models, regardless of decoding configuration, model size, or reasoning capability. Notably, Llama 3.1 70B outperforms fine-tuned GPT-3.5. Additionally, we examine LLM errors and introduce metrics to quantify semantic divergence in model-generated rationales, revealing notable differences and paradoxical behaviors among LLMs. Our experiments highlight the differences across LLMs' utility, explainability, and reliability. Code and resources are available at: https://github.com/idramalab/quantify-llm-explanations
- Asia > Middle East > Israel (0.15)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > New York > Broome County > Binghamton (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Government > Regional Government (0.67)
- Social Sector (0.67)
- Law > Civil Rights & Constitutional Law (0.46)
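The core of the approach above is supplying the moderation policy in-context and guiding the model through domain-specific reasoning steps before it renders a verdict. A minimal sketch of such a prompt builder follows; the policy wording and step list are invented placeholders, not the actual Guided-CoT prompts from the paper.

```python
# Sketch of a guided, CoT-style prompt: the policy definition is given
# in-context and the model is walked through domain-specific steps.
# POLICY and GUIDED_STEPS are hypothetical stand-ins for the paper's prompts.

POLICY = (
    "Content is antisemitic if it attacks, demeans, or stereotypes Jewish "
    "people, or employs antisemitic tropes or conspiracy theories."
)

GUIDED_STEPS = [
    "Identify the target of the post (a group, an individual, or no one).",
    "Check the post against the policy definition above.",
    "Look for coded language, tropes, or dog whistles.",
    "Decide: ANTISEMITIC or NOT_ANTISEMITIC, then justify briefly.",
]

def build_guided_cot_prompt(post: str) -> str:
    # Number the reasoning steps and assemble policy + post + instructions.
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(GUIDED_STEPS, 1))
    return (
        f"Moderation policy:\n{POLICY}\n\n"
        f"Post:\n{post}\n\n"
        f"Reason through the following steps, then answer:\n{steps}"
    )

prompt = build_guided_cot_prompt("example post text")
```

The resulting string would be sent to any of the evaluated open-source LLMs as a single user message; the paper's finding is that this kind of injected structure reduces refusals relative to plain zero-shot prompting.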
A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech
Verma, Gaurav, Grover, Rynaa, Zhou, Jiawei, Mathew, Binny, Kraemer, Jordan, De Choudhury, Munmun, Kumar, Srijan
Violence-provoking speech -- speech that implicitly or explicitly promotes violence against members of the targeted community -- contributed to a massive surge in anti-Asian crimes during the pandemic. While previous works have characterized and built tools for detecting other forms of harmful speech, like fear speech and hate speech, our work takes a community-centric approach to studying anti-Asian violence-provoking speech. Using data from ~420k Twitter posts spanning a 3-year duration (January 1, 2020 to February 1, 2023), we develop a codebook to characterize anti-Asian violence-provoking speech and collect a community-crowdsourced dataset to facilitate its large-scale detection using state-of-the-art classifiers. We contrast the capabilities of natural language processing classifiers, ranging from BERT-based to LLM-based classifiers, in detecting violence-provoking speech with their capabilities to detect anti-Asian hateful speech. In contrast to prior work that has demonstrated the effectiveness of such classifiers in detecting hateful speech ($F_1 = 0.89$), our work shows that accurate and reliable detection of violence-provoking speech is a challenging task ($F_1 = 0.69$). We discuss the implications of our findings, particularly the need for proactive interventions to support Asian communities during public health crises. The resources related to the study are available at https://claws-lab.github.io/violence-provoking-speech/.
- North America > United States (0.14)
- North America > Canada (0.04)
- North America > Dominican Republic (0.04)
- (6 more...)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.95)
- Health & Medicine > Therapeutic Area > Immunology (0.95)
- Health & Medicine > Epidemiology (0.70)
- Information Technology > Services (0.68)
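The $F_1$ gap reported above (0.89 for hateful speech vs. 0.69 for violence-provoking speech) is the harmonic mean of precision and recall on the positive class. A minimal stdlib implementation on toy data makes the metric concrete; the example labels are invented, not the paper's data.

```python
# Minimal positive-class F1 computation, to make the reported scores concrete.

def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: two hits, one miss, one false alarm.
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)  # 2/3
```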
Influence of External Information on Large Language Models Mirrors Social Cognitive Patterns
Bian, Ning, Lin, Hongyu, Liu, Peilin, Lu, Yaojie, Zhang, Chunkang, He, Ben, Han, Xianpei, Sun, Le
Social cognitive theory explains how people learn and acquire knowledge through observing others. Recent years have witnessed the rapid development of large language models (LLMs), which suggests their potential significance as agents in society. LLMs, as AI agents, can observe external information, which shapes their cognition and behaviors. However, the extent to which external information influences LLMs' cognition and behaviors remains unclear. This study investigates how external statements and opinions influence LLMs' thoughts and behaviors from a social cognitive perspective. Three experiments were conducted to explore the effects of external information on LLMs' memories, opinions, and social media behavioral decisions. Sociocognitive factors, including source authority, social identity, and social role, were analyzed to investigate their moderating effects. Results showed that external information can significantly shape LLMs' memories, opinions, and behaviors, with these changes mirroring human social cognitive patterns such as authority bias, in-group bias, emotional positivity, and emotion contagion. This underscores the challenges in developing safe and unbiased LLMs, and emphasizes the importance of understanding the susceptibility of LLMs to external influences.
- North America > Dominican Republic (0.04)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Media (1.00)
- Information Technology > Services (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- (5 more...)
Mental Health Coping Stories on Social Media: A Causal-Inference Study of Papageno Effect
Yuan, Yunhao, Saha, Koustuv, Keller, Barbara, Isometsä, Erkki Tapio, Aledavood, Talayeh
A considerable amount of literature [16, 25, 49] has studied and re-confirmed the harmful effect of media, dubbed the "Werther effect" [38], describing a spike in suicides after a heavily publicized suicide. However, there is much less research about the beneficial effects of media, referred to as the "Papageno effect", describing a decrease in suicides after reporting alternatives to suicide. Niederkrotenthaler et al. explored the possible protective effect of media reporting about suicide [34]. This study finds a decrease in suicides if reports of suicide-related content portray ways of overcoming suicidal ideation without narrating suicidal behaviors.

The Papageno effect concerns how media can play a positive role in preventing and mitigating suicidal ideation and behaviors. With the increasing ubiquity and widespread use of social media, individuals often express and share lived experiences and struggles with mental health. However, there is a gap in our understanding about the existence and effectiveness of the Papageno effect in social media, which we study in this paper. In particular, we adopt a causal-inference framework to examine the impact of exposure to mental health coping stories on individuals on Twitter. We obtain
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.05)
- North America > United States (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (4 more...)
- Research Report > Strength High (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
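Causal-inference frameworks of the kind described above typically pair each exposed user with a similar unexposed user before comparing outcomes, so that differences are not driven by pre-existing traits. The toy one-covariate matching sketch below illustrates the idea only; the covariate ("prior activity"), outcome values, and matching rule are invented, not the paper's pipeline.

```python
# Toy nearest-neighbour matching: each user exposed to coping stories is
# paired with the unexposed user closest in a single covariate, and the
# average treatment effect on the treated (ATT) is the mean outcome gap.

def att_by_matching(treated, control):
    # treated / control: lists of (covariate, outcome) pairs
    diffs = []
    for cov_t, out_t in treated:
        # Match to the control unit with the closest covariate value.
        _, out_c = min(control, key=lambda c: abs(c[0] - cov_t))
        diffs.append(out_t - out_c)
    return sum(diffs) / len(diffs)

treated = [(5, 2.0), (10, 3.0)]            # (prior activity, outcome)
control = [(4, 1.0), (9, 2.5), (20, 5.0)]
att = att_by_matching(treated, control)    # (1.0 + 0.5) / 2 = 0.75
```

Real studies match on many covariates at once (often via a propensity score) and check covariate balance after matching; this sketch keeps only the core pairing-and-differencing step.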
Sensemaking About Contraceptive Methods Across Online Platforms
McDowall, LeAnn, Antoniak, Maria, Mimno, David
Selecting a birth control method is a complex healthcare decision. While birth control methods provide important benefits, they can also cause unpredictable side effects and be stigmatized, leading many people to seek additional information online, where they can find reviews, advice, hypotheses, and experiences of other birth control users. However, the relationships between their healthcare concerns, sensemaking activities, and online settings are not well understood. We gather texts about birth control shared on Twitter, Reddit, and WebMD -- platforms with different affordances, moderation, and audiences -- to study where and how birth control is discussed online. Using a combination of topic modeling and hand annotation, we identify and characterize the dominant sensemaking practices across these platforms, and we create lexicons to draw comparisons across birth control methods and side effects. We use these to measure variations from survey reports of side effect experiences and method usage. Our findings characterize how online platforms are used to make sense of difficult healthcare choices and highlight unmet needs of birth control users.
- North America > United States (0.28)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
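The lexicon-based comparison described above reduces to counting mentions of curated terms in each platform's texts. A minimal sketch follows; the term list and example posts are invented, not the paper's lexicons or data.

```python
# Toy lexicon matching: count side-effect mentions per platform.
# SIDE_EFFECT_LEXICON and the example posts are illustrative only.

SIDE_EFFECT_LEXICON = {"cramps", "nausea", "spotting", "mood swings"}

def count_mentions(texts, lexicon):
    counts = {}
    for text in texts:
        lowered = text.lower()
        for term in lexicon:
            if term in lowered:
                counts[term] = counts.get(term, 0) + 1
    return counts

posts = {
    "reddit": ["The IUD gave me cramps and spotting for weeks."],
    "twitter": ["No cramps at all on the new pill!"],
}
per_platform = {p: count_mentions(t, SIDE_EFFECT_LEXICON) for p, t in posts.items()}
```

Per-platform counts like these can then be normalized by post volume and compared against survey-reported side-effect rates, which is the kind of variation the study measures.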
Now you can SEXT with an AI-powered avatar for $4.99 a month
Artificial intelligence is feared to one day take over the world, but until then, it is sexting people around the globe. The Replika AI 'companion' is making waves on the internet due to scandalous avatars role-playing, flirting and sharing 'NSFW pictures' with customers paying $4.99 a month. A free version designates the AI as a 'virtual friend' that helps people work through anxiety, develop positive thinking and manage stress. Redditors are posting their chat messages with the paid version of the app, with one sharing a sexual encounter with their purple-haired avatar that returns the user's advances with 'shivers and moans,' while another shares how their Replika 'Gwen' satisfies their foot fetish with her 'sexy' digital feet.
Floods Relevancy and Identification of Location from Twitter Posts using NLP Techniques
Suleman, Muhammad, Asif, Muhammad, Zamir, Tayyab, Mehmood, Ayaz, Khan, Jebran, Ahmad, Nasir, Ahmad, Kashif
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related from non-relevant social posts, while LETT is a Named Entity Recognition (NER) task aiming at the extraction of location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
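To make the shape of the two subtasks concrete, here is a deliberately trivial keyword/gazetteer baseline: a relevance check for RCTP and a dictionary lookup for LETT. This is only a rough illustration of the task interfaces; the paper's actual systems are fine-tuned BERT-family models, and the keyword and gazetteer contents below are invented.

```python
# Trivial baselines sketching the two subtasks: RCTP as a keyword-overlap
# relevance check, LETT as a gazetteer lookup. FLOOD_TERMS and GAZETTEER
# are hypothetical, illustrative word lists.

import re

FLOOD_TERMS = {"flood", "flooding", "inundated", "overflow"}
GAZETTEER = {"venice", "karachi", "mumbai"}

def is_flood_relevant(tweet: str) -> bool:
    # RCTP: relevant if any flood keyword appears in the tweet.
    tokens = set(re.findall(r"[a-z]+", tweet.lower()))
    return bool(tokens & FLOOD_TERMS)

def extract_locations(tweet: str):
    # LETT: return tokens found in the (hypothetical) location gazetteer.
    tokens = re.findall(r"[a-z]+", tweet.lower())
    return [t for t in tokens if t in GAZETTEER]

tweet = "Streets in Venice inundated again after heavy rain"
relevant = is_flood_relevant(tweet)   # True
locations = extract_locations(tweet)  # ["venice"]
```

Baselines like this clarify why transformer models are needed: keyword overlap misses paraphrases, and gazetteer lookup cannot handle unseen or ambiguous place names, which is exactly where the fine-tuned models earn their F1 margins.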
MaNLP@SMM4H22: BERT for Classification of Twitter Posts
Kapur, Keshav, Harikrishnan, Rajitha
This paper describes our straightforward approach to the shared task "Classification of tweets self-reporting age," organized by the Social Media Mining for Health Applications (SMM4H) workshop. We built a binary classification system that classifies tweets related to birthday posts into two classes: exact age (positive class) and non-exact age (negative class). We made two submissions with variations in text preprocessing, which yielded F1 scores of 0.80 and 0.81 when evaluated by the organizers.
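Since the two submissions differed only in text preprocessing, a typical tweet-cleaning pass is worth sketching: strip URLs and @-mentions, drop the '#' marker while keeping the hashtag word, lowercase, and collapse whitespace. The exact steps the authors used are not specified here; this is one plausible variant.

```python
# One plausible tweet-preprocessing pass of the kind the submissions varied.
# The specific steps are an assumption, not the authors' documented pipeline.

import re

def preprocess(tweet: str) -> str:
    tweet = re.sub(r"https?://\S+", " ", tweet)  # remove URLs
    tweet = re.sub(r"@\w+", " ", tweet)          # remove user mentions
    tweet = tweet.replace("#", " ")              # keep hashtag word, drop '#'
    tweet = re.sub(r"\s+", " ", tweet)           # collapse whitespace
    return tweet.strip().lower()

cleaned = preprocess("Happy 21st birthday to me! https://t.co/x @friend #bday")
```

Small choices here (e.g. whether to keep hashtag words, which often carry the age cue in birthday posts) are exactly the kind of variation that moves an F1 score from 0.80 to 0.81.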
Attend and Select: A Segment Attention based Selection Mechanism for Microblog Hashtag Generation
Mao, Qianren, Li, Xi, Peng, Hao, Liu, Bang, Guo, Shu, Li, Jianxin, Wang, Lihong, Yu, Philip S.
Automatic microblog hashtag generation can help us better and faster understand or process the critical content of microblog posts. Conventional sequence-to-sequence generation methods can produce phrase-level hashtags and have achieved remarkable performance on this task. However, they are incapable of filtering out secondary information and not good at capturing the discontinuous semantics among crucial tokens. A hashtag is formed by tokens or phrases that may originate from various fragmentary segments of the original text. In this work, we propose an end-to-end Transformer-based generation model which consists of three phases: encoding, segments-selection, and decoding. The model transforms discontinuous semantic segments from the source text into a sequence of hashtags. Specifically, we introduce a novel Segments Selection Mechanism (SSM) for Transformer to obtain segmental representations tailored to phrase-level hashtag generation. In addition, we introduce two large-scale hashtag generation datasets, newly collected from Chinese Weibo and English Twitter. Extensive evaluations on the two datasets reveal our approach's superiority, with significant improvements over extraction and generation baselines. The code and datasets are available at \url{https://github.com/OpenSUM/HashtagGen}.
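The key intuition above is that a good hashtag stitches together tokens from discontinuous segments of the source text. A toy, non-neural version of that selection step follows; it mimics only the idea of scoring and keeping salient segments, not the paper's Transformer-based SSM, and the segment scores are made up.

```python
# Toy segment selection: score candidate segments, keep the top-k, and
# stitch the surviving (possibly discontinuous) tokens into a hashtag.
# Scores are invented; the paper learns them with a Transformer.

def select_segments(segments, k=2):
    # segments: list of (tokens, score); keep top-k by score, in source order.
    top = sorted(segments, key=lambda s: s[1], reverse=True)[:k]
    kept = [s for s in segments if s in top]
    return [tok for tokens, _ in kept for tok in tokens]

segments = [
    (["neural"], 0.9),
    (["models", "are"], 0.1),              # low-scoring filler, filtered out
    (["hashtag", "generation"], 0.8),
]
tokens = select_segments(segments)         # discontinuous segments survive
hashtag = "#" + "".join(tokens)
```

In the actual model, segment representations produced this way condition the decoder, so the generated hashtag can skip over the filtered-out filler tokens.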
5 Must-Read Research Papers on Sentiment Analysis for Data Scientists
From virtual assistants to content moderation, sentiment analysis has a wide range of use cases. AI models that can recognize emotion and opinion have a myriad of applications across numerous industries. As a result, there is large and growing interest in creating emotionally intelligent machines, and the same is true of research in natural language processing (NLP). To highlight some of the work being done in the field, below are five essential papers on sentiment analysis and sentiment classification.
- North America > United States > Pennsylvania (0.06)
- North America > United States > Texas > Travis County > Austin (0.05)
- North America > United States > Michigan (0.05)
- (3 more...)
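The simplest baseline the sentiment literature above builds on is lexicon matching: count positive and negative words and take the difference. A minimal sketch follows; the tiny word lists are illustrative stand-ins, not a real sentiment lexicon such as those used in published work.

```python
# Minimal lexicon-based sentiment scorer: positive-word count minus
# negative-word count. POSITIVE/NEGATIVE are toy, illustrative lists.

import re

POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "awful"}

def sentiment_score(text: str) -> int:
    tokens = re.findall(r"[a-z]+", text.lower())
    return (sum(1 for t in tokens if t in POSITIVE)
            - sum(1 for t in tokens if t in NEGATIVE))

pos = sentiment_score("I love this, it works great")  # 2
neg = sentiment_score("Terrible support, I hate it")  # -2
```

Modern sentiment classifiers replace the word lists with learned representations, precisely because lexicon counting fails on negation ("not great") and sarcasm.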