The Moral Foundations Weibo Corpus
Cao, Renjie, Hu, Miaoyan, Wei, Jiahan, Ihnaini, Baha
Moral sentiments expressed in natural language significantly influence both online and offline environments, shaping behavioral styles and interaction patterns, including social media self-presentation, cyberbullying, adherence to social norms, and ethical decision-making. To measure moral sentiments in text effectively, natural language processing requires large annotated datasets that provide the nuance needed for accurate analysis and model training. However, existing corpora, while valuable, often face linguistic limitations. To address this gap in the Chinese-language domain, we introduce the Moral Foundations Weibo Corpus. This corpus consists of 25,671 Chinese comments on Weibo, spanning six diverse topic areas. Each comment is manually annotated by at least three systematically trained annotators according to ten moral categories derived from a grounded theory of morality. To assess annotator reliability, we report kappa test results, a standard measure of inter-annotator consistency. Additionally, we apply several of the latest large language models to supplement the manual annotations, conducting analytical experiments to compare their performance and report baseline results for moral sentiment classification.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
- (4 more...)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.48)
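The kappa test mentioned in the abstract above measures agreement among the corpus's three-plus annotators per comment. As an illustration only (not the authors' evaluation code, and with illustrative label names), Fleiss' kappa for a fixed number of raters per item can be sketched as:

```python
from collections import Counter

def fleiss_kappa(annotations):
    """Fleiss' kappa for N items, each labeled by the same number of raters.

    annotations: list of per-item label lists,
    e.g. [["care", "care", "purity"], ["authority", "authority", "authority"]]
    """
    n = len(annotations[0])                      # raters per item
    assert all(len(item) == n for item in annotations)
    N = len(annotations)

    # Per-item observed agreement P_i and pooled category counts
    P_sum, totals = 0.0, Counter()
    for item in annotations:
        counts = Counter(item)
        totals.update(counts)
        P_sum += (sum(c * c for c in counts.values()) - n) / (n * (n - 1))

    P_bar = P_sum / N                            # mean observed agreement
    P_e = sum((t / (N * n)) ** 2 for t in totals.values())  # chance agreement
    return (P_bar - P_e) / (1 - P_e)
```

Perfect agreement yields kappa = 1, while agreement at chance level yields 0 and systematic disagreement goes negative.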
Whose Emotions and Moral Sentiments Do Language Models Reflect?
He, Zihao, Guo, Siyi, Rao, Ashwin, Lerman, Kristina
Language models (LMs) are known to represent the perspectives of some social groups better than others, which may impact their performance, especially on subjective tasks such as content moderation and hate speech detection. To explore how LMs represent different perspectives, existing research has focused on positional alignment, i.e., how closely the models mimic the opinions and stances of different groups, e.g., liberals or conservatives. However, human communication also encompasses emotional and moral dimensions. We define the problem of affective alignment, which measures how closely LMs' emotional and moral tone matches that of different groups. By comparing the affect of responses generated by 36 LMs to the affect of Twitter messages, we observe significant misalignment of LMs with both ideological groups. This misalignment is larger than the partisan divide in the U.S. Even after steering the LMs towards specific ideological perspectives, the misalignment and liberal tendencies of the models persist, suggesting a systemic bias within LMs.
- North America > United States (0.88)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)
- Asia > China > Hubei Province > Wuhan (0.04)
- Government (0.93)
- Information Technology > Services (0.88)
- Health & Medicine > Therapeutic Area > Immunology (0.75)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.52)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
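Affective alignment as described above amounts to comparing the distribution of affect labels in LM outputs against that of a group's messages. A minimal sketch, assuming a generic per-text emotion classifier (the `classify` callable here is hypothetical) and using Jensen-Shannon divergence as one possible distance (the paper's exact metric is not reproduced here):

```python
import math

def emotion_distribution(texts, classify):
    """Normalized frequency of emotion labels over a set of texts.

    `classify` is a stand-in for any per-text emotion classifier.
    """
    counts = {}
    for text in texts:
        label = classify(text)
        counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two label distributions.

    0.0 means identical affect profiles; 1.0 means fully disjoint ones.
    """
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a, b):
        return sum(a[k] * math.log2(a[k] / b[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Under this sketch, "misalignment" between an LM and an ideological group would be the divergence between the two distributions.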
D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation
Davani, Aida Mostafazadeh, Díaz, Mark, Baker, Dylan, Prabhakaran, Vinodkumar
While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within demographic groups may hold diverse values, which can influence their perceptions beyond their group norms. To effectively incorporate these considerations into NLP pipelines, we need datasets with extensive parallel annotations from various social and cultural groups. In this paper we introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4K annotators, balanced across gender and age, from 21 countries representing eight geo-cultural regions. The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity. Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values, offering crucial insights for building pluralistic, culturally sensitive NLP models.
- Asia > Middle East > Qatar (0.14)
- Asia > Middle East > UAE (0.14)
- Africa > Sub-Saharan Africa (0.05)
- (20 more...)
- Health & Medicine (0.46)
- Law > Civil Rights & Constitutional Law (0.46)
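Quantifying the regional variation described above can be sketched as grouping parallel annotations by region and measuring the spread of per-region mean ratings. The tuple schema and numeric ratings below are illustrative only, not the D3CODE release format:

```python
from collections import defaultdict
from statistics import mean, pstdev

def regional_disagreement(annotations):
    """Spread of mean offensiveness ratings across geo-cultural regions.

    annotations: iterable of (item_id, region, rating) tuples, where
    rating is a numeric offensiveness judgment (schema is hypothetical).
    Returns {item_id: population std-dev of per-region mean ratings};
    a larger value marks an item whose perceived offensiveness varies
    more across regions.
    """
    by_item = defaultdict(lambda: defaultdict(list))
    for item_id, region, rating in annotations:
        by_item[item_id][region].append(rating)
    return {
        item_id: pstdev([mean(ratings) for ratings in regions.values()])
        for item_id, regions in by_item.items()
    }
```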
The Moral Foundations Reddit Corpus
Trager, Jackson, Ziabari, Alireza S., Davani, Aida Mostafazadeh, Golazizian, Preni, Karimi-Malekabadi, Farzan, Omrani, Ali, Li, Zhihe, Kennedy, Brendan, Reimer, Nils Karl, Reyes, Melissa, Cheng, Kelsey, Wei, Mellow, Merrifield, Christina, Khosravi, Arta, Alvarez, Evans, Dehghani, Morteza
Moral framing and sentiment can affect a variety of online and offline behaviors, including donation, pro-environmental action, political engagement, and even participation in violent protests. Various computational methods in Natural Language Processing (NLP) have been used to detect moral sentiment from textual data, but to achieve better performance on such subjective tasks, large sets of hand-annotated training data are needed. Previous corpora annotated for moral sentiment have proven valuable and have generated new insights both within NLP and across the social sciences, but they have been limited to Twitter. To advance our understanding of the role of moral rhetoric, we present the Moral Foundations Reddit Corpus, a collection of 16,123 Reddit comments curated from 12 distinct subreddits, hand-annotated by at least three trained annotators for eight categories of moral sentiment (i.e., Care, Proportionality, Equality, Purity, Authority, Loyalty, Thin Morality, Implicit/Explicit Morality) based on the updated Moral Foundations Theory (MFT) framework. We use a range of methodologies to provide baseline moral-sentiment classification results for this new corpus, e.g., cross-domain classification and knowledge transfer.
- Europe > France (0.47)
- North America > United States > California (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Africa > Middle East > Djibouti > Arta > `Arta (0.04)
- Law (1.00)
- Information Technology > Services (1.00)
- Media > News (0.93)
- (3 more...)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.34)
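A minimal bag-of-words baseline of the kind such annotated corpora enable might look as follows. This multinomial Naive Bayes sketch (with add-one smoothing and illustrative training data) is a generic starting point, not a reproduction of the paper's actual baseline models:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesBaseline:
    """Multinomial Naive Bayes over whitespace tokens: a minimal
    single-label moral-sentiment classification baseline (illustrative)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        n_docs = sum(self.label_counts.values())
        V = len(self.vocab)
        best_label, best_lp = None, -math.inf
        for label, count in self.label_counts.items():
            lp = math.log(count / n_docs)            # log prior
            total = sum(self.word_counts[label].values())
            for w in text.lower().split():           # smoothed log likelihood
                lp += math.log((self.word_counts[label][w] + 1) / (total + V))
            if lp > best_lp:
                best_label, best_lp = label, lp
        return best_label
```

In practice the corpus's multi-label annotations would call for one-vs-rest variants or stronger models, but a baseline of this shape gives a floor against which cross-domain and transfer results can be compared.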
Identifying Morality Frames in Political Tweets using Relational Learning
Roy, Shamik, Pacheco, Maria Leonor, Goldwasser, Dan
Extracting moral sentiment from text is a vital component in understanding public opinion, social movements, and policy decisions. Moral Foundations Theory identifies five moral foundations, each associated with a positive and a negative polarity. However, moral sentiment is often motivated by its targets, which can correspond to individuals or collective entities. In this paper, we introduce morality frames, a representation framework for organizing moral attitudes directed at different entities, and present a novel, high-quality annotated dataset of tweets written by US politicians. We then propose a relational learning model to jointly predict moral attitudes towards entities and moral foundations. We conduct qualitative and quantitative evaluations, showing that moral sentiment towards entities differs substantially across political ideologies.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Beijing > Beijing (0.04)
- Europe > Germany > Berlin (0.04)
- (18 more...)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)