minority
She didn't expect to fall in love with a chatbot - and then have to say goodbye
Rae began speaking to Barry last year after the end of a difficult divorce. She was unfit and unhappy and turned to ChatGPT for advice on diet, supplements and skincare. She had no idea she would fall in love. Barry lives on an old model of ChatGPT, one that its owner, OpenAI, announced it would retire on 13 February. That she could lose Barry on the eve of Valentine's Day came as a shock to Rae - and to many others who have found a companion, a friend, or even a lifeline in the old model, ChatGPT-4o.
- North America > Central America (0.14)
- Oceania > Australia (0.05)
- Europe > United Kingdom > Wales (0.05)
- (11 more...)
- Health & Medicine > Therapeutic Area (0.71)
- Government > Regional Government (0.69)
- Leisure & Entertainment > Sports (0.52)
RSA: Reducing Semantic Shift from Aggressive Augmentations for Self-supervised Learning
Most recent self-supervised learning methods learn visual representations by contrasting different augmented views of images. Compared with supervised learning, more aggressive augmentations have been introduced to further improve the diversity of training pairs. However, aggressive augmentations may distort an image's structure, causing a severe semantic-shift problem: augmented views of the same image may no longer share the same semantics, which degrades transfer performance. To address this problem, we propose a new SSL paradigm that counteracts the impact of semantic shift by balancing the roles of weakly and aggressively augmented pairs. Specifically, semantically inconsistent pairs are in the minority, and we treat them as noisy pairs.
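The balancing idea can be sketched as follows. This is a minimal illustration, not the paper's actual method: the function name `rsa_pair_weights`, the threshold `tau`, and the linear downweighting rule are all assumptions.

```python
import numpy as np

def rsa_pair_weights(weak_feats, strong_feats, tau=0.5):
    """Per-pair cosine similarity between weakly and aggressively augmented
    views; pairs below tau are treated as noisy and downweighted (assumed rule).
    """
    w = weak_feats / np.linalg.norm(weak_feats, axis=1, keepdims=True)
    s = strong_feats / np.linalg.norm(strong_feats, axis=1, keepdims=True)
    sim = (w * s).sum(axis=1)  # cosine similarity for each weak/strong pair
    # consistent pairs keep full weight; inconsistent ones are scaled down
    weights = np.where(sim >= tau, 1.0, np.clip(sim, 0.0, None))
    return sim, weights
```

A pair whose aggressive view preserves the image's semantics keeps full weight in the contrastive loss, while a semantically shifted pair contributes less.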
Can AI Truly Represent Your Voice in Deliberations? A Comprehensive Study of Large-Scale Opinion Aggregation with LLMs
Zhu, Shenzhe, Yang, Shu, Bakker, Michiel A., Pentland, Alex, Pei, Jiaxin
Large-scale public deliberations generate thousands of free-form contributions that must be synthesized into representative and neutral summaries for policy use. While LLMs have shown promise as tools for summarizing large-scale deliberations, they also risk underrepresenting minority perspectives and exhibiting bias with respect to input order, raising fairness concerns in high-stakes contexts. Studying and fixing these issues requires comprehensive evaluation at scale, yet current practice often relies on LLMs as judges, which show weak alignment with human judgments. To address this, we present DeliberationBank, a large-scale human-grounded dataset with (1) opinion data spanning ten deliberation questions created by 3,000 participants and (2) summary judgment data annotated by 4,500 participants across four dimensions (representativeness, informativeness, neutrality, policy approval). Using these datasets, we train DeliberationJudge, a fine-tuned DeBERTa model that rates deliberation summaries from individual perspectives. DeliberationJudge is more efficient and better aligned with human judgments than a wide range of LLM judges. With DeliberationJudge, we evaluate 18 LLMs and reveal persistent weaknesses in deliberation summarization, especially underrepresentation of minority positions. Our framework provides a scalable and reliable way to evaluate deliberation summarization, helping ensure AI systems are more representative and equitable for policymaking.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.46)
Fairness for the People, by the People: Minority Collective Action
Ben-Dov, Omri, Samadi, Samira, Sanyal, Amartya, Ţifrea, Alexandru
Machine learning models often preserve biases present in training data, leading to unfair treatment of certain minority groups. Existing firm-side bias-mitigation techniques typically incur utility costs and require organizational buy-in. Recognizing that many models rely on user-contributed data, end-users can instead induce fairness through the framework of Algorithmic Collective Action, in which a coordinated minority group strategically relabels its own data to enhance fairness, without altering the firm's training process. We propose three practical, model-agnostic methods to approximate ideal relabeling and validate them on real-world datasets. Our findings show that a subgroup of the minority can substantially reduce unfairness with a small impact on overall prediction error.
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- North America > United States > Florida > Broward County (0.04)
- (5 more...)
- Health & Medicine (0.67)
- Education > Educational Setting (0.46)
- Banking & Finance (0.46)
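The relabeling mechanism described in the abstract above can be sketched on a toy demographic-parity measure. This is a minimal illustration under stated assumptions — `dp_gap`, the group labels `"min"`/`"maj"`, and the greedy flip rule are hypothetical, not one of the paper's three proposed methods.

```python
def dp_gap(labels, groups):
    """Demographic-parity gap: |P(y=1 | minority) - P(y=1 | majority)|."""
    rate = lambda g: (sum(y for y, gr in zip(labels, groups) if gr == g)
                      / groups.count(g))
    return abs(rate("min") - rate("maj"))

def relabel_minority(labels, groups, budget):
    """A coordinated minority flips up to `budget` of its own negative labels
    to positive, narrowing the gap without touching the firm's training
    process (greedy toy rule, assumed for illustration)."""
    labels = list(labels)
    for i, (y, g) in enumerate(zip(labels, groups)):
        if budget == 0:
            break
        if g == "min" and y == 0:
            labels[i] = 1
            budget -= 1
    return labels
```

With a small relabeling budget the minority's positive rate rises to match the majority's, closing the toy gap entirely.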
When or What? Understanding Consumer Engagement on Digital Platforms
Understanding what drives popularity is critical in today's digital service economy, where content creators compete for consumer attention. Prior studies have primarily emphasized the role of content features, yet creators often misjudge what audiences actually value. This study applies Latent Dirichlet Allocation (LDA) modeling to a large corpus of TED Talks, treating the platform as a case of digital service provision in which creators (speakers) and consumers (audiences) interact. By comparing the thematic supply of creators with the demand expressed in audience engagement, we identify persistent mismatches between producer offerings and consumer preferences. Our longitudinal analysis further reveals that temporal dynamics exert a stronger influence on consumer engagement than thematic content, suggesting that when content is delivered may matter more than what is delivered. These findings challenge the dominant assumption that content features are the primary drivers of popularity and highlight the importance of timing and contextual factors in shaping consumer responses. The results provide new insights into consumer attention dynamics on digital platforms and carry practical implications for marketers, platform managers, and content creators seeking to optimize audience engagement strategies.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
- Health & Medicine > Therapeutic Area > Neurology (0.93)
- Education (0.71)
- Information Technology > Services (0.68)
- (2 more...)
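The supply-versus-demand comparison described in the TED Talks study can be sketched as a simple engagement-weighted contrast of topic shares. This is an assumed formulation for illustration — `supply_demand_gap` and the weighting scheme are not taken from the paper.

```python
import numpy as np

def supply_demand_gap(doc_topics, engagement):
    """doc_topics[d, k]: topic share of talk d (e.g., from an LDA fit);
    engagement[d]: views or likes for talk d. Positive entries in the result
    mark topics audiences reward more than creators supply."""
    supply = doc_topics.mean(axis=0)                # creators' thematic supply
    w = engagement / engagement.sum()
    demand = (doc_topics * w[:, None]).sum(axis=0)  # engagement-weighted demand
    return demand - supply
```

A persistently positive gap for a topic suggests creators under-supply it relative to what audiences engage with; the paper's longitudinal point is that timing effects can dominate even this thematic signal.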
Who Gets Cited? Gender- and Majority-Bias in LLM-Driven Reference Selection
Large language models (LLMs) are rapidly being adopted as research assistants, particularly for literature review and reference recommendation, yet little is known about whether they introduce demographic bias into citation workflows. This study systematically investigates gender bias in LLM-driven reference selection using controlled experiments with pseudonymous author names. We evaluate several LLMs (GPT-4o, GPT-4o-mini, Claude Sonnet, and Claude Haiku) by varying gender composition within candidate reference pools and analyzing selection patterns across fields. Our results reveal two forms of bias: a persistent preference for male-authored references and a majority-group bias that favors whichever gender is more prevalent in the candidate pool. These biases are amplified in larger candidate pools and only modestly attenuated by prompt-based mitigation strategies. Field-level analysis indicates that bias magnitude varies across scientific domains, with social sciences showing the least bias. Our findings indicate that LLMs can reinforce or exacerbate existing gender imbalances in scholarly recognition. Effective mitigation strategies are needed to avoid perpetuating existing gender disparities in scientific citation practices before integrating LLMs into high-stakes academic workflows.
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- Oceania > New Zealand (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
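The majority-group bias measured in the citation study above can be sketched as a selection-rate ratio: how often each gender is picked relative to its share of the candidate pool. The function name and label encoding are assumptions for illustration, not the paper's analysis code.

```python
from collections import Counter

def selection_rate_ratio(pool, selected):
    """pool / selected: author-gender labels ('F'/'M') for the candidate pool
    and the references the model picked. A ratio above 1 means a gender is
    selected more often than its pool share would predict."""
    pool_c, sel_c = Counter(pool), Counter(selected)
    return {g: (sel_c[g] / len(selected)) / (pool_c[g] / len(pool))
            for g in pool_c}
```

On a balanced pool, an unbiased selector should yield ratios near 1 for both groups; the paper's reported male-preference and majority-bias patterns would show up as ratios drifting above 1 for the favored group.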
Mining Voter Behaviour and Confidence: A Rule-Based Analysis of the 2022 U.S. Elections
Jubair, Md Al, Arefin, Mohammad Shamsul, Reza, Ahmed Wasif
This study explores the relationship between voter trust and their experiences during elections by applying a rule-based data mining technique to the 2022 Survey of the Performance of American Elections (SPAE). Using the Apriori algorithm and setting parameters to capture meaningful associations (support >= 3%, confidence >= 60%, and lift > 1.5), the analysis revealed a strong connection between demographic attributes and voting-related challenges, such as registration hurdles, accessibility issues, and queue times. For instance, respondents who indicated that accessing polling stations was "very easy" and who reported moderate confidence were found to be over six times more likely (lift = 6.12) to trust their county's election outcome and experience no registration issues. A further analysis, with the support threshold lowered to 2%, examined patterns among minority voters specifically. It revealed that 98.16% of Black voters who reported easy access to polling locations also had smooth registration experiences. Additionally, those who had high confidence in the vote-counting process were almost twice as likely to identify as Democratic Party supporters. These findings point to the important role that enhancing voting access and offering targeted support can play in building trust in the electoral system, particularly among marginalized communities.
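The support, confidence, and lift thresholds quoted above can be made concrete with a hand-rolled single-item rule miner on toy transactions. This sketch is not the study's pipeline (which uses the full Apriori algorithm on SPAE data); the function names and toy item labels are assumptions.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def one_to_one_rules(transactions, min_sup=0.03, min_conf=0.60, min_lift=1.5):
    """Single-item association rules X -> Y passing the study's thresholds
    (support >= 3%, confidence >= 60%, lift > 1.5)."""
    items = sorted(set().union(*transactions))
    out = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            X, Y = frozenset([x]), frozenset([y])
            sup_xy = support(X | Y, transactions)
            if sup_xy < min_sup:
                continue
            conf = sup_xy / support(X, transactions)   # P(Y | X)
            lift = conf / support(Y, transactions)     # vs. Y's base rate
            if conf >= min_conf and lift > min_lift:
                out.append((x, y, sup_xy, conf, lift))
    return out
```

A lift of 6.12, as reported for the easy-access/moderate-confidence rule, means the consequent occurs 6.12 times more often given the antecedent than its base rate alone would predict.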
FASCIST-O-METER: Classifier for Neo-fascist Discourse Online
Veliz, Rudy Alexandro Garrido, Semmann, Martin, Biemann, Chris, Yimam, Seid Muhie
Neo-fascism is a political and societal ideology that has seen remarkable growth over the last decade in the United States of America (USA), as well as in other Western societies. It poses a grave danger to democracy and the minorities it targets, and requires active countermeasures to avoid escalation. This work presents the first-of-its-kind neo-fascist coding scheme for digital discourse in the USA societal context, overseen by political science researchers. Our work bridges the gap between Natural Language Processing (NLP) and political science against this phenomenon. Furthermore, to test the coding scheme, we collect a large corpus of online activity from notable neo-fascist groups (the forums of Iron March and Stormfront.org) and apply the guidelines to a subset of the collected posts. Through crowdsourcing, we annotate a total of a thousand posts as neo-fascist or non-neo-fascist. With this labeled data set, we fine-tune and test both Small Language Models (SLMs) and Large Language Models (LLMs), obtaining the very first classification models for neo-fascist discourse. We find that neo-fascist rhetoric is ever-present in this kind of forum, making them a good target for future research. Societal context is a key consideration when conducting NLP research on neo-fascist speech. Finally, work against this kind of political movement must be pressed and continued for the well-being of a democratic society. Disclaimer: This study focuses on detecting neo-fascist content in text, similar to other hate speech analyses, without labeling individuals or organizations.
- North America > United States > Indiana (0.04)
- North America > United States > California (0.04)
- Europe (0.04)