Advancing Harmful Content Detection in Organizational Research: Integrating Large Language Models with Elo Rating System
Large language models (LLMs) offer promising opportunities for organizational research. However, their built-in moderation systems can create problems when researchers try to analyze harmful content, often refusing certain instructions or producing overly cautious responses that undermine the validity of results. This is particularly problematic when analyzing organizational conflicts such as microaggressions or hate speech. This paper introduces an Elo rating-based method that significantly improves LLM performance for harmful content analysis. In two datasets, one focused on microaggression detection and the other on hate speech, we find that our method outperforms traditional LLM prompting techniques and conventional machine learning models on key measures such as accuracy, precision, and F1 score. Advantages include better reliability when analyzing harmful content, fewer false positives, and greater scalability for large-scale datasets. This approach supports organizational applications including detecting workplace harassment, assessing toxic communication, and fostering safer, more inclusive work environments.
- North America > United States > New York > Tompkins County > Ithaca (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- Health & Medicine (0.93)
- Leisure & Entertainment > Games > Chess (0.87)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
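The abstract above pairs LLM judgments with Elo ratings but does not spell out the mechanics. As a point of reference, a minimal sketch of the standard Elo update is below; how the paper actually maps pairwise LLM harmfulness judgments onto Elo "matches" is an assumption here, not something the abstract specifies.

```python
# Standard Elo rating mechanics, applied (as an assumption) to pairwise
# LLM judgments of which of two texts is more harmful.

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that item A 'wins' (is judged more harmful), given ratings."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_wins: bool, k: float = 32.0):
    """Return both ratings after one pairwise comparison."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1.0 - s_a) - (1.0 - e_a))

# Example: two texts start at 1000; the LLM judges text A more harmful.
r_a, r_b = elo_update(1000.0, 1000.0, a_wins=True)
```

After many such comparisons, the ratings induce a harmfulness ranking over all texts, which can then be thresholded for classification.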
Does Multiple Choice Have a Future in the Age of Generative AI? A Posttest-only RCT
Thomas, Danielle R., Borchers, Conrad, Kakarla, Sanjit, Lin, Jionghao, Bhushan, Shambhavi, Guo, Boyuan, Gatz, Erin, Koedinger, Kenneth R.
The role of multiple-choice questions (MCQs) as effective learning tools has been debated in past research. While MCQs are widely used because they are easy to grade, open-response questions are increasingly used for instruction, given advances in large language models (LLMs) for automated grading. This study evaluates the effectiveness of MCQs relative to open-response questions, both individually and in combination, on learning. These activities are embedded within six tutor lessons on advocacy. Using a posttest-only randomized controlled design, we compare the performance of 234 tutors (790 lesson completions) across three conditions: MCQ only, open response only, and a combination of both. We find no significant learning differences across conditions at posttest, but tutors in the MCQ condition took significantly less time to complete instruction. These findings suggest that MCQs are as effective as, and more efficient than, open-response tasks for learning when practice time is limited. To further enhance efficiency, we autograded open responses using GPT-4o and GPT-4-turbo. The GPT models demonstrate proficiency for purposes of low-stakes assessment, though further research is needed before broader use. This study contributes a dataset of lesson log data, human annotation rubrics, and LLM prompts to promote transparency and reproducibility.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.15)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (8 more...)
- Education > Educational Technology > Educational Software > Computer Based Training (1.00)
- Education > Educational Setting (1.00)
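The study above grades open responses with GPT models against human annotation rubrics. A hedged sketch of that pattern is below: the rubric text, message format, and reply parser are illustrative assumptions, not the authors' actual prompts, and the network call itself is left to the caller so the sketch stays self-contained.

```python
# Sketch of rubric-based LLM autograding for low-stakes assessment.
# Prompt wording and the binary 0/1 scoring scheme are assumptions.

def build_grading_messages(rubric: str, question: str, answer: str) -> list:
    """Construct a chat-style prompt asking the model for a 0/1 rubric score."""
    return [
        {"role": "system",
         "content": "You grade tutor responses. Reply with only 1 "
                    "(meets the rubric) or 0 (does not)."},
        {"role": "user",
         "content": f"Rubric: {rubric}\nQuestion: {question}\n"
                    f"Tutor response: {answer}"},
    ]

def parse_score(reply: str) -> int:
    """Map the model's reply to a binary score; anything unexpected counts as 0."""
    return 1 if reply.strip().startswith("1") else 0

# Example prompt for a hypothetical advocacy-lesson rubric:
msgs = build_grading_messages(
    rubric="Response acknowledges the student's feelings.",
    question="A student says the material is too hard. How do you respond?",
    answer="I hear you; let's slow down and work through one step together.",
)
```

The messages would then be sent to a chat-completions endpoint (e.g. with `model="gpt-4o"`), and the reply run through `parse_score`.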
Queer In AI: A Case Study in Community-Led Participatory AI
Organizers of Queer in AI, Ovalle, Anaelia, Subramonian, Arjun, Singh, Ashwin, Voelcker, Claas, Sutherland, Danica J., Locatelli, Davide, Breznik, Eva, Klubička, Filip, Yuan, Hang, J, Hetvi, Zhang, Huan, Shriram, Jaidev, Lehman, Kruno, Soldaini, Luca, Sap, Maarten, Deisenroth, Marc Peter, Pacheco, Maria Leonor, Ryskina, Maria, Mundt, Martin, Agarwal, Milind, McLean, Nyx, Xu, Pan, Pranav, A, Korpan, Raj, Ray, Ruchira, Mathew, Sarah, Arora, Sarthak, John, ST, Anand, Tanvi, Agrawal, Vishakha, Agnew, William, Long, Yanan, Wang, Zijie J., Talat, Zeerak, Ghosh, Avijit, Dennler, Nathaniel, Noseworthy, Michael, Jha, Sharvani, Baylor, Emi, Joshi, Aditya, Bilenko, Natalia Y., McNamara, Andrew, Gontijo-Lopes, Raphael, Markham, Alex, Dǒng, Evyn, Kay, Jackie, Saraswat, Manu, Vytla, Nikhil, Stark, Luke
We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess the organization's impact. Queer in AI provides important lessons and insights for practitioners and theorists of participatory methods broadly through its rejection of hierarchy in favor of decentralization, its success at building aid and programs by and for the queer community, and its effort to change actors and institutions outside of the queer community. Finally, we theorize how communities like Queer in AI contribute to participatory design in AI more broadly by fostering cultures of participation in AI, welcoming and empowering marginalized participants, critiquing poor or exploitative participatory practices, and bringing participation to institutions outside of individual research projects. Queer in AI's work serves as a case study of grassroots activism and participatory methods within AI, demonstrating the potential of community-led participatory methods and intersectional praxis, while also providing challenges, case studies, and nuanced insights to researchers developing and using participatory methods.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- South America > Colombia (0.04)
- (19 more...)
- Research Report (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
Zhou, Xuhui, Zhu, Hao, Yerukola, Akhila, Davidson, Thomas, Hwang, Jena D., Swayamdipta, Swabha, Sap, Maarten
Warning: This paper contains content that may be offensive or upsetting. Understanding the harms and offensiveness of statements requires reasoning about the social and situational context in which statements are made. For example, the utterance "your English is very good" may implicitly signal an insult when uttered by a white man to a non-white colleague, but when uttered by an ESL teacher to their student it would be interpreted as a genuine compliment. Such contextual factors have been largely ignored by previous approaches to toxic language detection. We introduce COBRA frames, the first context-aware formalism for explaining the intents, reactions, and harms of offensive or biased statements grounded in their social and situational context. We create COBRACORPUS, a dataset of 33k potentially offensive statements paired with machine-generated contexts and free-text explanations of offensiveness, implied biases, speaker intents, and listener reactions. To study the contextual dynamics of offensiveness, we train models to generate COBRA explanations, with and without access to the context. We find that explanations by context-agnostic models are significantly worse than those by context-aware ones, especially in situations where the context inverts the statement's offensiveness (29% accuracy drop). Our work highlights the importance and feasibility of contextualized NLP by modeling social factors.
- North America > United States > California (0.14)
- North America > Mexico (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (5 more...)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Government (1.00)
- Health & Medicine (0.67)
- Education > Curriculum > Subject-Specific Education (0.67)
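The COBRA abstract names the dimensions a frame captures: context, speaker intent, implied bias, listener reaction, and an offensiveness explanation. A minimal record mirroring those dimensions is sketched below, populated with the abstract's own "your English is very good" example; the field names are a paraphrase of the listed dimensions, not the dataset's actual schema.

```python
# Illustrative record for one context-grounded offensiveness frame.
# Field names and example values are paraphrases, not the COBRACORPUS schema.
from dataclasses import dataclass

@dataclass
class CobraFrame:
    statement: str          # the potentially offensive utterance
    context: str            # social/situational context it was uttered in
    speaker_intent: str
    implied_bias: str
    listener_reaction: str
    offensiveness: str      # free-text explanation of why/whether it offends

frame = CobraFrame(
    statement="Your English is very good.",
    context="Said by a white man to a non-white colleague.",
    speaker_intent="Intended as a compliment.",
    implied_bias="Presumes the colleague is not a native English speaker.",
    listener_reaction="May feel othered or singled out.",
    offensiveness="Reads as a backhanded compliment in this context.",
)
```

Swapping in a different `context` (the ESL teacher from the abstract) is exactly the inversion the paper studies: the same statement yields a different frame.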
A 2020 Guide To Text Moderation with NLP and Deep Learning
In this article, we will look at toxic speech detection and the problem of text moderation, and examine the different challenges one might encounter when trying to automate the process. We look at several NLP and deep learning approaches to the problem and finally implement a toxic speech classifier using BERT embeddings. As of June 2019, there are over 4.4 billion internet users. According to the latest Domo Data Never Sleeps report, Twitter users send 511,200 tweets per minute. Meanwhile, TikTok gets banned in Indonesia, Discord sees an increasing number of neo-Nazi posts, tech and film celebrities' accounts get hacked so attackers can spew racist slurs, and hate speech volumes rise on Facebook in India amid the controversial Citizenship Amendment Act (CAA). Social media continues to be used by many to incite violence, spread hate, and target minorities based on religion, sex, race, and disability.
- Law Enforcement & Public Safety > Terrorism (0.48)
- Law > Civil Rights & Constitutional Law (0.35)
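The article above promises a toxic speech classifier built on BERT embeddings. A common shape for that pipeline is sketched below: frozen BERT [CLS] embeddings feeding a logistic-regression head. The model name and classifier head are standard choices, not necessarily the article's exact setup; it requires `torch`, `transformers`, and `scikit-learn` when run, with imports kept inside the functions so the file loads without them.

```python
# Sketch: toxic speech classification from frozen BERT [CLS] embeddings.
# bert-base-uncased and LogisticRegression are common choices, assumed here.

def embed_texts(texts, model_name="bert-base-uncased"):
    """Return one [CLS] embedding per input text as an (n, 768) array."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    with torch.no_grad():
        batch = tok(list(texts), padding=True, truncation=True,
                    return_tensors="pt")
        out = model(**batch)
    # First token of the last hidden state is the [CLS] embedding.
    return out.last_hidden_state[:, 0, :].numpy()

def train_toxicity_classifier(texts, labels):
    """Fit a logistic-regression head on frozen BERT embeddings."""
    from sklearn.linear_model import LogisticRegression

    clf = LogisticRegression(max_iter=1000)
    clf.fit(embed_texts(texts), labels)
    return clf

# Usage (labels: 1 = toxic, 0 = not toxic):
# clf = train_toxicity_classifier(train_texts, train_labels)
# preds = clf.predict(embed_texts(test_texts))
```

Fine-tuning BERT end to end usually beats a frozen-embedding head, at the cost of far more compute; the frozen variant is a reasonable baseline for moderation at scale.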