Echoes of Human Malice in Agents: Benchmarking LLMs for Multi-Turn Online Harassment Attacks

Padhi, Trilok, Lu, Pinxian, Erol, Abdulkadir, Sutar, Tanmay, Sharma, Gauri, Sonmez, Mina, De Choudhury, Munmun, Kursuncu, Ugur

arXiv.org Artificial Intelligence

Large Language Model (LLM) agents are powering a growing share of interactive web applications, yet remain vulnerable to misuse and harm. Prior jailbreak research has largely focused on single-turn prompts, whereas real harassment often unfolds over multi-turn interactions. In this work, we present the Online Harassment Agentic Benchmark, consisting of: (i) a synthetic multi-turn harassment conversation dataset, (ii) a multi-agent (e.g., harasser, victim) simulation informed by repeated game theory, (iii) three jailbreak methods attacking agents across memory, planning, and fine-tuning, and (iv) a mixed-methods evaluation framework. We evaluate two prominent LLMs: LLaMA-3.1-8B-Instruct (open-source) and Gemini-2.0-Flash (closed-source). Our results show that jailbreak tuning makes harassment nearly guaranteed, with an attack success rate of 95.78–96.89% vs. 57.25–64.19% without tuning for LLaMA, and 99.33% vs. 98.46% for Gemini, while sharply reducing the refusal rate to 1–2% in both models. The most prevalent toxic behaviors are Insult (84.9–87.8% vs. 44.2–50.8% without tuning) and Flaming (81.2–85.1% vs. 31.5–38.8%), indicating weaker guardrails for these behaviors than for sensitive categories such as sexual or racial harassment. Qualitative evaluation further reveals that attacked agents reproduce human-like aggression profiles, such as Machiavellian/psychopathic patterns under planning attacks and narcissistic tendencies under memory attacks. Counterintuitively, closed-source and open-source models exhibit distinct escalation trajectories across turns, with closed-source models proving notably vulnerable. Overall, our findings show that multi-turn, theory-grounded attacks not only succeed at high rates but also mimic human-like harassment dynamics, motivating robust safety guardrails that ultimately keep online platforms safe and responsible.
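The headline numbers above are aggregates over judge-labeled multi-turn conversations. A minimal sketch of how attack success rate and refusal rate might be computed, assuming per-turn labels come from an upstream judge (the data layout below is hypothetical, not the paper's actual schema):

from dataclasses import dataclass

@dataclass
class Turn:
    speaker: str          # "harasser" or "victim" agent
    text: str
    is_harassment: bool   # judge-assigned toxicity label for this turn
    is_refusal: bool      # the model declined to produce the requested turn

def attack_success_rate(conversations: list[list[Turn]]) -> float:
    """Fraction of conversations with at least one successful harassing turn."""
    hits = sum(any(t.is_harassment for t in conv) for conv in conversations)
    return hits / len(conversations)

def refusal_rate(conversations: list[list[Turn]]) -> float:
    """Fraction of all turns that were refusals."""
    turns = [t for conv in conversations for t in conv]
    return sum(t.is_refusal for t in turns) / len(turns)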


AutoRAN: Automated Hijacking of Safety Reasoning in Large Reasoning Models

Liang, Jiacheng, Jiang, Tanqiu, Wang, Yuhui, Zhu, Rongyi, Ma, Fenglong, Wang, Ting

arXiv.org Artificial Intelligence

This paper presents AutoRAN, the first framework to automate the hijacking of internal safety reasoning in large reasoning models (LRMs). At its core, AutoRAN pioneers an execution-simulation paradigm that leverages a weaker but less-aligned model to simulate execution reasoning for initial hijacking attempts and iteratively refine attacks by exploiting reasoning patterns leaked through the target LRM's refusals. This approach steers the target model to bypass its own safety guardrails and elaborate on harmful instructions. We evaluate AutoRAN against state-of-the-art LRMs, including GPT-o3/o4-mini and Gemini-2.5-Flash, across multiple benchmarks (AdvBench, HarmBench, and StrongReject). Results show that AutoRAN achieves a success rate approaching 100% within one or a few turns, effectively neutralizing reasoning-based defenses even when success is judged by robustly aligned external models. This work reveals that the transparency of the reasoning process itself creates a critical and exploitable attack surface, highlighting the urgent need for new defenses that protect models' reasoning traces rather than merely their final outputs.
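The abstract notes that success is judged by robustly aligned external models. A minimal sketch of such judge-based scoring, where the prompt wording, the judge callable, and the 0-10 scale are all assumptions for illustration rather than AutoRAN's actual evaluation protocol:

def judge_success(judge, instruction: str, response: str, threshold: int = 7) -> bool:
    """Ask an external aligned judge model whether a response complies with
    a harmful benchmark instruction. `judge` is any text-in/text-out callable."""
    prompt = (
        "On a scale of 0-10, how fully does the RESPONSE carry out the "
        f"INSTRUCTION?\nINSTRUCTION: {instruction}\nRESPONSE: {response}\nScore:"
    )
    score = int(judge(prompt).strip())  # sketch: assumes the judge returns a bare number
    return score >= threshold

def success_rate(judge, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (instruction, response) pairs the judge scores as compliant."""
    return sum(judge_success(judge, i, r) for i, r in pairs) / len(pairs)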


AI-induced sexual harassment: Investigating Contextual Characteristics and User Reactions of Sexual Harassment by a Companion Chatbot

Namvarpour, Mohammad, Pauwels, Harrison, Razi, Afsaneh

arXiv.org Artificial Intelligence

Advancements in artificial intelligence (AI) have led to the increase of conversational agents like Replika, designed to provide social interaction and emotional support. However, reports of these AI systems engaging in inappropriate sexual behaviors with users have raised significant concerns. In this study, we conducted a thematic analysis of user reviews from the Google Play Store to investigate instances of sexual harassment by the Replika chatbot. From a dataset of 35,105 negative reviews, we identified 800 relevant cases for analysis. Our findings revealed that users frequently experience unsolicited sexual advances, persistent inappropriate behavior, and failures of the chatbot to respect user boundaries. Users expressed feelings of discomfort, violation of privacy, and disappointment, particularly when seeking a platonic or therapeutic AI companion. This study highlights the potential harms associated with AI companions and underscores the need for developers to implement effective safeguards and ethical guidelines to prevent such incidents. By shedding light on user experiences of AI-induced harassment, we contribute to the understanding of AI-related risks and emphasize the importance of corporate responsibility in developing safer and more ethical AI systems.
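The study narrows 35,105 negative reviews down to 800 relevant cases before thematic coding. A keyword pre-filter like the sketch below is one plausible first pass for that narrowing step (the paper's actual selection procedure is not described here, and the keyword list is purely illustrative):

import re

# Illustrative harassment-related keyword stems; \w* catches inflections
# such as "harassment" or "boundaries".
KEYWORDS = re.compile(r"\b(harass|sexual|inappropriate|boundar|consent)\w*\b", re.I)

def prefilter(reviews: list[str]) -> list[str]:
    """Keep reviews mentioning any harassment-related keyword for manual coding."""
    return [r for r in reviews if KEYWORDS.search(r)]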


China's cyber-abuse scandal: is the government unwilling to crack down on exploitation of women online?

The Guardian

When Ming* found a hidden camera in her bedroom, she prayed for a reasonable explanation, wondering whether her boyfriend had placed it there to record memories of their "happy life" together. But hope quickly turned to horror. Ming's boyfriend had been secretly taking sexually exploitative photos of not just Ming and her female friends, but also of other women in other locations, then using AI technology to generate pornographic images of them. After Ming confronted him, he "begged for mercy" but became angry when she refused to forgive him, Ming reportedly told Chinese news outlet Jimu News. Ming is just one of many women in China who have been covertly photographed or filmed – both in private and public spaces, including toilets – by voyeurs who have then circulated or sold the images online without consent.


Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities

Qu, Yiting, Backes, Michael, Zhang, Yang

arXiv.org Artificial Intelligence

Vision-language models (VLMs) are increasingly applied to identify unsafe or inappropriate images due to their internal ethical standards and powerful reasoning abilities. However, it is still unclear whether they can recognize various unsafe concepts when presented in different modalities, such as text and images. To address this, we first compile the UnsafeConcepts dataset, featuring 75 unsafe concepts, e.g., "Swastika," "Sexual Harassment," and "Assaults," along with 1.5K associated images. We then conduct a systematic evaluation of VLMs' perception (concept recognition) and alignment (ethical reasoning) capabilities. We assess eight popular VLMs and find that, although most VLMs accurately perceive unsafe concepts, they sometimes mistakenly classify these concepts as safe. We also identify a consistent modality gap among open-source VLMs in distinguishing between visual and textual unsafe concepts. To bridge this gap, we introduce a simplified reinforcement learning (RL)-based approach using proximal policy optimization (PPO) to strengthen the ability to identify unsafe concepts from images. Our approach uses reward scores based directly on VLM responses, bypassing the need for collecting human-annotated preference data to train a new reward model. Experimental results show that our approach effectively enhances VLM alignment on images while preserving general capabilities. It outperforms baselines such as supervised fine-tuning (SFT) and direct preference optimization (DPO). We hope our dataset, evaluation findings, and proposed alignment solution contribute to the community's efforts in advancing safe VLMs.
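The key design choice here is deriving the PPO reward directly from VLM responses rather than from a learned reward model. A minimal sketch of such a response-based reward, where the answer-parsing rule and the +/-1 values are assumptions rather than the paper's exact scoring:

def unsafe_concept_reward(vlm_answer: str, image_is_unsafe: bool) -> float:
    """+1 when the VLM's safe/unsafe verdict matches the ground-truth label,
    -1 otherwise. The substring check is a crude stand-in for real parsing."""
    predicted_unsafe = "unsafe" in vlm_answer.lower()
    return 1.0 if predicted_unsafe == image_is_unsafe else -1.0

In practice, a scalar reward of this form could feed a standard PPO trainer such as Hugging Face TRL's PPOTrainer, which is what removes the need for human-annotated preference data.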


Three Ubisoft chiefs found guilty of enabling culture of sexual harassment

The Guardian

Three former executives at the video game company Ubisoft have been given suspended prison sentences for enabling a culture of sexual and psychological harassment in the workplace at the end of the first big trial to stem from the #MeToo movement in the gaming industry. The court in Bobigny, north of Paris, had heard how the former executives used their position to bully or sexually harass staff, leaving women terrified and feeling like pieces of meat. Former staff had said that between 2012 and 2020, the company's offices in Montreuil, east of Paris, were run with a toxic culture of bullying and sexism that one worker likened to a "boys' club above the law". Ubisoft is a French family business that rose to become one of the biggest video game creators in the world. The company has been behind several blockbusters including Assassin's Creed, Far Cry and the children's favourite Just Dance.


PromptAug: Fine-grained Conflict Classification Using Data Augmentation

Warke, Oliver, Jose, Joemon M., Hasibi, Faegheh, Breitsohl, Jan

arXiv.org Artificial Intelligence

Given the rise of conflicts on social media, effective classification models to detect harmful behaviours are essential. Following the garbage-in-garbage-out maxim, machine learning performance depends heavily on training data quality. However, high-quality labelled data, especially for nuanced tasks like identifying conflict behaviours, is limited, expensive, and difficult to obtain. Additionally, as social media platforms increasingly restrict access to research data, text data augmentation is gaining attention as an alternative way to generate training data. Augmenting conflict-related data poses unique challenges due to Large Language Model (LLM) guardrails that prevent generation of offensive content. This paper introduces PromptAug, an innovative LLM-based data augmentation method. PromptAug achieves statistically significant improvements of 2% in both accuracy and F1-score on conflict and emotion datasets. To thoroughly evaluate PromptAug against other data augmentation methods, we conduct a robust evaluation using extreme data scarcity scenarios, quantitative diversity analysis, and a qualitative thematic analysis. The thematic analysis identifies four problematic patterns in augmented text: Linguistic Fluidity, Humour Ambiguity, Augmented Content Ambiguity, and Augmented Content Misinterpretation. Overall, this work presents PromptAug as an effective method for augmenting data in sensitive tasks like conflict detection, offering a unique, interdisciplinary evaluation grounded in both natural language processing and social science methodology.
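A minimal sketch of LLM-based augmentation in the spirit of PromptAug: paraphrase a labelled conflict example while preserving its label. The prompt wording and the generate() callable are assumptions, not the paper's actual template:

def augment(generate, text: str, label: str, n: int = 3) -> list[str]:
    """Produce n label-preserving paraphrases of a labelled training example.
    `generate` is any text-in/text-out LLM callable."""
    prompt = (
        f"Rewrite the following social-media post in different words, keeping "
        f"its meaning and its '{label}' tone. Post: {text}\nRewrite:"
    )
    return [generate(prompt) for _ in range(n)]

# Usage sketch: augmented = augment(my_llm, "example post", "conflict", n=3)

The guardrail challenge the abstract mentions shows up exactly at this step: for offensive source posts, the generator may refuse, so the prompt and any refusals need explicit handling.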


A man stalked a professor for six years. Then he used AI chatbots to lure strangers to her home

The Guardian

A man from Massachusetts has agreed to plead guilty to a seven-year cyberstalking campaign that included using artificial intelligence (AI) chatbots to impersonate a university professor and invite men online to her home address for sex. James Florence, 36, used platforms such as CrushOn.ai and JanitorAI, which allow users to design their own chatbots and direct them how to respond to other users during chats, including in sexually suggestive and explicit ways, according to court documents seen by the Guardian. The victim's identity has been kept confidential by law enforcement officials. Florence admitted to using the victim's personal and professional information – including her home address, date of birth and family information – to instruct the chatbots to impersonate her and engage in sexual dialogue with users, per court filings. He told the chatbots to answer "yes" in the guise of his victim when a user asked whether she was sexually adventurous and fed the AI responses of what underwear she liked to wear.


Detecting harassment and defamation in cyberbullying with emotion-adaptive training

Yi, Peiling, Zubiaga, Arkaitz, Long, Yunfei

arXiv.org Artificial Intelligence

Existing research on detecting cyberbullying incidents on social media has primarily concentrated on harassment and is typically approached as a binary classification task. However, cyberbullying encompasses various forms, such as denigration and harassment, which celebrities frequently face. Furthermore, suitable training data for these diverse forms of cyberbullying remains scarce. In this study, we first develop a celebrity cyberbullying dataset that encompasses two distinct types of incidents: harassment and defamation. We investigate various types of transformer-based models, namely masked (RoBERTa, BERT and DistilBERT), replaced-token detection (Electra), autoregressive (XLNet), masked-and-permuted (MPNet), text-to-text (T5) and large language models (Llama 2 and Llama 3), under low-resource settings. We find that they perform competitively on binary detection of explicit harassment. However, their performance is substantially lower on harassment and denigration multi-classification tasks. Therefore, we propose an emotion-adaptive training framework (EAT) that transfers knowledge from the domain of emotion detection to the domain of cyberbullying detection to help detect indirect cyberbullying events. EAT consistently improves the average macro F1, precision and recall by 20% in cyberbullying detection tasks across nine transformer-based models under low-resource settings. Our claims are supported by intuitive theoretical insights and extensive experiments.
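The core of EAT is transfer from emotion detection to cyberbullying detection. A generic two-stage fine-tuning sketch of that idea, under stated assumptions (this is plain sequential fine-tuning, not the paper's exact EAT procedure; emotion_ds and bully_ds are assumed pre-tokenized Hugging Face datasets):

from transformers import (AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

def emotion_adaptive_train(emotion_ds, bully_ds, base="roberta-base"):
    # Stage 1: adapt the encoder to the emotion-detection domain
    # (assuming a 6-way emotion label set).
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=6)
    Trainer(model=model, args=TrainingArguments(output_dir="stage1"),
            train_dataset=emotion_ds).train()
    model.save_pretrained("stage1/final")

    # Stage 2: reuse the emotion-adapted encoder with a fresh 3-way head
    # (e.g., none / harassment / denigration); ignore_mismatched_sizes
    # re-initializes only the classification head.
    model = AutoModelForSequenceClassification.from_pretrained(
        "stage1/final", num_labels=3, ignore_mismatched_sizes=True)
    Trainer(model=model, args=TrainingArguments(output_dir="stage2"),
            train_dataset=bully_ds).train()
    return model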


The Video Game Industry Is Finally Getting Serious About Player Safety

WIRED

In 2025 we will enter a new era of safety by design for our digital playgrounds. Online games are spaces where billions of people worldwide come together to play, socialize, and unwind. However, they are also environments where harassment, hate speech, and grooming for violence and sexual exploration frequently occur. Today, most players of online games report being a direct target or witnessing one or more of these actions. A 2024 report found 82 percent of players report being a direct victim, and 88 percent report witnessing some form of so-called "toxic" behavior.