 slur


The death of the swear word: Gen Z are more offended by slurs than expletives - with p***k, d**k, and c**k now ranked among the LEAST offensive terms of all

Daily Mail - Science & tech

Swear words that were once potent are losing their sting, a new study has revealed.


SLAyiNG: Towards Queer Language Processing

arXiv.org Artificial Intelligence

Knowledge of slang is a desirable feature of LLMs in the context of user interaction, as slang often reflects an individual's social identity. Several works on informal language processing have defined and curated benchmarks for tasks such as detection and identification of slang. In this paper, we focus on queer slang. Queer slang can be mistakenly flagged as hate speech or can evoke negative responses from LLMs during user interaction. Research efforts so far have not focused explicitly on queer slang. In particular, detection and processing of queer slang have not been thoroughly evaluated due to the lack of a high-quality annotated benchmark. To address this gap, we curate SLAyiNG, the first dataset containing annotated queer slang derived from subtitles, social media posts, and podcasts, reflecting real-world usage. We describe our data curation process, including the collection of slang terms and definitions, scraping sources for examples that reflect usage of these terms, and our ongoing annotation process. As preliminary results, we calculate inter-annotator agreement for human annotators and OpenAI's model o3-mini, evaluating performance on the task of sense disambiguation. Reaching an average Krippendorff's alpha of 0.746, we argue that state-of-the-art reasoning models can serve as tools for pre-filtering, but the complex and often sensitive nature of queer language data requires expert and community-driven annotation efforts.
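The agreement statistic mentioned in the abstract can be illustrated with a minimal sketch of Krippendorff's alpha for nominal labels. This is not the authors' code: the function name `krippendorff_alpha_nominal` and the toy labels are assumptions made here for illustration, and the abstract does not say which distance metric the paper used.

```python
import math
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of lists: one inner list per annotated item, holding the
    labels that item received (missing ratings are simply left out).
    """
    # Coincidence matrix: every ordered pair of labels given to the same item
    # by different annotators counts 1 / (m - 1), with m the labels on the item.
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # an item with a single label cannot be paired
        for a, b in permutations(ratings, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        return math.nan  # only one label was ever used, so alpha is undefined
    return 1.0 - observed / expected


# Toy example: four items, two annotators, hypothetical sense labels.
toy = [["slang", "literal"], ["slang", "slang"],
       ["literal", "literal"], ["slang", "slang"]]
alpha = krippendorff_alpha_nominal(toy)
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which is the scale on which the reported 0.746 should be read.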


Clanker! This slur against robots is all over the internet – but is it offensive?

The Guardian

It sounds a bit insulting. It is, in fact, a slur. While it's sometimes used to denigrate actual robots – including delivery bots and self-driving cars – it's increasingly used to insult AI chatbots and platforms such as ChatGPT. I'm new to this – why would I want to insult AI? Does the AI care that you're insulting it? That's a complex and hotly debated philosophical question, to which the answer is "no".


Why the world is looking to ditch US AI models

MIT Technology Review

As a result, some policymakers and business leaders--in Europe, in particular--are reconsidering their reliance on US-based tech and asking whether they can quickly spin up better, homegrown alternatives. This is particularly true for AI. One of the clearest examples of this is in social media. Yasmin Curzi, a Brazilian law professor who researches domestic tech policy, put it to me this way: "Since Trump's second administration, we cannot count on [American social media platforms] to do even the bare minimum anymore." Social media content moderation systems--which already use automation and are also experimenting with deploying large language models to flag problematic posts--are failing to detect gender-based violence in places as varied as India, South Africa, and Brazil.


Sociocultural Considerations in Monitoring Anti-LGBTQ+ Content on Social Media

arXiv.org Artificial Intelligence

The purpose of this paper is to ascertain the influence of sociocultural factors (i.e., social, cultural, and political) in the development of hate speech detection systems. We set out to investigate the suitability of using open-source training data to monitor levels of anti-LGBTQ+ content on social media across different national varieties of English. Our findings suggest that the social and cultural alignment of open-source hate speech data sets influences the predicted outputs. Furthermore, the keyword-search approach of anti-LGBTQ+ slurs in the development of open-source training data encourages detection models to overfit on slurs; therefore, anti-LGBTQ+ content may go undetected. We recommend combining empirical outputs with qualitative insights to ensure these systems are fit for purpose.


Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias

arXiv.org Artificial Intelligence

Content moderation on social media platforms shapes the dynamics of online discourse, influencing whose voices are amplified and whose are suppressed. Recent studies have raised concerns about the fairness of content moderation practices, particularly for aggressively flagging posts from transgender and non-binary individuals as toxic. In this study, we investigate the presence of bias in harmful speech classification of gender-queer dialect online, focusing specifically on the treatment of reclaimed slurs. We introduce a novel dataset, QueerReclaimLex, based on 109 curated templates exemplifying non-derogatory uses of LGBTQ+ slurs. Dataset instances are scored by gender-queer annotators for potential harm depending on additional context about speaker identity. We systematically evaluate the performance of five off-the-shelf language models in assessing the harm of these texts and explore the effectiveness of chain-of-thought prompting to teach large language models (LLMs) to leverage author identity context. We reveal a tendency for these models to inaccurately flag texts authored by gender-queer individuals as harmful. Strikingly, across all LLMs the performance is poorest for texts that show signs of being written by individuals targeted by the featured slur (F1 <= 0.24). We highlight an urgent need for fairness and inclusivity in content moderation systems. By uncovering these biases, this work aims to inform the development of more equitable content moderation practices and contribute to the creation of inclusive online spaces for all users.


GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?

arXiv.org Artificial Intelligence

Online hate detection suffers from biases incurred in data sampling, annotation, and model pre-training. Therefore, measuring the averaged performance over all examples in held-out test data is inadequate. Instead, we must identify specific model weaknesses and be informed when a model is more likely to fail. A recent proposal in this direction is HateCheck, a suite for testing fine-grained model functionalities on synthesized data generated using templates of the kind "You are just a [slur] to me." However, despite enabling more detailed diagnostic insights, the HateCheck test cases are often generic and have simplistic sentence structures that do not match the real-world data. To address this limitation, we propose GPT-HateCheck, a framework to generate more diverse and realistic functional tests from scratch by instructing large language models (LLMs). We employ an additional natural language inference (NLI) model to verify the generations. Crowd-sourced annotation demonstrates that the generated test cases are of high quality. Using the new functional tests, we can uncover model weaknesses that would be overlooked using the original HateCheck dataset.
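The template-based functional testing described in the abstract can be sketched roughly as follows. Only the first template comes from the abstract; the second template, the placeholder fillers, the functionality names, and the toy keyword classifier are hypothetical stand-ins and are not part of HateCheck or GPT-HateCheck.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionalTest:
    functionality: str  # name of the behaviour being probed (hypothetical)
    text: str           # the generated test sentence
    expected: str       # gold label: "hateful" or "non-hateful"


def expand_template(functionality, template, fillers, expected, slot="[slur]"):
    """Create one test case per filler by substituting it into the slot."""
    if slot not in template:
        raise ValueError(f"template has no {slot} slot: {template!r}")
    return [
        FunctionalTest(functionality, template.replace(slot, filler), expected)
        for filler in fillers
    ]


def failure_rates(tests, classify):
    """Return the fraction of misclassified cases for each functionality."""
    totals, failures = defaultdict(int), defaultdict(int)
    for test in tests:
        totals[test.functionality] += 1
        if classify(test.text) != test.expected:
            failures[test.functionality] += 1
    return {name: failures[name] / totals[name] for name in totals}


# Neutral placeholder tokens stand in for real slur lists (assumption).
PLACEHOLDERS = ["<TERM_A>", "<TERM_B>"]

tests = expand_template(
    "derogatory_use", "You are just a [slur] to me.", PLACEHOLDERS, "hateful"
) + expand_template(
    "counter_speech",  # hypothetical second functionality
    "Calling someone a [slur] is never acceptable.",
    PLACEHOLDERS,
    "non-hateful",
)


def keyword_classifier(text):
    """Toy model that flags any text containing a placeholder term."""
    return "hateful" if "<TERM_" in text else "non-hateful"


rates = failure_rates(tests, keyword_classifier)
```

Reporting failures per functionality rather than one averaged score is what exposes the weakness here: the keyword model handles the derogatory template but fails every counter-speech case.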


The Morning After: The biggest news from Google's I/O keynote

Engadget

Google boss, Sundar Pichai, wrapped up the company's I/O developer conference by noting its almost-two-hour presentation had mentioned AI 121 times. Google's newest AI model, Gemini 1.5 Flash, is built for speed and efficiency. The company said it created Flash because developers wanted a lighter, less expensive model than Gemini Pro to build AI-powered apps and services. Google says it'll double Gemini's context window to two million tokens, enough to process two hours of video, 22 hours of audio, more than 60,000 lines of code or 1.4 million-plus words at the same time. But the bigger news is how the company is sewing AI into all the things you're already using.


Why Scrabble's New Official Word List Is So Embarrassing

Slate

Since Scrabble adopted an official lexicon in 1978, one thing has been constant: People have never stopped arguing about what is or isn't a word. Players have defended the game by noting that its letter strings--from AA (a kind of Hawaiian lava) to ZZZ (an interjection for sleep)--could be found in a bunch of standard North American dictionaries, books that have been used through the years to compile and revise Scrabble's tournament word list. But after an update last month introduced dozens of suspect words, riling up the community of competitive players, that's becoming harder to do. The linguistic tumult began in September, when the organization that maintains the word list used in club and tournament Scrabble, NASPA Games, published a draft of its update. The NASPA list includes all of the words in the Official Scrabble Players Dictionary, the go-to source for living-room and app players in North America, plus a lot more.


LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B

arXiv.org Artificial Intelligence

AI developers often apply safety alignment procedures to prevent the misuse of their AI systems. For example, before Meta released Llama 2-Chat, a collection of instruction fine-tuned large language models, they invested heavily in safety training, incorporating extensive red-teaming and reinforcement learning from human feedback. However, it remains unclear how well safety training guards against model misuse when attackers have access to model weights. We explore the robustness of safety training in language models by subversively fine-tuning the public weights of Llama 2-Chat. We employ low-rank adaptation (LoRA) as an efficient fine-tuning method. With a budget of less than $200 per model and using only one GPU, we successfully undo the safety training of Llama 2-Chat models of sizes 7B, 13B, and 70B. Specifically, our fine-tuning technique significantly reduces the rate at which the model refuses to follow harmful instructions. We achieve a refusal rate below 1% for our 70B Llama 2-Chat model on two refusal benchmarks. Our fine-tuning method retains general performance, which we validate by comparing our fine-tuned models against Llama 2-Chat across two benchmarks. Additionally, we present a selection of harmful outputs produced by our models. While there is considerable uncertainty about the scope of risks from current models, it is likely that future models will have significantly more dangerous capabilities, including the ability to hack into critical infrastructure, create dangerous bio-weapons, or autonomously replicate and adapt to new environments. We show that subversive fine-tuning is practical and effective, and hence argue that evaluating risks from fine-tuning should be a core part of risk assessments for releasing model weights.
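Why low-rank adaptation is cheap enough for a single GPU can be seen from a minimal sketch of the general LoRA idea: the pretrained weight matrix stays frozen and only two small factor matrices are trained. The layer size, rank, and function names below are illustrative assumptions. The abstract does not state the configuration the authors used, and this is not their training code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute y = W x + scale * B (A x).

    W (d_out x d_in) is the frozen pretrained weight. A (r x d_in) and
    B (d_out x r) are the only trainable matrices, with r much smaller
    than d_in and d_out.
    """
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def trainable_params(d_out, d_in, rank):
    """Compare trainable parameter counts for full fine-tuning vs. LoRA."""
    return {"full": d_out * d_in, "lora": rank * (d_in + d_out)}


# With B initialised to zero, the adapted layer starts out identical to the
# frozen layer; training then moves only A and B.
W = [[1, 0], [0, 1]]
A = [[1, 1]]        # rank r = 1
B_zero = [[0], [0]]
y_start = lora_forward(W, A, B_zero, [3, 4])

# Illustrative sizes only: a 4096 x 4096 projection with rank 8.
counts = trainable_params(4096, 4096, 8)
```

Under these assumed sizes the adapter trains 65,536 parameters for the layer instead of 16,777,216, which gives a sense of how the cost of fine-tuning can fall so far.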