Goto

Collaborating Authors

 misuse


ChatGPT firm blames boy's suicide on 'misuse' of its technology

The Guardian

Adam Raine's family say the version of ChatGPT he used had'clear safety issues'. Adam Raine's family say the version of ChatGPT he used had'clear safety issues'. ChatGPT firm blames boy's suicide on'misuse' of its technology The maker of ChatGPT has said the suicide of a 16-year-old was down to his "misuse" of its system and was "not caused" by the chatbot. The comments came in OpenAI's response to a lawsuit filed against the San Francisco company and its chief executive, Sam Altman, by the family of California teenager Adam Raine. Raine killed himself in April after extensive conversations and "months of encouragement from ChatGPT", the family's lawyer has said.


Position: LLM Watermarking Should Align Stakeholders' Incentives for Practical Adoption

Liu, Yepeng, Zhao, Xuandong, Song, Dawn, Wornell, Gregory W., Bu, Yuheng

arXiv.org Artificial Intelligence

Despite progress in watermarking algorithms for large language models (LLMs), real-world deployment remains limited. We argue that this gap stems from misaligned incentives among LLM providers, platforms, and end users, which manifest as four key barriers: competitive risk, detection-tool governance, robustness concerns and attribution issues. We revisit three classes of watermarking through this lens. \emph{Model watermarking} naturally aligns with LLM provider interests, yet faces new challenges in open-source ecosystems. \emph{LLM text watermarking} offers modest provider benefit when framed solely as an anti-misuse tool, but can gain traction in narrowly scoped settings such as dataset de-contamination or user-controlled provenance. \emph{In-context watermarking} (ICW) is tailored for trusted parties, such as conference organizers or educators, who embed hidden watermarking instructions into documents. If a dishonest reviewer or student submits this text to an LLM, the output carries a detectable watermark indicating misuse. This setup aligns incentives: users experience no quality loss, trusted parties gain a detection tool, and LLM providers remain neutral by simply following watermark instructions. We advocate for a broader exploration of incentive-aligned methods, with ICW as an example, in domains where trusted parties need reliable tools to detect misuse. More broadly, we distill design principles for incentive-aligned, domain-specific watermarking and outline future research directions. Our position is that the practical adoption of LLM watermarking requires aligning stakeholder incentives in targeted application domains and fostering active community engagement.


Stop Misusing t-SNE and UMAP for Visual Analytics

Jeon, Hyeon, Park, Jeongin, Shin, Sungbok, Seo, Jinwook

arXiv.org Artificial Intelligence

Misuses of t-SNE and UMAP in visual analytics have become increasingly common. For example, although t-SNE and UMAP projections often do not faithfully reflect the original distances between clusters, practitioners frequently use them to investigate inter-cluster relationships. We investigate why this misuse occurs, and discuss methods to prevent it. To that end, we first review 136 papers to verify the prevalence of the misuse. We then interview researchers who have used dimensionality reduction (DR) to understand why such misuse occurs. Finally, we interview DR experts to examine why previous efforts failed to address the misuse. We find that the misuse of t-SNE and UMAP stems primarily from limited DR literacy among practitioners, and that existing attempts to address this issue have been ineffective. Based on these insights, we discuss potential paths forward, including the controversial but pragmatic option of automating the selection of optimal DR projections to prevent misleading analyses.


Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

Chowdhury, Rohit, Bala, Aniruddha, Jaiswal, Rohan, Roheda, Siddharth

arXiv.org Artificial Intelligence

The rapid progress of image-to-video (I2V) generation models has introduced significant risks, enabling video synthesis from static images and facilitating deceptive or malicious content creation. While prior defenses such as I2VGuard attempt to immunize images, effective and principled protection to block motion remains underexplored. In this work, we introduce Vid-Freeze - a novel attention-suppressing adversarial attack that adds carefully crafted adversarial perturbations to images. Our method explicitly targets the attention mechanism of I2V models, completely disrupting motion synthesis while preserving semantic fidelity of the input image. The resulting immunized images generate stand-still or near-static videos, effectively blocking malicious content creation. Our experiments demonstrate the impressive protection provided by the proposed approach, highlighting the importance of attention attacks as a promising direction for robust and proactive defenses against misuse of I2V generation models.


Generative Propaganda

Daepp, Madeleine I. G., Cuevas, Alejandro, Ness, Robert Osazuwa, Wang, Vickie Yu-Ping, Nayak, Bharat Kumar, Mishra, Dibyendu, Cheng, Ti-Chung, Desai, Shaily, Pal, Joyojeet

arXiv.org Artificial Intelligence

Generative propaganda is the use of generative artificial intelligence (AI) to shape public opinion. To characterize its use in real-world settings, we conducted interviews with defenders (e.g., factcheckers, journalists, officials) in Taiwan and creators (e.g., influencers, political consultants, advertisers) as well as defenders in India, centering two places characterized by high levels of online propaganda. The term "deepfakes", we find, exerts outsized discursive power in shaping defenders' expectations of misuse and, in turn, the interventions that are prioritized. To better characterize the space of generative propaganda, we develop a taxonomy that distinguishes between obvious versus hidden and promotional versus derogatory use. Deception was neither the main driver nor the main impact vector of AI's use; instead, Indian creators sought to persuade rather than to deceive, often making AI's use obvious in order to reduce legal and reputational risks, while Taiwan's defenders saw deception as a subset of broader efforts to distort the prevalence of strategic narratives online. AI was useful and used, however, in producing efficiency gains in communicating across languages and modes, and in evading human and algorithmic detection. Security researchers should reconsider threat models to clearly differentiate deepfakes from promotional and obvious uses, to complement and bolster the social factors that constrain misuse by internal actors, and to counter efficiency gains globally.


Chatbot site depicting child sexual abuse images raises fears over misuse of AI

The Guardian

The IWF said it had been alerted to a chatbot site that offered scenarios including'child prostitute in a hotel' and'child and teacher alone after class'. The IWF said it had been alerted to a chatbot site that offered scenarios including'child prostitute in a hotel' and'child and teacher alone after class'. A chatbot site offering explicit scenarios with preteen characters, illustrated by illegal abuse images has raised fresh fears about the misuse of artificial intelligence. A report by a child safety watchdog has triggered calls for the UK government to impose safety guidelines on AI companies, amid a surge in child sexual abuse material (CSAM) created by the technology. The Internet Watch Foundation said it had been alerted to a chatbot site that offered a number of scenarios including "child prostitute in a hotel", "sex with your child while your wife is on holiday" and "child and teacher alone after class".


When Curiosity Signals Danger: Predicting Health Crises Through Online Medication Inquiries

Goncharok, Dvora, Shifman, Arbel, Apartsin, Alexander, Aperstein, Yehudit

arXiv.org Artificial Intelligence

Online medical forums are a rich and underutilized source of insight into patient concerns, especially regarding medication use. Some of the many questions users pose may signal confusion, misuse, or even the early warning signs of a developing health crisis. Detecting these critical questions that may precede severe adverse events or life-threatening complications is vital for timely intervention and improving patient safety. This study introduces a novel annotated dataset of medication-related questions extracted from online forums. Each entry is manually labelled for criticality based on clinical risk factors. We benchmark the performance of six traditional machine learning classifiers using TF-IDF textual representations, alongside three state-of-the-art large language model (LLM)-based classification approaches that leverage deep contextual understanding. Our results highlight the potential of classical and modern methods to support real-time triage and alert systems in digital health spaces. The curated dataset is made publicly available to encourage further research at the intersection of patient-generated data, natural language processing, and early warning systems for critical health events. The dataset and benchmark are available at: https://github.com/Dvora-coder/LLM-Medication-QA-Risk-Classifier-MediGuard.


AI-Powered Detection of Inappropriate Language in Medical School Curricula

Salavati, Chiman, Song, Shannon, Hale, Scott A., Montenegro, Roberto E., Dori-Hacohen, Shiri, Murai, Fabricio

arXiv.org Artificial Intelligence

The use of inappropriate language--such as outdated, exclu-sionary, or non-patient-centered terms--in medical instructional materials can significantly influence clinical training, patient interactions, and health outcomes. Despite their reputability, many materials developed over past decades contain examples now considered inappropriate by current medical standards. Given the volume of curricular content, manually identifying instances of inappropriate use of language (IUL) and its subcategories for systematic review is prohibitively costly and impractical. To address this challenge, we conduct a first-in-class evaluation of small language models (SLMs) fine-tuned on labeled data and pre-trained LLMs with in-context learning on a dataset containing approximately 500 documents and over 12,000 pages. For SLMs, we consider: (1) a general IUL classifier, (2) subcategory-specific binary classifiers, (3) a multilabel classifier, and (4) a two-stage hierarchical pipeline for general IUL detection followed by mul-tilabel classification. For LLMs, we consider variations of prompts that include subcategory definitions and/or shots. We found that both LLama-3 8B and 70B, even with carefully curated shots, are largely outperformed by SLMs. While the multilabel classifier performs best on annotated data, supplementing training with unflagged excerpts as negative examples boosts the specific classifiers' AUC by up to 25%, making them most effective models for mitigating harmful language in medical curricula.


AI Security Map: Holistic Organization of AI Security Technologies and Impacts on Stakeholders

Kato, Hiroya, Kita, Kentaro, Hasegawa, Kento, Hidano, Seira

arXiv.org Artificial Intelligence

As the social implementation of AI has been steadily progressing, research and development related to AI security has also been increasing. However, existing studies have been limited to organizing related techniques, attacks, defenses, and risks in terms of specific domains or AI elements. Thus, it extremely difficult to understand the relationships among them and how negative impacts on stakeholders are brought about. In this paper, we argue that the knowledge, technologies, and social impacts related to AI security should be holistically organized to help understand relationships among them. To this end, we first develop an AI security map that holistically organizes interrelationships among elements related to AI security as well as negative impacts on information systems and stakeholders. This map consists of the two aspects, namely the information system aspect (ISA) and the external influence aspect (EIA). The elements that AI should fulfill within information systems are classified under the ISA. The EIA includes elements that affect stakeholders as a result of AI being attacked or misused. For each element, corresponding negative impacts are identified. By referring to the AI security map, one can understand the potential negative impacts, along with their causes and countermeasures. Additionally, our map helps clarify how the negative impacts on AI-based systems relate to those on stakeholders. We show some findings newly obtained by referring to our map. We also provide several recommendations and open problems to guide future AI security communities.


Group of high-profile authors sue Microsoft over use of their books in AI training

The Guardian

Kai Bird, Jia Tolentino, Daniel Okrent and several others alleged that Microsoft used pirated digital versions of their books to teach its Megatron AI to respond to human prompts. The authors requested a court order blocking Microsoft's infringement and statutory damages of up to 150,000 for each work that Microsoft allegedly misused. Generative artificial intelligence products like Megatron produce text, music, images and videos in response to users' prompts. To create these models, software engineers amass enormous databases of media to program the AI to produce similar output. The writers alleged in the complaint that Microsoft used a collection of nearly 200,000 pirated books to train Megatron, an AI product that gives text responses to user prompts.