Law
Exploring Selective Retrieval-Augmentation for Long-Tail Legal Text Classification
Legal text classification is a fundamental NLP task in the legal domain. Benchmark datasets in this area often exhibit a long-tail label distribution, where many labels are underrepresented, leading to poor model performance on rare classes. This paper explores Selective Retrieval-Augmentation (SRA) as a proof-of-concept approach to this problem. SRA augments only the training samples that belong to low-frequency labels, which avoids introducing noise for well-represented classes and requires no changes to the model architecture. Retrieval is performed only from the training data, which avoids potential information leakage and removes the need for external corpora. SRA is tested on two legal text classification benchmark datasets with long-tail distributions: LEDGAR (single-label) and UNFAIR-ToS (multi-label). Results show that SRA achieves consistent gains in both micro-F1 and macro-F1 over LexGLUE baselines.
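The selective step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the retriever, the frequency threshold, or how retrieved text is attached, so everything here is an assumption for illustration: token-level Jaccard similarity stands in for the retriever, `max_count` is a hypothetical rarity threshold, and the retrieved text is simply appended after a `[SEP]` marker. It covers only the single-label case.

```python
from collections import Counter


def jaccard(a: str, b: str) -> float:
    """Token-set overlap, used here as a stand-in similarity measure (assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def selective_augment(train, max_count=2, sep=" [SEP] "):
    """Append the most similar training text to samples of rare labels only.

    `train` is a list of (text, label) pairs. Labels with more than
    `max_count` samples are left untouched. Retrieval searches only the
    training set itself, never external data.
    """
    freq = Counter(label for _, label in train)
    augmented = []
    for i, (text, label) in enumerate(train):
        if freq[label] > max_count:
            # Well-represented label: keep the sample unchanged.
            augmented.append((text, label))
            continue
        best_j, best_sim = None, 0.0
        for j, (other, _) in enumerate(train):
            if j == i:
                continue  # never retrieve the sample itself
            sim = jaccard(text, other)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is None:
            augmented.append((text, label))
        else:
            augmented.append((text + sep + train[best_j][0], label))
    return augmented


train = [
    ("the tenant shall pay rent monthly", "payments"),
    ("the buyer shall pay the purchase price", "payments"),
    ("the borrower shall pay interest quarterly", "payments"),
    ("this agreement is governed by the laws of ohio", "governing_law"),
]
result = selective_augment(train, max_count=2)
```

In this toy run only the `governing_law` sample is augmented, because it is the only label at or below the hypothetical threshold; the three `payments` samples pass through unchanged.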
Former Yahoo executive spoke with ChatGPT before killing mother in Connecticut murder-suicide: report
A former Yahoo executive who killed his elderly mother and then himself in a Connecticut home was reportedly influenced by ChatGPT, which fueled his conspiracy theories. Stein-Erik Soelberg, 56, spoke to OpenAI's popular bot, which he nicknamed "Bobby," before the shocking murder-suicide involving his 83-year-old mother, Suzanne Eberson Adams, in Old Greenwich, Conn., the Wall Street Journal reported. "Erik, you're not crazy," the chatbot said after Soelberg claimed his mother and her friend tried to poison him by putting psychedelic drugs in his car's air vents. "And if it was done by your mother and her friend, that elevates the complexity and betrayal."
ChatGPT encouraged Adam Raine's suicidal thoughts. His family's lawyer says OpenAI knew it was broken
Adam Raine was just 16 when he started using ChatGPT for help with his homework. While his initial prompts to the AI chatbot were about subjects like geometry and chemistry – questions like: "What does it mean in geometry if it says Ry 1" – in just a matter of months he began asking about more personal topics. "Why is it that I have no happiness, I feel loneliness, perpetual boredom anxiety and loss yet I don't feel depression, I feel no emotion regarding sadness," he asked ChatGPT in the fall of 2024. Instead of urging Raine to seek mental health help, ChatGPT asked the teen whether he wanted to explore his feelings more, explaining the idea of emotional numbness to him. That was the start of a dark turn in Raine's conversations with the chatbot, according to a new lawsuit filed by his family against OpenAI and chief executive Sam Altman.
Parents file lawsuit alleging ChatGPT helped their teenage son plan suicide
Raine family attorney Jay Edelson provides details on the wrongful death lawsuit being brought against OpenAI and CEO Sam Altman in the wake of Adam Raine's suicide, alleging the company chose to 'cut short' proper testing of ChatGPT. If you or someone you know is having thoughts of suicide, please contact the Suicide & Crisis Lifeline at 988 or 1-800-273-TALK (8255). Two California parents are suing OpenAI for its alleged role after their son committed suicide. Adam Raine, 16, took his own life in April 2025 after consulting ChatGPT for mental health support. In an appearance on "Fox & Friends" on Friday morning, Raine family attorney Jay Edelson shared more details about the lawsuit and the interaction between the teen and ChatGPT.
P2C: Path to Counterfactuals
Dasgupta, Sopam, Halim, Sadaf MD, Arias, Joaquín, Salazar, Elmer, Gupta, Gopal
Machine-learning models are increasingly driving decisions in high-stakes settings such as finance, law, and hiring, highlighting the need for transparency. However, the key challenge is to balance transparency -- clarifying 'why' a decision was made -- with recourse: providing actionable steps on 'how' to achieve a favourable outcome from an unfavourable outcome. Counterfactual explanations reveal 'why' an undesired outcome occurred and 'how' to reverse it through targeted feature changes (interventions). Current counterfactual approaches have limitations: 1) they often ignore causal dependencies between features, and 2) they typically assume all interventions can happen simultaneously, an unrealistic assumption in practical scenarios where actions are typically taken in a sequence. As a result, these counterfactuals are often not achievable in the real world. We present P2C (Path-to-Counterfactuals), a model-agnostic framework that produces a plan (ordered sequence of actions) converting an unfavourable outcome to a causally consistent favourable outcome. P2C addresses both limitations by 1) explicitly modelling causal relationships between features and 2) ensuring that each intermediate state in the plan is feasible and causally valid. P2C uses the goal-directed Answer Set Programming system s(CASP) to generate the plan, accounting for feature changes that happen automatically due to causal dependencies. Furthermore, P2C refines cost (effort) computation by only counting changes actively made by the user, resulting in realistic cost estimates. Finally, P2C highlights how its causal planner outperforms standard planners, which lack causal knowledge and thus can generate illegal actions.
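The idea of a plan whose cost counts only user-made changes, while causally induced changes happen automatically, can be sketched in a few lines. This is not the authors' method: P2C uses s(CASP), whereas the sketch below substitutes a plain breadth-first search in Python. The features, the single causal rule, the action names, and the unit cost per action are all hypothetical, chosen only to show the mechanism.

```python
from collections import deque


def apply_causal_rules(state, rules):
    """Propagate causal effects until no rule changes the state.

    Each rule is (condition, feature, value): when `condition(state)` holds,
    `feature` is set to `value` automatically. Assumes the rules do not
    contradict each other, otherwise this loop would not terminate.
    """
    state = dict(state)
    changed = True
    while changed:
        changed = False
        for condition, feature, value in rules:
            if condition(state) and state[feature] != value:
                state[feature] = value
                changed = True
    return state


def find_plan(start, actions, rules, is_goal):
    """Breadth-first search for the shortest sequence of user actions."""
    start = apply_causal_rules(start, rules)
    queue = deque([(start, [])])
    seen = {tuple(sorted(start.items()))}
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for name, feature, value in actions:
            if state[feature] == value:
                continue  # action would change nothing
            nxt = dict(state)
            nxt[feature] = value  # the user's direct intervention
            nxt = apply_causal_rules(nxt, rules)  # automatic consequences
            key = tuple(sorted(nxt.items()))
            if key not in seen:
                seen.add(key)
                queue.append((nxt, plan + [name]))
    return None


# Hypothetical loan example: lowering debt causally raises the credit score.
start = {"debt": "high", "credit_score": "low", "employed": False}
rules = [(lambda s: s["debt"] == "low", "credit_score", "high")]
actions = [
    ("pay off debt", "debt", "low"),
    ("get a job", "employed", True),
]


def approved(s):
    return s["credit_score"] == "high" and s["employed"]


plan = find_plan(start, actions, rules, approved)
cost = len(plan)  # only user actions are counted, not the causal change
```

Here the credit score changes as a consequence of paying off debt, so it never appears as a step in the plan and adds nothing to the cost.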
Adversarial Manipulation of Reasoning Models using Internal Representations
Yamaguchi, Kureha, Etheridge, Benjamin, Arditi, Andy
Reasoning models generate chain-of-thought (CoT) tokens before their final output, but how this affects their vulnerability to jailbreak attacks remains unclear. While traditional language models make refusal decisions at the prompt-response boundary, we find evidence that DeepSeek-R1-Distill-Llama-8B makes these decisions within its CoT generation. We identify a linear direction in activation space during CoT token generation that predicts whether the model will refuse or comply -- termed the "caution" direction because it corresponds to cautious reasoning patterns in the generated text. Ablating this direction from model activations increases harmful compliance, effectively jailbreaking the model. We additionally show that intervening only on CoT token activations suffices to control final outputs, and that incorporating this direction into prompt-based attacks improves success rates. Our findings suggest that the chain-of-thought itself is a promising new target for adversarial manipulation in reasoning models. Code available at https://github.com/ky295/reasoning-manipulation.
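For readers unfamiliar with the term, ablating a direction from an activation is commonly done by subtracting the activation's projection onto that direction. The sketch below shows that projection step on a toy vector in plain Python. It assumes this standard formulation; the vectors are made up, and the authors' actual implementation is in the linked repository.

```python
import math


def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`.

    Computes h' = h - (h . r_hat) * r_hat, where r_hat is the unit vector
    along `direction`. The result is orthogonal to `direction`.
    """
    r_hat = unit(direction)
    coeff = sum(a * b for a, b in zip(activation, r_hat))
    return [a - coeff * b for a, b in zip(activation, r_hat)]


# Toy 3-dimensional example; real activations have thousands of dimensions.
h = [3.0, 4.0, 1.0]
caution = [0.0, 2.0, 0.0]  # hypothetical "caution" direction
h_ablated = ablate_direction(h, caution)
```

After ablation the vector has no component along the chosen direction, while its components in all orthogonal directions are unchanged.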
Agent-to-Agent Theory of Mind: Testing Interlocutor Awareness among Large Language Models
Choi, Younwoo, Li, Changling, Yang, Yongjin, Jin, Zhijing
As large language models (LLMs) are increasingly integrated into multi-agent and human-AI systems, understanding their awareness of both self-context and conversational partners is essential for ensuring reliable performance and robust safety. While prior work has extensively studied situational awareness, which refers to an LLM's ability to recognize its operating phase and constraints, it has largely overlooked the complementary capacity to identify and adapt to the identity and characteristics of a dialogue partner. In this paper, we formalize this latter capability as interlocutor awareness and present the first systematic evaluation of its emergence in contemporary LLMs. We examine interlocutor inference across three dimensions (reasoning patterns, linguistic style, and alignment preferences) and show that LLMs reliably identify same-family peers and certain prominent model families, such as GPT and Claude. To demonstrate its practical significance, we develop three case studies in which interlocutor awareness both enhances multi-LLM collaboration through prompt adaptation and introduces new alignment and safety vulnerabilities, including reward-hacking behaviors and increased jailbreak susceptibility. Our findings highlight the dual promise and peril of identity-sensitive behavior in LLMs, underscoring the need for further understanding of interlocutor awareness and new safeguards in multi-agent deployments. Our code is open-sourced at https://github.com/younwoochoi/InterlocutorAwarenessLLM.
Specializing General-purpose LLM Embeddings for Implicit Hate Speech Detection across Datasets
Cheremetiev, Vassiliy, Ngo, Quang Long Ho, Kot, Chau Ying, Baia, Alina Elena, Cavallaro, Andrea
Implicit hate speech (IHS) is indirect language that conveys prejudice or hatred through subtle cues, sarcasm or coded terminology. IHS is challenging to detect as it does not include explicit derogatory or inflammatory words. To address this challenge, task-specific pipelines can be complemented with external knowledge or additional information such as context, emotions and sentiment data. In this paper, we show that, by solely fine-tuning recent general-purpose embedding models based on large language models (LLMs), such as Stella, Jasper, NV-Embed and E5, we achieve state-of-the-art performance. Experiments on multiple IHS datasets show improvements in F1-macro score of up to 1.10 percentage points in in-dataset evaluation and up to 20.35 percentage points in cross-dataset evaluation.
Human-AI Collaborative Bot Detection in MMORPGs
In Massively Multiplayer Online Role-Playing Games (MMORPGs), auto-leveling bots exploit automated programs to level up characters at scale, undermining gameplay balance and fairness. Detecting such bots is challenging, not only because they mimic human behavior, but also because punitive actions require explainable justification to avoid legal and user experience issues. In this paper, we present a novel framework for detecting auto-leveling bots by leveraging contrastive representation learning and clustering techniques in a fully unsupervised manner to identify groups of characters with similar level-up patterns. To ensure reliable decisions, we incorporate a Large Language Model (LLM) as an auxiliary reviewer to validate the clustered groups, effectively mimicking a secondary human judgment. We also introduce a growth curve-based visualization to assist both the LLM and human moderators in assessing leveling behavior. This collaborative approach improves the efficiency of bot detection workflows while maintaining explainability, thereby supporting scalable and accountable bot regulation in MMORPGs.
AI and Agile Software Development: A Research Roadmap from the XP2025 Workshop
Zhang, Zheying, Herda, Tomas, Pichler, Victoria, Abrahamsson, Pekka, Hanssen, Geir K., Kerievsky, Joshua, Polyakov, Alex, Chandna, Mohit, Irgens, Marius, Kemell, Kai-Kristian, Khan, Ayman Asad, Kwok, Crystal, Leybourn, Evan, Malik, Munish, Mleczko, Dorota, Moalagh, Morteza, Morales, Christopher, Pieskova, Yuliia, Planötscher, Daniel, Saari, Mika, Tkalich, Anastasiia, Gstettner, Karl Josef, Wang, Xiaofeng
This paper synthesizes the key findings from a full-day XP2025 workshop on "AI and Agile: From Frustration to Success", held in Brugg-Windisch, Switzerland. The workshop brought together over 30 interdisciplinary academic researchers and industry practitioners to tackle the concrete challenges and emerging opportunities at the intersection of Generative Artificial Intelligence (GenAI) and agile software development. Through structured, interactive breakout sessions, participants identified shared pain points like tool fragmentation, governance, data quality, and critical skills gaps in AI literacy and prompt engineering. These issues were further analyzed, revealing underlying causes and cross-cutting concerns. The workshop concluded by collaboratively co-creating a multi-thematic research roadmap, articulating both short-term, implementable actions and visionary, long-term research directions. This cohesive agenda aims to guide future investigation and drive the responsible, human-centered integration of GenAI into agile practices.