Goto

Collaborating Authors

 Law


What Should LLMs Forget? Quantifying Personal Data in LLMs for Right-to-Be-Forgotten Requests

arXiv.org Artificial Intelligence

Large Language Models (LLMs) can memorize and reveal personal information, raising concerns regarding compliance with the EU's GDPR, particularly the Right to Be Forgotten (RTBF). Existing machine unlearning methods assume the data to forget is already known but do not address how to identify which individual-fact associations are stored in the model. Privacy auditing techniques typically operate at the population level or target a small set of identifiers, limiting applicability to individual-level data inquiries. We introduce WikiMem, a dataset of over 5,000 natural language canaries covering 243 human-related properties from Wikidata, and a model-agnostic metric to quantify human-fact associations in LLMs. Our approach ranks ground-truth values against counterfactuals using calibrated negative log-likelihood across paraphrased prompts. We evaluate 200 individuals across 15 LLMs (410M-70B parameters), showing that memorization correlates with subject web presence and model scale. We provide a foundation for identifying memorized personal data in LLMs at the individual level, enabling the dynamic construction of forget sets for machine unlearning and RTBF requests.


The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs

arXiv.org Artificial Intelligence

Diffusion-based large language models (dLLMs) have recently emerged as a powerful alternative to autoregressive LLMs, offering faster inference and greater interactivity via parallel decoding and bidirectional modeling. However, despite strong performance in code generation and text infilling, we identify a fundamental safety concern: existing alignment mechanisms fail to safeguard dLLMs against context-aware, masked-input adversarial prompts, exposing novel vulnerabilities. To this end, we present DIJA, the first systematic study and jailbreak attack framework that exploits unique safety weaknesses of dLLMs. Specifically, our proposed DIJA constructs adversarial interleaved mask-text prompts that exploit the text generation mechanisms of dLLMs, i.e., bidirectional modeling and parallel decoding. Bidirectional modeling drives the model to produce contextually consistent outputs for masked spans, even when harmful, while parallel decoding limits model dynamic filtering and rejection sampling of unsafe content. This causes standard alignment mechanisms to fail, enabling harmful completions in alignment-tuned dLLMs, even when harmful behaviors or unsafe instructions are directly exposed in the prompt. Through comprehensive experiments, we demonstrate that DIJA significantly outperforms existing jailbreak methods, exposing a previously overlooked threat surface in dLLM architectures. Notably, our method achieves up to 100% keyword-based ASR on Dream-Instruct, surpassing the strongest prior baseline, ReNeLLM, by up to 78.5% in evaluator-based ASR on JailbreakBench and by 37.7 points in StrongREJECT score, while requiring no rewriting or hiding of harmful content in the jailbreak prompt. Our findings underscore the urgent need for rethinking safety alignment in this emerging class of language models. Code is available at https://github.com/ZichenWen1/DIJA.


Mario at EXIST 2025: A Simple Gateway to Effective Multilingual Sexism Detection

arXiv.org Artificial Intelligence

This paper presents our approach to EXIST 2025 Task 1, addressing text-based sexism detection in English and Spanish tweets through hierarchical Low-Rank Adaptation (LoRA) of Llama 3.1 8B. Our method introduces conditional adapter routing that explicitly models label dependencies across three hierarchically structured subtasks: binary sexism identification, source intention detection, and multilabel sexism categorization. Unlike conventional LoRA applications that target only attention layers, we apply adaptation to all linear transformations, enhancing the model's capacity to capture task-specific patterns. In contrast to complex data processing and ensemble approaches, we show that straightforward parameter-efficient fine-tuning achieves strong performance. We train separate LoRA adapters (rank=16, QLoRA 4-bit) for each subtask using unified multilingual training that leverages Llama 3.1's native bilingual capabilities. The method requires minimal preprocessing and uses standard supervised learning. Our multilingual training strategy eliminates the need for separate language-specific models, achieving 1.7-2.4\% F1 improvements through cross-lingual transfer. With only 1.67\% trainable parameters compared to full fine-tuning, our approach reduces training time by 75\% and model storage by 98\%, while achieving competitive performance across all subtasks (ICM-Hard: 0.6774 for binary classification, 0.4991 for intention detection, 0.6519 for multilabel categorization).


How to Protect Models against Adversarial Unlearning?

arXiv.org Artificial Intelligence

AI models need to be unlearned to fulfill the requirements of legal acts such as the AI Act or GDPR, and also because of the need to remove toxic content, debiasing, the impact of malicious instances, or changes in the data distribution structure in which a model works. Unfortunately, removing knowledge may cause undesirable side effects, such as a deterioration in model performance. In this paper, we investigate the problem of adversarial unlearning, where a malicious party intentionally sends unlearn requests to deteriorate the model's performance maximally. We show that this phenomenon and the adversary's capabilities depend on many factors, primarily on the backbone model itself and strategy/limitations in selecting data to be unlearned. The main result of this work is a new method of protecting model performance from these side effects, both in the case of unlearned behavior resulting from spontaneous processes and adversary actions.


AF-XRAY: Visual Explanation and Resolution of Ambiguity in Legal Argumentation Frameworks

arXiv.org Artificial Intelligence

Argumentation frameworks (AFs) provide formal approaches for legal reasoning, but identifying sources of ambiguity and explaining argument acceptance remains challenging for non-experts. We present AF-XRAY, an open-source toolkit for exploring, analyzing, and visualizing abstract AFs in legal reasoning. AF-XRAY introduces: (i) layered visualizations based on game-theoretic argument length revealing well-founded derivation structures; (ii) classification of attack edges by semantic roles (primary, secondary, blunders); (iii) overlay visualizations of alternative 2-valued solutions on ambiguous 3-valued grounded semantics; and (iv) identification of critical attack sets whose suspension resolves undecided arguments. Through systematic generation of critical attack sets, AF-XRAY transforms ambiguous scenarios into grounded solutions, enabling users to pinpoint specific causes of ambiguity and explore alternative resolutions. We use real-world legal cases (e.g., Wild Animals as modeled by Bench-Capon) to show that our tool supports teleological legal reasoning by revealing how different assumptions lead to different justified conclusions.


The Shape of Deceit: Behavioral Consistency and Fragility in Money Laundering Patterns

arXiv.org Artificial Intelligence

Conventional anti-money laundering (AML) systems predominantly focus on identifying anomalous entities or transactions, flagging them for manual investigation based on statistical deviation or suspicious behavior. This paradigm, however, misconstrues the true nature of money laundering, which is rarely anomalous but often deliberate, repeated, and concealed within consistent behavioral routines. In this paper, we challenge the entity-centric approach and propose a network-theoretic perspective that emphasizes detecting predefined laundering patterns across directed transaction networks. We introduce the notion of behavioral consistency as the core trait of laundering activity, and argue that such patterns are better captured through subgraph structures expressing semantic and functional roles - not solely geometry. Crucially, we explore the concept of pattern fragility: the sensitivity of laundering patterns to small attribute changes and, conversely, their semantic robustness even under drastic topological transformations. We claim that laundering detection should not hinge on statistical outliers, but on preservation of behavioral essence, and propose a reconceptualization of pattern similarity grounded in this insight. This philosophical and practical shift has implications for how AML systems model, scan, and interpret networks in the fight against financial crime.


Transforming Sensitive Documents into Quantitative Data: An AI-Based Preprocessing Toolchain for Structured and Privacy-Conscious Analysis

arXiv.org Artificial Intelligence

Unstructured text from legal, medical, and administrative sources offers a rich but underutilized resource for research in public health and the social sciences. However, large-scale analysis is hampered by two key challenges: the presence of sensitive, personally identifiable information, and significant heterogeneity in structure and language. We present a modular toolchain that prepares such text data for embedding-based analysis, relying entirely on open-weight models that run on local hardware, requiring only a workstation-level GPU and supporting privacy-sensitive research. The toolchain employs large language model (LLM) prompting to standardize, summarize, and, when needed, translate texts to English for greater comparability. Anonymization is achieved via LLM-based redaction, supplemented with named entity recognition and rule-based methods to minimize the risk of disclosure. We demonstrate the toolchain on a corpus of 10,842 Swedish court decisions under the Care of Abusers Act (LVM), comprising over 56,000 pages. Each document is processed into an anonymized, standardized summary and transformed into a document-level embedding. Validation, including manual review, automated scanning, and predictive evaluation shows the toolchain effectively removes identifying information while retaining semantic content. As an illustrative application, we train a predictive model using embedding vectors derived from a small set of manually labeled summaries, demonstrating the toolchain's capacity for semi-automated content analysis at scale. By enabling structured, privacy-conscious analysis of sensitive documents, our toolchain opens new possibilities for large-scale research in domains where textual data was previously inaccessible due to privacy and heterogeneity constraints.


AI and disinformation fuel political rivalries in the Philippines

Al Jazeera

Manila, Philippines โ€“ When former Philippines President Rodrigo Duterte was arrested by the International Criminal Court (ICC) in March, Sheerah Escuerdo spoke to a local television station, welcoming the politician's detention on charges of murder linked to his war on drugs. Escuerdo, who lost her 18-year-old brother, Ephraim, to Duterte's war, clutched a portrait of her sibling during the interview with News 5 Everywhere as she demanded justice for his killing. Days later, she was shocked to find an AI-generated video of her slain brother circulating on Facebook, in which he said he was alive and accused his sister of lying. Are they paying you to do this?" the computer-generated image of Ephraim said. The video, posted online by a pro-Duterte influencer with 11,000 followers, immediately drew thousands of views on Facebook. One of the comments read, "Fake drug war victims". It was Escudero and her brother's image from her News 5 Everywhere interview that the influencer had used to ...


Cartel drones pose 'dangerous' drug trafficking risk in border state, official warns

FOX News

Arizona Attorney General Kris Mayes explains how drones are frequently used at the southern border to transport drugs, raising concerns from both sides of the aisle. As reported crossings have dropped dramatically at the border, there is still work to be done on matters of stopping drugs from making their way into the United States, especially in the border state of Arizona, a top state official says. One of the ways that cartels transport drugs is by using drones, a tactic that gained attention after bipartisan legislation signed in the Grand Canyon State gave law enforcement the power to shoot down the small aircraft. "I think what has changed is that we have gotten more control over people crossing over the border, but unfortunately what has not changed is we still have a huge amount of fentanyl that is coming across our border here in Arizona, and that is being flown over the by the Mexican drug cartels with drones," Democratic Arizona Attorney General Kris Mayes said. Fentanyl is being delivered across the border by cartels on drones.


AI chatbot 'MechaHitler' could be making content considered violent extremism, expert witness tells X v eSafety case

The Guardian

The chatbot embedded in Elon Musk's X that referred to itself as "MechaHitler" and made antisemitic comments last week could be considered terrorism or violent extremism content, an Australian tribunal has heard. But an expert witness for X has argued a large language model cannot be ascribed intent, only the user. The outburst came into focus at an administrative review tribunal hearing on Tuesday where X is challenging a notice issued by the eSafety commissioner, Julie Inman Grant, in March last year asking the platform to explain how it is taking action against terrorism and violent extremism (TVE) material. X's expert witness, RMIT economics professor Chris Berg, provided evidence to the case that it was an error to assume a large language model can produce such content, because it is the intent of the user prompting the large language model that is critical in defining what can be considered terrorism and violent extremism content. One of eSafety's expert witnesses, Queensland University of Technology law professor Nicolas Suzor, disagreed with Berg, stating it was "absolutely possible for chatbots, generative AI and other tools to have some role in producing so-called synthetic TVE".