 Law


Crypto and big tech's backing pays off as Trump makes tech-friendly moves

The Guardian

The millions that US tech companies invested in currying favor with Donald Trump seemed to pay off this week as the new administration issued a flurry of directives that relaxed regulations and dropped lawsuits previously aimed at holding the industry to account. Crypto, AI and social media companies, many of which made donations to Trump, are all expecting to benefit. At the center of the administration's moves is Elon Musk, the world's richest man. Over the past week, federal agencies under the president's authority dropped legal fights against his rocket company and the US's biggest cryptocurrency exchange. The White House also issued a "deregulatory initiative" aimed at loosening tech-sector regulation by empowering Musk's Doge.


9th Circuit clears Grindr, dating app for gay men, in child sex trafficking case

Los Angeles Times

Grindr, the dating app that caters to gay men, cannot be held responsible for the rape of a 15-year-old boy whom the company matched with sexual predators, the U.S. 9th Circuit Court of Appeals ruled this week. It is the latest teens-versus-tech spat in a fight over internet immunity that experts say could soon come before the U.S. Supreme Court. The appellate court's ruling upheld a 2023 decision by U.S. District Judge Otis D. Wright II of the Central District of California, who dismissed the suit, saying Grindr was shielded by broad immunity protections passed almost a decade before the plaintiff was born. In a series of events Wright called "alarming and tragic," a closeted Nova Scotia teen downloaded the LGBTQ hookup app in an attempt to meet other gay kids in his rural Canadian town. Instead, over the course of four days, he was assaulted by four adult men, including a man who picked him up after the teen sent him pictures from his high school cafeteria.


Interrogating LLM design under a fair learning doctrine

arXiv.org Artificial Intelligence

The current discourse on large language models (LLMs) and copyright largely takes a "behavioral" perspective, focusing on model outputs and evaluating whether they are substantially similar to training data. However, substantial similarity is difficult to define algorithmically and a narrow focus on model outputs is insufficient to address all copyright risks. In this interdisciplinary work, we take a complementary "structural" perspective and shift our focus to how LLMs are trained. We operationalize a notion of "fair learning" by measuring whether any training decision substantially affected the model's memorization. As a case study, we deconstruct Pythia, an open-source LLM, and demonstrate the use of causal and correlational analyses to make factual determinations about Pythia's training decisions. By proposing a legal standard for fair learning and connecting memorization analyses to this standard, we identify how judges may advance the goals of copyright law through adjudication. Finally, we discuss how a fair learning standard might evolve to enhance its clarity by becoming more rule-like and incorporating external technical guidelines.


Machine Learning-Based Cloud Computing Compliance Process Automation

arXiv.org Artificial Intelligence

Cloud computing adoption across industries has revolutionized enterprise operations while introducing significant challenges in compliance management. Organizations must continuously meet evolving regulatory requirements such as GDPR and ISO 27001, yet traditional manual review processes have become increasingly inadequate for modern business scales. This paper presents a novel machine learning-based framework for automating cloud computing compliance processes, addressing critical challenges including resource-intensive manual reviews, extended compliance cycles, and delayed risk identification. Our proposed framework integrates multiple machine learning technologies, including BERT-based document processing (94.5% accuracy), One-Class SVM for anomaly detection (88.7% accuracy), and an improved CNN-LSTM architecture for sequential compliance data analysis (90.2% accuracy). Implementation results demonstrate significant improvements: reducing compliance process duration from 7 days to 1.5 days, improving accuracy from 78% to 93%, and decreasing manual effort by 73.3%. A real-world deployment at a major securities firm validated these results, processing 800,000 daily transactions with 94.2% accuracy in risk identification.
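The headline improvements quoted above can be put on a common footing with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and not taken from the paper, and the inputs are just the before/after figures quoted in the abstract.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional reduction when a quantity drops from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def percentage_point_gain(before_pct: float, after_pct: float) -> float:
    """Absolute gain, in percentage points, between two percentages."""
    return after_pct - before_pct


# Figures quoted in the abstract: 7 days -> 1.5 days, 78% -> 93% accuracy.
duration_cut = relative_reduction(7.0, 1.5)
accuracy_gain = percentage_point_gain(78.0, 93.0)

print(f"Compliance cycle shortened by {duration_cut:.1%}")
print(f"Accuracy up by {accuracy_gain:.0f} percentage points")
```

Under these figures the cycle time falls by roughly 79 percent and accuracy rises by 15 percentage points. The 73.3% reduction in manual effort is reported directly in the abstract and is not derived here.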


ADAPT Centre Contribution on Implementation of the EU AI Act and Fundamental Right Protection

arXiv.org Artificial Intelligence

The EU AI Act introduces a blanket protection of fundamental rights for specific applications of AI that it classifies as high-risk, which is implemented under the existing single market harmonised product certification mechanisms for health and safety protection, i.e. the New Legislative Framework. This protection of fundamental rights places many AI issues previously covered by voluntary trustworthy or ethical AI frameworks into a framework with independent and legally binding accountability for harmful characteristics of products grounded in the same human rights framework underpinning Union Law and many national laws. However, this major change in accountability also introduces many legal uncertainties on how AI providers and deployers can identify and manage risks to fundamental rights. Contrast this to the introduction of GDPR, which focussed on the protection of rights of privacy and data protection but benefitted from the development and employment of data protection principles under the data protection directive which had been in force beforehand. The protection of fundamental rights in AI systems however benefits from no such breakdown of principle, nor from prior deployment or compliance experience with such principles. This presents an extremely high level of legal uncertainty for providers and deployers of AI systems once the Act comes into force. The associated burden or chilling effects may fall disproportionately on public bodies wishing to deploy and reap the benefits of AI in high risk areas, and indigenous companies and especially SMEs that wish to market products into such applications.


A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models

arXiv.org Artificial Intelligence

This study investigates machine unlearning techniques within the context of large language models (LLMs), referred to as LLM unlearning. LLM unlearning offers a principled approach to removing the influence of undesirable data (e.g., sensitive or illegal information) from LLMs, while preserving their overall utility without requiring full retraining. Despite growing research interest, there is no comprehensive survey that systematically organizes existing work and distills key insights; here, we aim to bridge this gap. We begin by introducing the definition and the paradigms of LLM unlearning, followed by a comprehensive taxonomy of existing unlearning studies. Next, we categorize current unlearning approaches, summarizing their strengths and limitations. Additionally, we review evaluation metrics and benchmarks, providing a structured overview of current assessment methodologies. Finally, we outline promising directions for future research, highlighting key challenges and opportunities in the field.


A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety

arXiv.org Artificial Intelligence

This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For the definitive version, see 10.1109/ACCESS.2025.3539933. Disclaimer: This research involves topics that may include disturbing results. Any explicit content has been redacted, and potentially disturbing results have been presented in a neutral and anonymized manner to minimize emotional distress to the readers. Abstract: Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in T2I research, detailing their collection methods, compositions, the semantic and syntactic diversity of prompts, and the quality, coverage, and distribution of harm types in the datasets. By highlighting the strengths and limitations of the datasets, this study enables researchers to find the most ...


Analyzing User Perceptions of Large Language Models (LLMs) on Reddit: Sentiment and Topic Modeling of ChatGPT and DeepSeek Discussions

arXiv.org Artificial Intelligence

Although discourse on large language models (LLMs) such as ChatGPT and DeepSeek is growing, there is no comprehensive understanding of how users of online platforms such as Reddit perceive these models. This is an important omission because public opinion can influence AI development, trust, and future policy. This study analyzes Reddit discussions about ChatGPT and DeepSeek using sentiment analysis and topic modeling to advance the understanding of user attitudes. It explores significant topics including trust in AI, user expectations, potential uses of the tools, reservations about AI biases, and the ethical implications of their use. By examining these concerns, the study provides a sense of how public sentiment might shape the direction of AI development going forward. The report also addresses whether users have faith in the technology and what they see as its future. A word frequency approach is used to identify broad topics and sentiment trends. In addition, topic modeling with Latent Dirichlet Allocation (LDA) identifies the top topics in users' language, for example, the potential benefits of LLMs, their technological applications, and their overall social ramifications. The study aims to inform developers and policymakers by making it easier to see how users comprehend and experience these game-changing technologies.
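The word frequency step mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the sample comments are invented, the stopword list is a tiny hypothetical one, and the function names are not from the study. The LDA stage is not reproduced.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list for the toy example.
STOPWORDS = {"i", "a", "is", "in", "for", "but", "about", "the", "and", "of", "to"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_terms(comments: list[str], k: int = 3) -> list[tuple[str, int]]:
    """Return the k most frequent non-stopword terms across all comments."""
    counts: Counter[str] = Counter()
    for comment in comments:
        counts.update(tokenize(comment))
    return counts.most_common(k)


# Invented sample comments, not data from the study.
comments = [
    "I trust ChatGPT for coding help",
    "DeepSeek is fast but I worry about bias",
    "Bias in ChatGPT answers is a real concern",
]

print(top_terms(comments, k=2))
```

On this toy input the two most frequent terms are "chatgpt" and "bias", each appearing twice. A real pipeline would need a fuller stopword list and some form of normalization before the counts are used to suggest topics.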


Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

arXiv.org Artificial Intelligence

Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks involved. The moment users share information with these agents (e.g., LLMs), their private information becomes vulnerable to exposure. In this paper, we characterize the notion of contextual privacy for user interactions with LLMs. It aims to minimize privacy risks by ensuring that users (senders) disclose only information that is both relevant and necessary for achieving their intended goals when interacting with LLMs (untrusted receivers). Through a formative design user study, we observe how even "privacy-conscious" users inadvertently reveal sensitive information through indirect disclosures. Based on insights from this study, we propose a locally-deployable framework that operates between users and LLMs, and identifies and reformulates out-of-context information in user prompts. Our evaluation using examples from ShareGPT shows that lightweight models can effectively implement this framework, achieving strong gains in contextual privacy while preserving the user's intended interaction goals through different approaches to classify information relevant to the intended goals.


Verifying Classification with Limited Disclosure

arXiv.org Artificial Intelligence

We consider the multi-party classification problem introduced by Dong, Hartline, and Vijayaraghavan (2022) motivated by electronic discovery. In this problem, our goal is to design a protocol that guarantees the requesting party receives nearly all responsive documents while minimizing the disclosure of nonresponsive documents. We develop verification protocols that certify the correctness of a classifier by disclosing a few nonresponsive documents. We introduce a combinatorial notion called the Leave-One-Out dimension of a family of classifiers and show that the number of nonresponsive documents disclosed by our protocol is at most this dimension in the realizable setting, where a perfect classifier exists in this family. For linear classifiers with a margin, we characterize the trade-off between the margin and the number of nonresponsive documents that must be disclosed for verification. Specifically, we establish a trichotomy in this requirement: for $d$ dimensional instances, when the margin exceeds $1/3$, verification can be achieved by revealing only $O(1)$ nonresponsive documents; when the margin is exactly $1/3$, in the worst case, at least $\Omega(d)$ nonresponsive documents must be disclosed; when the margin is smaller than $1/3$, verification requires $\Omega(e^d)$ nonresponsive documents. We believe this result is of independent interest with applications to coding theory and combinatorial geometry. We further extend our protocols to the nonrealizable setting defining an analogous combinatorial quantity robust Leave-One-Out dimension, and to scenarios where the protocol is tolerant to misclassification errors by Alice.
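The margin trichotomy stated in the abstract can be laid out compactly as follows. The notation is an assumption introduced here for readability: $\gamma$ denotes the margin, $d$ the dimension, and $N(\gamma, d)$ the number of nonresponsive documents disclosed for verification. The first line is an upper bound (what suffices), while the other two are lower bounds (what is required).

```latex
% Requires amsmath for \text and \tfrac.
\[
\begin{array}{ll}
\gamma > \tfrac{1}{3}: & N(\gamma, d) = O(1) \\[4pt]
\gamma = \tfrac{1}{3}: & N(\gamma, d) = \Omega(d) \quad \text{(worst case)} \\[4pt]
\gamma < \tfrac{1}{3}: & N(\gamma, d) = \Omega(e^{d})
\end{array}
\]
```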