AITopics | Law

Law

OpenAI completes conversion to for-profit business after lengthy legal saga

The Guardian, Oct-28-2025, 15:10:05 GMT

Sam Altman speaks in San Francisco on 2 June 2025. OpenAI said on Tuesday it had converted its main business into a for-profit corporation, the conclusion of a lengthy and fraught legal saga. A crucial regulator, Kathy Jennings, the Delaware attorney general, said she approved the plan for the startup, which began as a non-profit in 2015, to change to a public benefit corporation, a type of for-profit entity that expresses commitment to bettering society. The company also said it had reorganized its ownership structure and signed a new agreement with its longtime backer Microsoft that gives the software giant a roughly 27% stake in OpenAI's new for-profit corporation, but changes some of the details of their close partnership.

for-profit business, openai, openai complete conversion, (8 more...)

The Guardian

Country:

North America > United States > California > San Francisco County > San Francisco (0.46)
North America > United States > Delaware (0.25)
Europe > Ukraine (0.07)
Oceania > Australia (0.05)

Industry:

Law (1.00)
Leisure & Entertainment > Sports (0.72)
Government > Regional Government > North America Government > United States Government (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.98)

Google hit with lawsuit over AI 'hallucinations' linking conservative activist to child abuse claims

FOX News, Oct-28-2025, 12:10:16 GMT

google, robby starbuck, starbuck, (10 more...)

FOX News

Country:

North America > United States > New York (0.05)
North America > United States > Delaware (0.05)

Industry:

Media (1.00)
Leisure & Entertainment > Sports (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (0.75)
Information Technology > Artificial Intelligence > Natural Language (0.70)

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

Cheng, Ziheng, Huang, Yixiao, Xu, Hui, Sojoudi, Somayeh, Zhao, Xuandong, Song, Dawn, Mei, Song

arXiv.org Artificial Intelligence, Oct-28-2025

Text-to-Image (T2I) models have achieved remarkable success in generating visual content from text inputs. Although multiple safety alignment strategies have been proposed to prevent harmful outputs, they often lead to overly cautious behavior (rejecting even benign prompts), a phenomenon known as over-refusal that reduces the practical utility of T2I models. Despite over-refusal having been observed in practice, there is no large-scale benchmark that systematically evaluates this phenomenon for T2I models. In this paper, we present an automatic workflow to construct synthetic evaluation data, resulting in OVERT (OVEr-Refusal evaluation on Text-to-image models), the first large-scale benchmark for assessing over-refusal behaviors in T2I models. OVERT includes 4,600 seemingly harmful but benign prompts across nine safety-related categories, along with 1,785 genuinely harmful prompts (OVERT-unsafe) to evaluate the safety-utility trade-off. Using OVERT, we evaluate several leading T2I models and find that over-refusal is a widespread issue across various categories (Figure 1), underscoring the need for further research to enhance the safety alignment of T2I models without compromising their functionality. As a preliminary attempt to reduce over-refusal, we explore prompt rewriting; however, we find it often compromises faithfulness to the meaning of the original prompts. Finally, we demonstrate the flexibility of our generation framework in accommodating diverse safety requirements by generating customized evaluation data adapting to user-defined policies.

category, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.21347

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.69)
Law (0.69)
Leisure & Entertainment > Games > Computer Games (0.46)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)

PatenTEB: A Comprehensive Benchmark and Model Family for Patent Text Embedding

Ayaou, Iliass, Cavallucci, Denis

arXiv.org Artificial Intelligence, Oct-28-2025

Patent text embeddings enable prior art search, technology landscaping, and patent analysis, yet existing benchmarks inadequately capture patent-specific challenges. We introduce PatenTEB, a comprehensive benchmark comprising 15 tasks across retrieval, classification, paraphrase, and clustering, with 2.06 million examples. PatenTEB employs domain-stratified splits, domain specific hard negative mining, and systematic coverage of asymmetric fragment-to-document matching scenarios absent from general embedding benchmarks. We develop the patembed model family through multi-task training, spanning 67M to 344M parameters with context lengths up to 4096 tokens. External validation shows strong generalization: patembed-base achieves state-of-the-art on MTEB BigPatentClustering.v2 (0.494 V-measure vs. 0.445 previous best), while patembed-large achieves 0.377 NDCG@100 on DAPFAM. Systematic ablations reveal that multi-task training improves external generalization despite minor benchmark costs, and that domain-pretrained initialization provides consistent advantages across task families. All resources will be made available at https://github.com/iliass-y/patenteb. Keywords: patent retrieval, sentence embeddings, multi-task learning, asymmetric retrieval, benchmark evaluation, contrastive learning.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.22264

Country: Europe > France (0.86)

Genre: Research Report > New Finding (0.46)

Industry: Law > Intellectual Property & Technology Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Cross-Lingual Cost: Retrieval Biases in RAG over Arabic-English Corpora

Amiraz, Chen, Fyodorov, Yaroslav, Haramaty, Elad, Karnin, Zohar, Lewin-Eytan, Liane

arXiv.org Artificial Intelligence, Oct-28-2025

Cross-lingual retrieval-augmented generation (RAG) is a critical capability for retrieving and generating answers across languages. Prior work in this context has mostly focused on generation and relied on benchmarks derived from open-domain sources, most notably Wikipedia. In such settings, retrieval challenges often remain hidden due to language imbalances, overlap with pretraining data, and memorized content. To address this gap, we study Arabic-English RAG in a domain-specific setting using benchmarks derived from real-world corporate datasets. Our benchmarks include all combinations of languages for the user query and the supporting document, drawn independently and uniformly at random. This enables a systematic study of multilingual retrieval behavior. Our findings reveal that retrieval is a critical bottleneck in cross-lingual domain-specific scenarios, with substantial performance drops occurring when the user query and supporting document languages differ. A key insight is that these failures stem primarily from the retriever's difficulty in ranking documents across languages. Finally, we propose two simple retrieval strategies that address this source of failure by enforcing equal retrieval from both languages or by translating the query, resulting in substantial improvements in cross-lingual and overall performance. These results highlight meaningful opportunities for improving multilingual retrieval, particularly in practical, real-world RAG applications.
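The first of the two strategies named in the abstract, enforcing equal retrieval from both languages, can be illustrated with a minimal Python sketch. The function name `balanced_top_k`, the document ids, and the scores are hypothetical and chosen only for illustration; the paper's actual implementation may differ.

```python
from typing import Dict, List, Tuple

Scored = Tuple[str, float]  # (doc_id, retriever score); hypothetical data shape


def balanced_top_k(scored_by_lang: Dict[str, List[Scored]], k: int) -> List[Scored]:
    """Take an equal share of the top-scoring documents from each language."""
    langs = sorted(scored_by_lang)
    per_lang = k // len(langs)  # equal quota per language
    picked: List[Scored] = []
    for lang in langs:
        ranked = sorted(scored_by_lang[lang], key=lambda d: d[1], reverse=True)
        picked.extend(ranked[:per_lang])
    # Final list is ordered by score, but its composition is language-balanced.
    return sorted(picked, key=lambda d: d[1], reverse=True)


# Made-up scores for an English query: English documents dominate the ranking.
candidates = {
    "ar": [("ar-1", 0.41), ("ar-2", 0.38), ("ar-3", 0.12)],
    "en": [("en-1", 0.93), ("en-2", 0.90), ("en-3", 0.88)],
}
top = balanced_top_k(candidates, k=4)
```

With these made-up scores, a plain top-4 would return three English documents and one Arabic one, whereas the balanced variant returns two from each language, so a relevant Arabic passage that is ranked low across languages still reaches the generator.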

benchmark, large language model, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.07543

Genre: Research Report > New Finding (1.00)

Industry: Law (0.31)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)

Preference Optimization by Estimating the Ratio of the Data Distribution

Kim, Yeongmin, Bae, Heesun, Na, Byeonghu, Moon, Il-Chul

arXiv.org Artificial Intelligence, Oct-28-2025

Direct preference optimization (DPO) is widely used as a simple and stable method for aligning large language models (LLMs) with human preferences. This paper investigates a generalized DPO loss that enables a policy model to match the target policy from a likelihood ratio estimation perspective. The ratio of the target policy provides a unique identification of the policy distribution without relying on reward models or partition functions. This allows the generalized loss to retain both simplicity and theoretical guarantees, which prior work such as $f$-PO fails to achieve simultaneously. We propose Bregman preference optimization (BPO), a generalized framework for ratio matching that provides a family of objective functions achieving target policy optimality. BPO subsumes DPO as a special case and offers tractable forms for all instances, allowing implementation with a few lines of code. We further develop scaled Basu's power divergence (SBA), a gradient scaling method that can be used for BPO instances. The BPO framework complements other DPO variants and is applicable to target policies defined by these variants. In experiments, unlike other probabilistic loss extensions such as $f$-DPO or $f$-PO, which exhibit a trade-off between generation fidelity and diversity, instances of BPO improve both win rate and entropy compared with DPO. When applied to Llama-3-8B-Instruct, BPO achieves state-of-the-art performance among Llama-3-8B backbones, with a 55.9% length-controlled win rate on AlpacaEval2. Project page: https://github.com/aailab-kaist/BPO.
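For orientation, the standard DPO loss that the abstract says BPO subsumes can be written in a few lines of Python. This is a sketch of the baseline loss only, not of the BPO or SBA objectives, whose exact forms the abstract does not give; the function name `dpo_loss` and the default `beta=0.1` are assumptions made for illustration.

```python
import math


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    logp_w / logp_l are policy log-likelihoods of the preferred and
    dispreferred responses; ref_* are the same under the reference model.
    """
    # Scaled difference of the two policy-to-reference log-ratios.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# When the policy equals the reference, the margin is zero and the loss is log 2.
loss_at_reference = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the preferred response's likelihood relative to the reference lowers the loss.
loss_after_update = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss depends on the policy only through log-ratios against the reference model, which is the quantity the ratio-matching view in the abstract generalizes.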

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.19601

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Law (1.00)
Government > Tax (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

FedSVD: Adaptive Orthogonalization for Private Federated Learning with LoRA

Lee, Seanie, Park, Sangwoo, Lee, Dong Bok, Wagner, Dominik, Seong, Haebin, Bocklet, Tobias, Lee, Juho, Hwang, Sung Ju

arXiv.org Artificial Intelligence, Oct-28-2025

Low-Rank Adaptation (LoRA), which introduces a product of two trainable low-rank matrices into frozen pre-trained weights, is widely used for efficient fine-tuning of language models in federated learning (FL). However, when combined with differentially private stochastic gradient descent (DP-SGD), LoRA faces substantial noise amplification: DP-SGD perturbs per-sample gradients, and the matrix multiplication of the LoRA update ($BA$) intensifies this effect. Freezing one matrix (e.g., $A$) reduces the noise but restricts model expressiveness, often resulting in suboptimal adaptation. To address this, we propose FedSVD, a simple yet effective method that introduces a global reparameterization based on singular value decomposition (SVD). In our approach, each client optimizes only the $B$ matrix and transmits it to the server. The server aggregates the $B$ matrices, computes the product $BA$ using the previous $A$, and refactorizes the result via SVD. This yields a new adaptive $A$ composed of the orthonormal right singular vectors of $BA$, and an updated $B$ containing the remaining SVD components. This reparameterization avoids quadratic noise amplification, while allowing $A$ to better capture the principal directions of the aggregate updates. Moreover, the orthonormal structure of $A$ bounds the gradient norms of $B$ and preserves more signal under DP-SGD, as confirmed by our theoretical analysis. As a result, FedSVD consistently improves stability and performance across a variety of privacy settings and benchmarks, outperforming relevant baselines under both private and non-private regimes.
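The server-side step described in the abstract (aggregate the $B$ matrices, form $BA$ with the previous $A$, refactorize via SVD) can be sketched with NumPy as follows. The function name `fedsvd_server_step`, the use of a plain mean for aggregation, and the toy dimensions are assumptions made for illustration; the sketch omits DP-SGD and client training entirely and is not the authors' implementation.

```python
import numpy as np


def fedsvd_server_step(client_Bs, A_prev):
    """One server-side refactorization round, following the abstract's description."""
    # Aggregate client B matrices (a simple mean is assumed here).
    B_avg = np.mean(client_Bs, axis=0)
    # Product BA using the previous A, shape (d_out, d_in).
    W = B_avg @ A_prev
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    r = A_prev.shape[0]              # LoRA rank
    A_new = Vt[:r]                   # orthonormal right singular vectors
    B_new = U[:, :r] * S[:r]         # remaining SVD components (U scaled by S)
    return B_new, A_new


# Toy dimensions and random matrices, purely illustrative.
rng = np.random.default_rng(0)
d_out, d_in, r = 6, 5, 2
A_prev = rng.standard_normal((r, d_in))
client_Bs = [rng.standard_normal((d_out, r)) for _ in range(3)]
B_new, A_new = fedsvd_server_step(client_Bs, A_prev)
```

Because the aggregated product has rank at most `r`, the new pair reproduces it exactly (up to floating-point error), while the rows of `A_new` are orthonormal, which is the property the abstract credits with bounding the gradient norms of $B$.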

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.12805

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Assessing the Potential of Generative Agents in Crowdsourced Fact-Checking

Costabile, Luigia, Orlando, Gian Marco, La Gatta, Valerio, Moscato, Vincenzo

arXiv.org Artificial Intelligence, Oct-28-2025

The growing spread of online misinformation has created an urgent need for scalable, reliable fact-checking solutions. Crowdsourced fact-checking - where non-experts evaluate claim veracity - offers a cost-effective alternative to expert verification, despite concerns about variability in quality and bias. Encouraged by promising results in certain contexts, major platforms such as X (formerly Twitter), Facebook, and Instagram have begun shifting from centralized moderation to decentralized, crowd-based approaches. In parallel, advances in Large Language Models (LLMs) have shown strong performance across core fact-checking tasks, including claim detection and evidence evaluation. However, their potential role in crowdsourced workflows remains unexplored. This paper investigates whether LLM-powered generative agents - autonomous entities that emulate human behavior and decision-making - can meaningfully contribute to fact-checking tasks traditionally reserved for human crowds. Using the protocol of La Barbera et al. (2024), we simulate crowds of generative agents with diverse demographic and ideological profiles. Agents retrieve evidence, assess claims along multiple quality dimensions, and issue final veracity judgments. Our results show that agent crowds outperform human crowds in truthfulness classification, exhibit higher internal consistency, and show reduced susceptibility to social and cognitive biases. Compared to humans, agents rely more systematically on informative criteria such as Accuracy, Precision, and Informativeness, suggesting a more structured decision-making process. Overall, our findings highlight the potential of generative agents as scalable, consistent, and less biased contributors to crowd-based fact-checking systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.osnem.2025.100326

2504.1994

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine (0.93)
(2 more...)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SAGE: A Generic Framework for LLM Safety Evaluation

Jindal, Madhur, Shrawgi, Hari, Agrawal, Parag, Dandapat, Sandipan

arXiv.org Artificial Intelligence, Oct-28-2025

As Large Language Models are rapidly deployed across diverse applications from healthcare to financial advice, safety evaluation struggles to keep pace. Current benchmarks focus on single-turn interactions with generic policies, failing to capture the conversational dynamics of real-world usage and the application-specific harms that emerge in context. Such potential oversights can lead to harms that go unnoticed in standard safety benchmarks and other current evaluation methodologies. To address these needs for robust AI safety evaluation, we introduce SAGE (Safety AI Generic Evaluation), an automated modular framework designed for customized and dynamic harm evaluations. SAGE employs prompted adversarial agents with diverse personalities based on the Big Five model, enabling system-aware multi-turn conversations that adapt to target applications and harm policies. We evaluate seven state-of-the-art LLMs across three applications and harm policies. Multi-turn experiments show that harm increases with conversation length, model behavior varies significantly when exposed to different user personalities and scenarios, and some models minimize harm via high refusal rates that reduce usefulness. We also demonstrate policy sensitivity within a harm category where tightening a child-focused sexual policy substantially increases measured defects across applications. These results motivate adaptive, policy-aware, and context-specific testing for safer real-world deployment.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.19674

Country: Asia (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Government (1.00)
Banking & Finance (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reduced AI Acceptance After the Generative AI Boom: Evidence From a Two-Wave Survey Study

Baumann, Joachim, Urman, Aleksandra, Leicht-Deobald, Ulrich, Roman, Zachary J., Hannák, Anikó, Christen, Markus

arXiv.org Artificial Intelligence, Oct-28-2025

The rapid adoption of generative artificial intelligence (GenAI) technologies has led many organizations to integrate AI into their products and services, often without considering user preferences. Yet, public attitudes toward AI use, especially in impactful decision-making scenarios, are underexplored. Using a large-scale two-wave survey study (n_wave1=1514, n_wave2=1488) representative of the Swiss population, we examine shifts in public attitudes toward AI before and after the launch of ChatGPT. We find that the GenAI boom is significantly associated with reduced public acceptance of AI (see Figure 1) and increased demand for human oversight in various decision-making contexts. The proportion of respondents finding AI "not acceptable at all" increased from 23% to 30%, while support for human-only decision-making rose from 18% to 26%. These shifts have amplified existing social inequalities in terms of widened educational, linguistic, and gender gaps post-boom. Our findings challenge industry assumptions about public readiness for AI deployment and highlight the critical importance of aligning technological development with evolving public preferences.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.23578

Country:

Europe (1.00)
North America > United States > New York (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)

Industry:

Media > News (1.00)
Law (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
