AITopics

Retrieval-augmented generation (RAG) enhances Large Language Models (LLMs) by mitigating hallucinations and outdated information issues, yet simultaneously facilitates unauthorized data appropriation at scale. This paper addresses this challenge through two key contributions. First, we introduce RPD, a novel dataset specifically designed for RAG plagiarism detection that encompasses diverse professional domains and writing styles, overcoming limitations in existing resources. Second, we develop a dual-layered watermarking system that embeds protection at both semantic and lexical levels, complemented by an interrogator-detective framework that employs statistical hypothesis testing on accumulated evidence. Extensive experimentation demonstrates our approach's effectiveness across varying query volumes, defense prompts, and retrieval parameters, while maintaining resilience against adversarial evasion techniques. This work establishes a foundational framework for intellectual property protection in retrieval-augmented AI systems.

large language model, machine learning, natural language, (20 more...)

2510.07728

Country:

Asia > China (0.46)
Asia > Middle East > UAE (0.28)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Banking Done Right: Redefining Retail Banking with Language-Centric AI

Chua, Xin Jie, Tan, Jeraelyn Ming Li, Tan, Jia Xuan, Poh, Soon Chang, Goh, Yi Xian, Choong, Debbie Hui Tian, Foong, Chee Mun, Yang, Sze Jue, Chan, Chee Seng

This paper presents Ryt AI, an LLM-native agentic framework that powers Ryt Bank to enable customers to execute core financial transactions through natural language conversation. This represents the first global regulator-approved deployment worldwide where conversational AI functions as the primary banking interface, in contrast to prior assistants that have been limited to advisory or support roles. Built entirely in-house, Ryt AI is powered by ILMU, a closed-source LLM developed internally, and replaces rigid multi-screen workflows with a single dialogue orchestrated by four LLM-powered agents (Guardrails, Intent, Payment, and FAQ). Each agent attaches a task-specific LoRA adapter to ILMU, which is hosted within the bank's infrastructure to ensure consistent behavior with minimal overhead. Deterministic guardrails, human-in-the-loop confirmation, and a stateless audit architecture provide defense-in-depth for security and compliance. The result is Banking Done Right: demonstrating that regulator-approved natural-language interfaces can reliably support core financial operations under strict governance.

large language model, machine learning, natural language, (22 more...)

2510.07645

Country: Asia > Malaysia (0.15)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rajgarhia, Harshit, Gupta, Suryam, Shaik, Asif, Kumar, Gulipalli Praveen, Santhoshraj, Y, Nishitha, Sanka Nithya Tanvy, Mukherji, Abhishek

An Evaluation Study of Hybrid Methods for Multilingual PII Detection

The detection of Personally Identifiable Information (PII) is critical for privacy compliance but remains challenging in low-resource languages due to linguistic diversity and limited annotated data. We present RECAP, a hybrid framework that combines deterministic regular expressions with context-aware large language models (LLMs) for scalable PII detection across 13 low-resource locales. RECAP's modular design supports over 300 entity types without retraining, using a three-phase refinement pipeline for disambiguation and filtering. Benchmarked with nervaluate, our system outperforms fine-tuned NER models by 82% and zero-shot LLMs by 17% in weighted F1-score. This work offers a scalable and adaptable solution for efficient PII detection in compliance-focused applications.

large language model, machine learning, natural language, (16 more...)

2510.07551

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.68)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing

Hughes, Anthony, Duddu, Vasisht, Asokan, N., Aletras, Nikolaos, Ma, Ning

Language models (LMs) may memorize personally identifiable information (PII) from training data, enabling adversaries to extract it during inference. Existing defense mechanisms such as differential privacy (DP) reduce this leakage, but incur large drops in utility. Based on a comprehensive study using circuit discovery to identify the computational circuits responsible PII leakage in LMs, we hypothesize that specific PII leakage circuits in LMs should be responsible for this behavior. Therefore, we propose PATCH (Privacy-Aware Targeted Circuit PatcHing), a novel approach that first identifies and subsequently directly edits PII circuits to reduce leakage. PATCH achieves better privacy-utility trade-off than existing defenses, e.g., reducing recall of PII leakage from LMs by up to 65%. Finally, PATCH can be combined with DP to reduce recall of residual leakage of an LM to as low as 0.01%. Our analysis shows that PII leakage circuits persist even after the application of existing defense mechanisms. In contrast, PATCH can effectively mitigate their impact.

computational linguistic, large language model, machine learning, (17 more...)

2510.07452

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report > New Finding (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.73)

Inconsistent Affective Reaction: Sentiment of Perception and Opinion in Urban Environments

Huang, Jingfei, Tu, Han

The ascension of social media platforms has transformed our understanding of urban environments, giving rise to nuanced variations in sentiment reaction embedded within human perception and opinion, and challenging existing multidimensional sentiment analysis approaches in urban studies. This study presents novel methodologies for identifying and elucidating sentiment inconsistency, constructing a dataset encompassing 140,750 Baidu and Tencent Street view images to measure perceptions, and 984,024 Weibo social media text posts to measure opinions. A reaction index is developed, integrating object detection and natural language processing techniques to classify sentiment in Beijing Second Ring for 2016 and 2022. Classified sentiment reaction is analysed and visualized using regression analysis, image segmentation, and word frequency based on land-use distribution to discern underlying factors. The perception affective reaction trend map reveals a shift toward more evenly distributed positive sentiment, while the opinion affective reaction trend map shows more extreme changes. Our mismatch map indicates significant disparities between the sentiments of human perception and opinion of urban areas over the years. Changes in sentiment reactions have significant relationships with elements such as dense buildings and pedestrian presence. Our inconsistent maps present perception and opinion sentiments before and after the pandemic and offer potential explanations and directions for environmental management, in formulating strategies for urban renewal.

artificial intelligence, machine learning, natural language, (13 more...)

doi: 10.52842/conf.caadria.2024.2.395

2510.07359

Country:

North America > United States > Massachusetts (0.28)
Asia > China > Beijing > Beijing (0.26)

Genre: Research Report > New Finding (0.69)

Industry:

Law (0.49)
Health & Medicine (0.30)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Benchmarking LLM Causal Reasoning with Scientifically Validated Relationships

Lee, Donggyu, Park, Sungwon, Hwang, Yerin, Kim, Hyoshin, Oh, Hyunwoo, Kim, Jungwon, Cha, Meeyoung, Park, Sangyoon, Kim, Jihee

Causal reasoning is fundamental for Large Language Models (LLMs) to understand genuine cause-and-effect relationships beyond pattern matching. Existing benchmarks suffer from critical limitations such as reliance on synthetic data and narrow domain coverage. We introduce a novel benchmark constructed from casually identified relationships extracted from top-tier economics and finance journals, drawing on rigorous methodologies including instrumental variables, difference-in-differences, and regression discontinuity designs. Our benchmark comprises 40,379 evaluation items covering five task types across domains such as health, environment, technology, law, and culture. Experimental results on eight state-of-the-art LLMs reveal substantial limitations, with the best model achieving only 57.6\% accuracy. Moreover, model scale does not consistently translate to superior performance, and even advanced reasoning models struggle with fundamental causal relationship identification. These findings underscore a critical gap between current LLM capabilities and demands of reliable causal reasoning in high-stakes applications.

large language model, machine learning, natural language, (18 more...)

2510.07231

Country:

Asia (0.46)
North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Law (0.67)
Banking & Finance > Economy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Buscemi, Alessio, Simonetto, Thibault, Pagani, Daniele, Castignani, German, Cordy, Maxime, Cabot, Jordi

The Sandbox Configurator: A Framework to Support Technical Assessment in AI Regulatory Sandboxes

The systematic assessment of AI systems is increasingly vital as these technologies enter high-stakes domains. To address this, the EU's Artificial Intelligence Act introduces AI Regulatory Sandboxes (AIRS): supervised environments where AI systems can be tested under the oversight of Competent Authorities (CAs), balancing innovation with compliance, particularly for startups and SMEs. Yet significant challenges remain: assessment methods are fragmented, tests lack standardisation, and feedback loops between developers and regulators are weak. To bridge these gaps, we propose the Sandbox Configurator, a modular open-source framework that enables users to select domain-relevant tests from a shared library and generate customised sandbox environments with integrated dashboards. Its plug-in architecture aims to support both open and proprietary modules, fostering a shared ecosystem of interoperable AI assessment services. The framework aims to address multiple stakeholders: CAs gain structured workflows for applying legal obligations; technical experts can integrate robust evaluation methods; and AI providers access a transparent pathway to compliance. By promoting cross-border collaboration and standardisation, the Sandbox Configurator's goal is to support a scalable and innovation-friendly European infrastructure for trustworthy AI governance.

artificial intelligence, configurator, sandbox configurator, (17 more...)

2509.25256

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Workflow (0.86)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)

Cho, Eunjung, Hoyle, Alexander, Hermstrüwer, Yoan

Modeling Motivated Reasoning in Law: Evaluating Strategic Role Conditioning in LLM Summarization

Large Language Models (LLMs) are increasingly used to generate user-tailored summaries, adapting outputs to specific stakeholders. In legal contexts, this raises important questions about motivated reasoning -- how models strategically frame information to align with a stakeholder's position within the legal system. Building on theories of legal realism and recent trends in legal practice, we investigate how LLMs respond to prompts conditioned on different legal roles (e.g., judges, prosecutors, attorneys) when summarizing judicial decisions. We introduce an evaluation framework grounded in legal fact and reasoning inclusion, also considering favorability towards stakeholders. Our results show that even when prompts include balancing instructions, models exhibit selective inclusion patterns that reflect role-consistent perspectives. These findings raise broader concerns about how similar alignment may emerge as LLMs begin to infer user roles from prior interactions or context, even without explicit role instructions. Our results underscore the need for role-aware evaluation of LLM summarization behavior in high-stakes legal settings.

large language model, machine learning, natural language, (20 more...)

2509.00529

Country: Europe > Switzerland (0.92)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Criminal Law (0.70)
Law > Litigation (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Urchs, Stefanie, Thurner, Veronika, Aßenmacher, Matthias, Heumann, Christian, Thiemichen, Stephanie

Fair Play in the Newsroom: Actor-Based Filtering Gender Discrimination in Text Corpora

Language corpora are the foundation of most natural language processing research, yet they often reproduce structural inequalities. One such inequality is gender discrimination in how actors are represented, which can distort analyses and perpetuate discriminatory outcomes. This paper introduces a user-centric, actor-level pipeline for detecting and mitigating gender discrimination in large-scale text corpora. By combining discourse-aware analysis with metrics for sentiment, syntactic agency, and quotation styles, our method enables both fine-grained auditing and exclusion-based balancing. Applied to the taz2024full corpus of German newspaper articles (1980-2024), the pipeline yields a more gender-balanced dataset while preserving core dynamics of the source material. Our findings show that structural asymmetries can be reduced through systematic filtering, though subtler biases in sentiment and framing remain. We release the tools and reports to support further research in discourse-based fairness auditing and equitable corpus construction.

actor, machine learning, natural language, (17 more...)