Law
The Risk-Adjusted Intelligence Dividend: A Quantitative Framework for Measuring AI Return on Investment Integrating ISO 42001 and Regulatory Exposure
Organizations investing in artificial intelligence face a fundamental challenge: traditional return on investment calculations fail to capture the dual nature of AI implementations, which simultaneously reduce certain operational risks while introducing novel exposures related to algorithmic malfunction, adversarial attacks, and regulatory liability. This research presents a comprehensive financial framework for quantifying AI project returns that explicitly integrates changes in organizational risk profiles. The methodology addresses a critical gap in current practice where investment decisions rely on optimistic benefit projections without accounting for the probabilistic costs of AI-specific threats including model drift, bias-related litigation, and compliance failures under emerging regulations such as the European Union Artificial Intelligence Act and ISO/IEC 42001. Drawing on established risk quantification methods, including annual loss expectancy calculations and Monte Carlo simulation techniques, this framework enables practitioners to compute net benefits that incorporate both productivity gains and the delta between pre-implementation and post-implementation risk exposures. The analysis demonstrates that accurate AI investment evaluation requires explicit modeling of control effectiveness, reserve requirements for algorithmic failures, and the ongoing operational costs of maintaining model performance. Practical implications include specific guidance for establishing governance structures, conducting phased validations, and integrating risk-adjusted metrics into capital allocation decisions, ultimately enabling evidence-based AI portfolio management that satisfies both fiduciary responsibilities and regulatory mandates.
Standardized Threat Taxonomy for AI Security, Governance, and Regulatory Compliance
The accelerating deployment of artificial intelligence systems across regulated sectors has exposed critical fragmentation in risk assessment methodologies. A significant "language barrier" currently separates technical security teams, who focus on algorithmic vulnerabilities (e.g., MITRE ATLAS), from legal and compliance professionals, who address regulatory mandates (e.g., EU AI Act, NIST AI RMF). This disciplinary disconnect prevents the accurate translation of technical vulnerabilities into financial liability, leaving practitioners unable to answer fundamental economic questions regarding contingency reserves, control return-on-investment, and insurance exposure. To bridge this gap, this research presents the AI System Threat Vector Taxonomy, a structured ontology designed explicitly for Quantitative Risk Assessment (QRA). The framework categorizes AI-specific risks into nine critical domains: Misuse, Poisoning, Privacy, Adversarial, Biases, Unreliable Outputs, Drift, Supply Chain, and IP Threat, integrating 53 operationally defined sub-threats. Uniquely, each domain maps technical vectors directly to business loss categories (Confidentiality, Integrity, Availability, Legal, Reputation), enabling the translation of abstract threats into measurable financial impact. The taxonomy is empirically validated through an analysis of 133 documented AI incidents from 2025 (achieving 100% classification coverage) and reconciled against the main AI risk frameworks. Furthermore, it is explicitly aligned with ISO/IEC 42001 controls and NIST AI RMF functions to facilitate auditability.
Invited to Develop: Institutional Belonging and the Counterfactual Architecture of Development
This paper examines how institutional belonging shapes long-term development by comparing Spain and Uruguay, two small democracies with similar historical endowments whose trajectories diverged sharply after the 1960s. While Spain integrated into dense European institutional architectures, Uruguay remained embedded within the Latin American governance regime, characterized by weaker coordination and lower institutional coherence. To assess how alternative institutional embeddings could have altered these paths, the study develops a generative counterfactual framework grounded in economic complexity, institutional path dependence, and a Wasserstein GAN trained on data from 1960-2020. The resulting Expected Developmental Shift (EDS) quantifies structural gains or losses from hypothetical re-embedding in different institutional ecosystems. Counterfactual simulations indicate that Spain would have experienced significant developmental decline under a Latin American configuration, while Uruguay would have achieved higher complexity and resilience within a European regime. These findings suggest that development is not solely determined by domestic reforms but emerges from a country's structural position within transnational institutional networks.
Tacit Bidder-Side Collusion: Artificial Intelligence in Dynamic Auctions
We study whether large language models acting as autonomous bidders can tacitly collude by coordinating when to accept platform posted payouts in repeated Dutch auctions, without any communication. We present a minimal repeated auction model that yields a simple incentive compatibility condition and a closed form threshold for sustainable collusion for subgame-perfect Nash equilibria. In controlled simulations with multiple language models, we observe systematic supra-competitive prices in small auction settings and a return to competitive behavior as the number of bidders in the market increases, consistent with the theoretical model. We also find LLMs use various mechanisms to facilitate tacit coordination, such as focal point acceptance timing versus patient strategies that track the theoretical incentives. The results provide, to our knowledge, the first evidence of bidder side tacit collusion by LLMs and show that market structure levers can be more effective than capability limits for mitigation.
PromptTailor: Multi-turn Intent-Aligned Prompt Synthesis for Lightweight LLMs
Lightweight language models remain attractive for on-device and privacy-sensitive applications, but their responses are highly sensitive to prompt quality. For open-ended generation, non-expert users often lack the knowledge or time to consistently craft high-quality prompts, leading them to rely on prompt optimization tools. However, a key challenge is ensuring the optimized prompts genuinely align with users' original intents and preferences. We introduce PromptTailor, a system for controllable prompt generation for open-ended text that improves model output quality by intent-aligned prompt synthesis. PromptTailor expands minimal user instructions into rich, domain-aware prompts while preserving the user's stated preferences. The system is a quantized Llama3-8B model fine-tuned with a lightweight LoRA adapter on 12,300 prompt-refinement dialogues spanning 41 everyday domains, distilled from three stronger LLMs. The adapter attaches to any Llama3-8B base, enabling edge deployment. In human and LLM-judge evaluations across multiple target models and optimization baselines, PromptTailor yields higher preference rates than chain-of-thought prompting and matches or surpasses state-of-the-art prompt optimization methods while requiring fewer model calls (e.g., 3 vs. 9). These results show that a compact student, guided by powerful teachers, can learn effective prompt-generation strategies that enhance response quality while maintaining alignment with user intent.
When Harmless Words Harm: A New Threat to LLM Safety via Conceptual Triggers
Zhang, Zhaoxin, Chen, Borui, Hu, Yiming, Qu, Youyang, Zhu, Tianqing, Gao, Longxiang
Recent research on large language model (LLM) jailbreaks has primarily focused on techniques that bypass safety mechanisms to elicit overtly harmful outputs. However, such efforts often overlook attacks that exploit the model's capacity for abstract generalization, creating a critical blind spot in current alignment strategies. This gap enables adversaries to induce objectionable content by subtly manipulating the implicit social values embedded in model outputs. In this paper, we introduce MICM, a novel, model-agnostic jailbreak method that targets the aggregate value structure reflected in LLM responses. Drawing on conceptual morphology theory, MICM encodes specific configurations of nuanced concepts into a fixed prompt template through a predefined set of phrases. These phrases act as conceptual triggers, steering model outputs toward a specific value stance without triggering conventional safety filters. We evaluate MICM across five advanced LLMs, including GPT-4o, Deepseek-R1, and Qwen3-8B. Experimental results show that MICM consistently outperforms state-of-the-art jailbreak techniques, achieving high success rates with minimal rejection. Our findings reveal a critical vulnerability in commercial LLMs: their safety mechanisms remain susceptible to covert manipulation of underlying value alignment.
EvalCards: A Framework for Standardized Evaluation Reporting
Dhar, Ruchira, Villegas, Danae Sanchez, Karamolegkou, Antonia, Schiavone, Alice, Yuan, Yifei, Chen, Xinyi, Li, Jiaang, Frank, Stella, De Grazia, Laura, Swain, Monorama, Brandl, Stephanie, Hershcovich, Daniel, Sรธgaard, Anders, Elliott, Desmond
Evaluation has long been a central concern in NLP, and transparent reporting practices are more critical than ever in today's landscape of rapidly released open-access models. Drawing on a survey of recent work on evaluation and documentation, we identify three persistent shortcomings in current reporting practices: reproducibility, accessibility, and governance. We argue that existing standardization efforts remain insufficient and introduce Evaluation Disclosure Cards (EvalCards) as a path forward. EvalCards are designed to enhance transparency for both researchers and practitioners while providing a practical foundation to meet emerging governance requirements.
Local Hybrid Retrieval-Augmented Document QA
Organizations handling sensitive documents face a critical dilemma: adopt cloud-based AI systems that offer powerful question-answering capabilities but compromise data privacy, or maintain local processing that ensures security but delivers poor accuracy. We present a question-answering system that resolves this trade-off by combining semantic understanding with keyword precision, operating entirely on local infrastructure without internet access. Our approach demonstrates that organizations can achieve competitive accuracy on complex queries across legal, scientific, and conversational documents while keeping all data on their machines. By balancing two complementary retrieval strategies and using consumer-grade hardware acceleration, the system delivers reliable answers with minimal errors, letting banks, hospitals, and law firms adopt conversational document AI without transmitting proprietary information to external providers. This work establishes that privacy and performance need not be mutually exclusive in enterprise AI deployment.
iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification
Xiong, Zixun, Wu, Gaoyi, Yu, Qingyang, Ma, Mingyu Derek, Yao, Lingfeng, Pan, Miao, Du, Xiaojiang, Wang, Hao
Given the high cost of large language model (LLM) training from scratch, safeguarding LLM intellectual property (IP) has become increasingly crucial. As the standard paradigm for IP ownership verification, LLM fingerprinting thus plays a vital role in addressing this challenge. Existing LLM fingerprinting methods verify ownership by extracting or injecting model-specific features. However, they overlook potential attacks during the verification process, leaving them ineffective when the model thief fully controls the LLM's inference process. In such settings, attackers may share prompt-response pairs to enable fingerprint unlearning, or manipulate outputs to evade exact-match verification. We propose iSeal, the first fingerprinting method designed for reliable verification when the model thief controls the suspected LLM in an end-to-end manner. It injects unique features into both the model and an external module, reinforced by an error-correction mechanism and a similarity-based verification strategy. These components are resistant to verification-time attacks, including collusion-based fingerprint unlearning and response manipulation, backed by both theoretical analysis and empirical results.
Mina: A Multilingual LLM-Powered Legal Assistant Agent for Bangladesh for Empowering Access to Justice
Wasi, Azmine Toushik, Faisal, Wahid, Islam, Mst Rafia
Bangladesh's low-income population faces major barriers to affordable legal advice due to complex legal language, procedural opacity, and high costs. Existing AI legal assistants lack Bengali-language support and jurisdiction-specific adaptation, limiting their effectiveness. To address this, we developed Mina, a multilingual LLM-based legal assistant tailored for the Bangladeshi context. It employs multilingual embeddings and a RAG-based chain-of-tools framework for retrieval, reasoning, translation, and document generation, delivering context-aware legal drafts, citations, and plain-language explanations via an interactive chat interface. Evaluated by law faculty from leading Bangladeshi universities across all stages of the 2022 and 2023 Bangladesh Bar Council Exams, Mina scored 75-80% in Preliminary MCQs, Written, and simulated Viva Voce exams, matching or surpassing average human performance and demonstrating clarity, contextual understanding, and sound legal reasoning. Even under a conservative upper bound, Mina operates at just 0.12-0.61% of typical legal consultation costs in Bangladesh, yielding a 99.4-99.9\% cost reduction relative to human-provided services. These results confirm its potential as a low-cost, multilingual AI assistant that automates key legal tasks and scales access to justice, offering a real-world case study on building domain-specific, low-resource systems and addressing challenges of multilingual adaptation, efficiency, and sustainable public-service AI deployment.