Goto

Collaborating Authors

 Law


Minimizing Risk Through Minimizing Model-Data Interaction: A Protocol For Relying on Proxy Tasks When Designing Child Sexual Abuse Imagery Detection Models

arXiv.org Artificial Intelligence

The distribution of child sexual abuse imagery (CSAI) is an ever-growing concern of our modern world; children who suffered from this heinous crime are revictimized, and the growing amount of illegal imagery distributed overwhelms law enforcement agents (LEAs) with the manual labor of categorization. To ease this burden researchers have explored methods for automating data triage and detection of CSAI, but the sensitive nature of the data imposes restricted access and minimal interaction between real data and learning algorithms, avoiding leaks at all costs. In observing how these restrictions have shaped the literature we formalize a definition of "Proxy Tasks", i.e., the substitute tasks used for training models for CSAI without making use of CSA data. Under this new terminology we review current literature and present a protocol for making conscious use of Proxy Tasks together with consistent input from LEAs to design better automation in this field. Finally, we apply this protocol to study -- for the first time -- the task of Few-shot Indoor Scene Classification on CSAI, showing a final model that achieves promising results on a real-world CSAI dataset whilst having no weights actually trained on sensitive data.


Engineering Risk-Aware, Security-by-Design Frameworks for Assurance of Large-Scale Autonomous AI Models

arXiv.org Artificial Intelligence

As AI models scale to billions of parameters and operate with increasing autonomy, ensuring their safe, reliable operation demands engineering-grade security and assurance frameworks. This paper presents an enterprise-level, risk-aware, security-by-design approach for large-scale autonomous AI systems, integrating standardized threat metrics, adversarial hardening techniques, and real-time anomaly detection into every phase of the development lifecycle. We detail a unified pipeline - from design-time risk assessments and secure training protocols to continuous monitoring and automated audit logging - that delivers provable guarantees of model behavior under adversarial and operational stress. Case studies in national security, open-source model governance, and industrial automation demonstrate measurable reductions in vulnerability and compliance overhead. Finally, we advocate cross-sector collaboration - uniting engineering teams, standards bodies, and regulatory agencies - to institutionalize these technical safeguards within a resilient, end-to-end assurance ecosystem for the next generation of AI.


Enterprise Architecture as a Dynamic Capability for Scalable and Sustainable Generative AI adoption: Bridging Innovation and Governance in Large Organisations

arXiv.org Artificial Intelligence

Generative Artificial Intelligence is a powerful new technology with the potential to boost innovation and reshape governance in many industries. Nevertheless, organisations face major challenges in scaling GenAI, including technology complexity, governance gaps and resource misalignments. This study explores how Enterprise Architecture Management can meet the complex requirements of GenAI adoption within large enterprises. Based on a systematic literature review and the qualitative analysis of 16 semi-structured interviews with experts, it examines the relationships between EAM, dynamic capabilities and GenAI adoption. The review identified key limitations in existing EA frameworks, particularly their inability to fully address the unique requirements of GenAI. The interviews, analysed using the Gioia methodology, revealed critical enablers and barriers to GenAI adoption across industries. The findings indicate that EAM, when theorised as sensing, seizing and transforming dynamic capabilities, can enhance GenAI adoption by improving strategic alignment, governance frameworks and organisational agility. However, the study also highlights the need to tailor EA frameworks to GenAI-specific challenges, including low data governance maturity and the balance between innovation and compliance. Several conceptual frameworks are proposed to guide EA leaders in aligning GenAI maturity with organisational readiness. The work contributes to academic understanding and industry practice by clarifying the role of EA in bridging innovation and governance in disruptive technology environments.


DMRL: Data- and Model-aware Reward Learning for Data Extraction

arXiv.org Artificial Intelligence

Large language models (LLMs) are inherently vulnerable to unintended privacy breaches. Consequently, systematic red-teaming research is essential for developing robust defense mechanisms. However, current data extraction methods suffer from several limitations: (1) base on dataset duplicates (addressable via deduplication), (2) depend on prompt engineering (now countered by detection and defense), and (3) reliance on random-search adversarial generation. To address these challenges, we propose DMRL: Data-and Model-aware Reward Learning for data extraction, a novel technique that leverages inverse reinforcement learning to extract sensitive data from LLMs. Our approach consists of two main components: (1) construction of a introspective reasoning dataset that encapsulates leakage mindset to guide model behavior; and (2) training a reward models with Group Relative Policy Optimization (GRPO), dynamically tuning optimization to task difficulty at both the data and model levels. Comprehensive experiments across various LLMs demonstrate that DMRL outperforms all baseline methods in data extraction performance.


PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model

arXiv.org Artificial Intelligence

Multi-objective test-time alignment aims to adapt large language models (LLMs) to diverse multi-dimensional user preferences during inference while keeping LLMs frozen. Recently, GenARM (Xu et al., 2025) first independently trains Autoregressive Reward Models (ARMs) for each preference dimension without awareness of each other, then combines their outputs based on user-specific preference vectors during inference to achieve multi-objective test-time alignment, leading to two key limitations: the need for \textit{multiple} ARMs increases the inference cost, and the separate training of ARMs causes the misalignment between the guided generation and the user preferences. To address these issues, we propose Preference-aware ARM (PARM), a single unified ARM trained across all preference dimensions. PARM uses our proposed Preference-Aware Bilinear Low-Rank Adaptation (PBLoRA), which employs a bilinear form to condition the ARM on preference vectors, enabling it to achieve precise control over preference trade-offs during inference. Experiments demonstrate that PARM reduces inference costs and achieves better alignment with preference vectors compared with existing methods. Additionally, PARM enables weak-to-strong guidance, allowing a smaller PARM to guide a larger frozen LLM without expensive training, making multi-objective alignment accessible with limited computing resources. The code is available at https://github.com/Baijiong-Lin/PARM.


Modeling supply chain compliance response strategies based on AI synthetic data with structural path regression: A Simulation Study of EU 2027 Mandatory Labor Regulations

arXiv.org Artificial Intelligence

In the context of the new mandatory labor compliance in the European Union (EU), which will be implemented in 2027, supply chain enterprises face stringent working hour management requirements and compliance risks. In order to scientifically predict the enterprises' coping behaviors and performance outcomes under the policy impact, this paper constructs a methodological framework that integrates the AI synthetic data generation mechanism and structural path regression modeling to simulate the enterprises' strategic transition paths under the new regulations. In terms of research methodology, this paper adopts high-quality simulation data generated based on Monte Carlo mechanism and NIST synthetic data standards to construct a structural path analysis model that includes multiple linear regression, logistic regression, mediation effect and moderating effect. The variable system covers 14 indicators such as enterprise working hours, compliance investment, response speed, automation level, policy dependence, etc. The variable set with explanatory power is screened out through exploratory data analysis (EDA) and VIF multicollinearity elimination. The findings show that compliance investment has a significant positive impact on firm survival and its effect is transmitted through the mediating path of the level of intelligence; meanwhile, firms' dependence on the EU market significantly moderates the strength of this mediating effect. It is concluded that AI synthetic data combined with structural path modeling provides an effective tool for high-intensity regulatory simulation, which can provide a quantitative basis for corporate strategic response, policy design and AI-assisted decision-making in the pre-prediction stage lacking real scenario data. Keywords: AI synthetic data, structural path regression modeling, compliance response strategy, EU 2027 mandatory labor regulation


Apple to pay out nearly 100m over claims phones listened in on users' conversations... how to get a payout

Daily Mail - Science & tech

Anyone who owned an Apple device over the last decade may be able to claim part of a 95 million class action lawsuit against the tech giant. According to the lawsuit, iPhones, iPads, Apple Watches, and MacBooks dating back to 2014 may have secretly recorded their users' private conversations after the devices unintentionally activated Apple's voice assistant Siri. A notice about the case, Lopez v. Apple, has advised anyone who believes Siri spied on their confidential or private calls between September 17, 2014 and December 31, 2024 to submit a claim for damages. Apple's iMacs, Apple TV streaming boxes, HomePod speakers, and iPod Touches are also included in the lawsuit. Although Apple has denied that their devices spied on users, the 3 trillion company reached a settlement in the case, agreeing to give users up to 20 per Siri device in their claim.


How a new type of AI is helping police skirt facial recognition bans

MIT Technology Review

"The whole vision behind Track in the first place," says Veritone CEO Ryan Steelberg, was "if we're not allowed to track people's faces, how do we assist in trying to potentially identify criminals or malicious behavior or activity?" In addition to tracking individuals where facial recognition isn't legally allowed, Steelberg says, it allows for tracking when faces are obscured or not visible. The product has drawn criticism from the American Civil Liberties Union, which--after learning of the tool through MIT Technology Review--said it was the first instance they'd seen of a nonbiometric tracking system used at scale in the US. They warned that it raises many of the same privacy concerns as facial recognition but also introduces new ones at a time when the Trump administration is pushing federal agencies to ramp up monitoring of protesters, immigrants, and students. Veritone gave us a demonstration of Track in which it analyzed people in footage from different environments, ranging from the January 6 riots to subway stations.


Trump admin fires top US copyright official days after terminating Librarian of Congress

FOX News

An AI art lecturer said he believes the U.S. government would encounter difficulty if it attempted to establish a watermark system for AI-generated content. Trump fired Librarian of Congress Carla Hayden, who was the first woman and first African American to be Librarian of Congress, on Thursday. The termination was part of the administration's ongoing purge of government officials who are perceived to be opposed to Trump and his agenda. The White House did not immediately respond to Fox News Digital's requests for comment on the matter. Like Perlmutter, Hayden was notified of her firing in an email, according to The Associated Press.


Mixed-Integer Optimization for Responsible Machine Learning

arXiv.org Machine Learning

In the last few decades, Machine Learning (ML) has achieved significant success across domains ranging from healthcare, sustainability, and the social sciences, to criminal justice and finance. But its deployment in increasingly sophisticated, critical, and sensitive areas affecting individuals, the groups they belong to, and society as a whole raises critical concerns around fairness, transparency, robustness, and privacy, among others. As the complexity and scale of ML systems and of the settings in which they are deployed grow, so does the need for responsible ML methods that address these challenges while providing guaranteed performance in deployment. Mixed-integer optimization (MIO) offers a powerful framework for embedding responsible ML considerations directly into the learning process while maintaining performance. For example, it enables learning of inherently transparent models that can conveniently incorporate fairness or other domain specific constraints. This tutorial paper provides an accessible and comprehensive introduction to this topic discussing both theoretical and practical aspects. It outlines some of the core principles of responsible ML, their importance in applications, and the practical utility of MIO for building ML models that align with these principles. Through examples and mathematical formulations, it illustrates practical strategies and available tools for efficiently solving MIO problems for responsible ML. It concludes with a discussion on current limitations and open research questions, providing suggestions for future work.