Ledger


Ideal Attribution and Faithful Watermarks for Language Models

Song, Min Jae, Shahabi, Kameron

arXiv.org Machine Learning

We introduce ideal attribution mechanisms, a formal abstraction for reasoning about attribution decisions over strings. At the core of this abstraction lies the ledger, an append-only log of the prompt-response interaction history between a model and its user. Each mechanism produces deterministic decisions based on the ledger and an explicit selection criterion, making it well-suited to serve as a ground truth for attribution. We frame the design goal of watermarking schemes as faithful representation of ideal attribution mechanisms. This novel perspective brings conceptual clarity, replacing piecemeal probabilistic statements with a unified language for stating the guarantees of each scheme. It also enables precise reasoning about desiderata for future watermarking schemes, even when no current construction achieves them, since the ideal functionalities are specified first. In this way, the framework provides a roadmap that clarifies which guarantees are attainable in an idealized setting and worth pursuing in practice.
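The core abstraction — an append-only interaction log consulted by a deterministic decision rule — can be sketched in a few lines. The names (`Ledger`, `attribute_exact`) are illustrative stand-ins, not the paper's API, and exact response matching is just one possible selection criterion:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    prompt: str
    response: str


class Ledger:
    """Append-only log of prompt-response interactions."""

    def __init__(self):
        self._entries = []

    def append(self, prompt: str, response: str) -> None:
        self._entries.append(Entry(prompt, response))

    def entries(self) -> tuple:
        # Read-only view: callers cannot rewrite history.
        return tuple(self._entries)


def attribute_exact(ledger: Ledger, text: str) -> bool:
    """A deterministic attribution decision under one concrete
    selection criterion: exact match against any logged response."""
    return any(e.response == text for e in ledger.entries())
```

Because the decision is a pure function of the ledger and the criterion, two parties holding the same log must reach the same verdict — which is what makes such a mechanism usable as a ground truth for watermarking schemes to approximate.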


Dissecting the Ledger: Locating and Suppressing "Liar Circuits" in Financial Large Language Models

Mirajkar, Soham

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly deployed in high-stakes financial domains, yet they suffer from specific, reproducible hallucinations when performing arithmetic operations. Current mitigation strategies often treat the model as a black box. In this work, we propose a mechanistic approach to intrinsic hallucination detection. By applying Causal Tracing to the GPT-2 XL architecture on the ConvFinQA benchmark, we identify a dual-stage mechanism for arithmetic reasoning: a distributed computational scratchpad in middle layers (L12-L30) and a decisive aggregation circuit in late layers (specifically Layer 46). We verify this mechanism via an ablation study, demonstrating that suppressing Layer 46 reduces the model's confidence in hallucinatory outputs by 81.8%. Furthermore, we demonstrate that a linear probe trained on this layer generalizes to unseen financial topics with 98% accuracy, suggesting a universal geometry of arithmetic deception.
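As a toy illustration of the linear-probe methodology, the sketch below trains a probe on synthetic activations rather than real GPT-2 XL Layer 46 hidden states; the dimensions and the planted linear direction are invented for the example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 32  # toy hidden size (GPT-2 XL's is 1600)

# Synthetic "layer activations": faithful vs. hallucinatory runs
# separated along a planted direction, standing in for real traces.
w_true = rng.normal(size=d)
X = rng.normal(size=(400, d))
y = (X @ w_true > 0).astype(int)

# A linear probe is just a logistic regression on frozen activations.
probe = LogisticRegression(max_iter=1000).fit(X[:300], y[:300])
acc = probe.score(X[300:], y[300:])
```

When a single linear direction encodes the label, as the paper's 98% generalization figure suggests for arithmetic deception, a probe of this form recovers it with high held-out accuracy.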


Ordered Consensus with Equal Opportunity

Zhang, Yunhao, Ni, Haobin, Basu, Soumya, Cohen, Shir, Yin, Maofan, Alvisi, Lorenzo, van Renesse, Robbert, Chen, Qi, Zhou, Lidong

arXiv.org Artificial Intelligence

The specification of state machine replication (SMR) has no requirement on the final total order of commands. In blockchains based on SMR, however, order matters, since different orders could provide their clients with different financial rewards. Ordered consensus augments the specification of SMR to include specific guarantees on such order, with a focus on limiting the influence of Byzantine nodes. Real-world ordering manipulations, however, can and do happen even without Byzantine replicas, typically because of factors, such as faster networks or closer proximity to the blockchain infrastructure, that give some clients an unfair advantage. To address this challenge, this paper extends ordered consensus by requiring it also to support equal opportunity, a concrete notion of fairness widely adopted in the social sciences. Informally, equal opportunity requires that two candidates who, according to a set of criteria deemed to be relevant, are equally qualified for a position (in our case, a specific slot in the SMR total order) should have an equal chance of landing it. We show how randomness can be leveraged to keep bias in check and, to this end, introduce the secret random oracle (SRO), a system component that generates randomness in a fault-tolerant manner. We describe two SRO designs, based respectively on trusted hardware and on threshold verifiable random functions, and instantiate them in Bercow, a new ordered consensus protocol that, by approximating equal opportunity to within a configurable factor, can effectively mitigate well-known ordering attacks in SMR-based blockchains.
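A minimal sketch of the equal-opportunity idea: equally qualified candidates for a slot are chosen among uniformly, driven by randomness that is unpredictable until revealed. Here `sro_output` is a hash-based stand-in for the paper's secret random oracle; a real deployment would realize it with trusted hardware or a threshold verifiable random function rather than a single locally held secret:

```python
import hashlib


def sro_output(secret: bytes, slot: int) -> int:
    """Stand-in for the secret random oracle: deterministic once the
    secret is fixed, unpredictable to clients before it is revealed."""
    digest = hashlib.sha256(secret + slot.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def assign_slot(candidates: list, secret: bytes, slot: int) -> str:
    """Equal opportunity among equally qualified candidates: fix a
    canonical order, then let the oracle's output pick the winner,
    so network speed or proximity confers no advantage."""
    equally_qualified = sorted(candidates)
    r = sro_output(secret, slot)
    return equally_qualified[r % len(equally_qualified)]
```

Because the selection depends only on the slot index and the oracle's secret, the outcome is independent of arrival order — the property that front-running-style ordering attacks exploit.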


Analysing semantic data storage in Distributed Ledger Technologies for Data Spaces

Cano-Benito, Juan, Cimmino, Andrea, Hertling, Sven, Paulheim, Heiko, García-Castro, Raúl

arXiv.org Artificial Intelligence

Data spaces are emerging as decentralised infrastructures that enable sovereign, secure, and trustworthy data exchange among multiple participants. To achieve semantic interoperability within these environments, the use of semantic web technologies and knowledge graphs has been proposed. Although distributed ledger technologies (DLT) fit as the underlying infrastructure for data spaces, there remains a significant gap in terms of the efficient storage of semantic data on these platforms. This paper presents a systematic evaluation of semantic data storage across different types of DLT (public, private, and hybrid), using a real-world knowledge graph as an experimental basis. The study compares performance, storage efficiency, resource consumption, and the capabilities to update and query semantic data. The results show that private DLTs are the most efficient for storing and managing semantic content, while hybrid DLTs offer a balanced trade-off between public auditability and operational efficiency. This research leads to a discussion on the selection of the most appropriate DLT infrastructure based on the data sovereignty requirements of decentralised data ecosystems.
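The kind of write-throughput measurement such an evaluation rests on can be sketched generically; `benchmark_writes` and the triple format below are illustrative, not the paper's harness, and `store_fn` stands in for whatever per-triple write call a given DLT backend exposes:

```python
import time


def benchmark_writes(store_fn, triples):
    """Time bulk insertion of RDF-style (subject, predicate, object)
    triples into a storage backend via its single-triple write call.
    Returns elapsed wall-clock seconds."""
    start = time.perf_counter()
    for triple in triples:
        store_fn(triple)
    return time.perf_counter() - start
```

Running the same loop against public, private, and hybrid backends with an identical knowledge graph is what makes the reported storage-efficiency comparisons apples-to-apples.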


Peer Review as Structured Commentary: Immutable Identity, Public Dialogue, and Reproducible Scholarship

Wright, Craig Steven

arXiv.org Artificial Intelligence

This paper reconceptualises peer review as structured public commentary. Traditional academic validation is hindered by anonymity, latency, and gatekeeping. We propose a transparent, identity-linked, and reproducible system of scholarly evaluation anchored in open commentary. Leveraging blockchain for immutable audit trails and AI for iterative synthesis, we design a framework that incentivises intellectual contribution, captures epistemic evolution, and enables traceable reputational dynamics. This model empowers fields from computational science to the humanities, reframing academic knowledge as a living process rather than a static credential.


On Immutable Memory Systems for Artificial Agents: A Blockchain-Indexed Automata-Theoretic Framework Using ECDH-Keyed Merkle Chains

Wright, Craig Steven

arXiv.org Artificial Intelligence

This paper presents a formalised architecture for synthetic agents designed to retain immutable memory, verifiable reasoning, and constrained epistemic growth. Traditional AI systems rely on mutable, opaque statistical models prone to epistemic drift and historical revisionism. In contrast, we introduce the concept of the Merkle Automaton, a cryptographically anchored, deterministic computational framework that integrates formal automata theory with blockchain-based commitments. Each agent transition, memory fragment, and reasoning step is committed within a Merkle structure rooted on-chain, rendering it non-repudiable and auditably permanent. To ensure selective access and confidentiality, we derive symmetric encryption keys from ECDH exchanges contextualised by hierarchical privilege lattices. This enforces cryptographic access control over append-only DAG-structured knowledge graphs. Reasoning is constrained by formal logic systems and verified through deterministic traversal of policy-encoded structures. Updates are non-destructive and historied, preserving epistemic lineage without catastrophic forgetting. Zero-knowledge proofs facilitate verifiable, privacy-preserving inclusion attestations. Collectively, this architecture reframes memory not as a cache but as a ledger - one whose contents are enforced by protocol, bound by cryptography, and constrained by formal logic. The result is not an intelligent agent that mimics thought, but an epistemic entity whose outputs are provably derived, temporally anchored, and impervious to post hoc revision. This design lays foundational groundwork for legal, economic, and high-assurance computational systems that require provable memory, unforgeable provenance, and structural truth.
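A stripped-down sketch of the hash-chained commitment idea at the core of the design, omitting the on-chain anchoring, the ECDH-derived encryption, and the DAG-structured knowledge graph; all names here are hypothetical:

```python
import hashlib
import json


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class MemoryChain:
    """Append-only hash chain over memory records: every commit folds
    the new leaf into the running root, so altering any past record
    changes all subsequent roots and is therefore detectable."""

    def __init__(self):
        self.root = sha(b"genesis")
        self.history = []  # (record, root-at-commit) pairs

    def commit(self, record: dict) -> bytes:
        # Canonical serialization so identical records hash identically.
        leaf = sha(json.dumps(record, sort_keys=True).encode())
        self.root = sha(self.root + leaf)
        self.history.append((record, self.root))
        return self.root
```

Anchoring the running root on-chain at intervals is what turns this local tamper-evidence into the non-repudiable, auditably permanent memory the paper describes.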


Herd Routes: A Preventative IoT-Based System for Improving Female Pedestrian Safety on City Streets

Woodburn, Madeleine, Griggs, Wynita M., Marecek, Jakub, Shorten, Robert N.

arXiv.org Artificial Intelligence

Over two thirds of women of all ages in the UK have experienced some form of sexual harassment in a public space. Recent tragic incidents involving female pedestrians have highlighted some of the personal safety issues that women still face in cities today. Many popular location-based safety applications exist as a result; however, these applications tend to take a reactive approach, where action is taken only after an incident has occurred. This paper proposes a preventative approach to the problem by creating safer public environments through societal incentivisation. The proposed system, called "Herd Routes", improves the safety of female pedestrians by generating busier pedestrian routes as a result of route incentivisation. A novel application of distributed ledgers is proposed to provide security and trust, a record of system users' locations and IDs, and a platform for token exchange. A proof-of-concept was developed using the simulation package SUMO (Simulation of Urban Mobility) and a smartphone app. With positive results from the initial testing of the proof-of-concept, further development could contribute significantly towards creating safer pedestrian routes through cities and towards the long-term societal change required to improve female pedestrian safety. Females of all ages face gender inequities in everyday life, along with the associated feelings of compromised safety and fearfulness that can arise. In these situations, women do as much as they can to prioritise their personal safety; notably, women approach walking through cities with extreme caution, especially at night. In London, for example, ongoing initiatives such as the UN Women's global initiative "Safe Cities and Safe Public Spaces for Women and Girls" commit to identifying gender-responsive, locally relevant and owned interventions [1].


Transforming Triple-Entry Accounting with Machine Learning: A Path to Enhanced Transparency Through Analytics

Weinberg, Abraham Itzhak, Faccia, Alessio

arXiv.org Artificial Intelligence

Triple Entry (TE) is an accounting method that utilizes three accounts or 'entries' to record each transaction, rather than the conventional double-entry bookkeeping system. Existing studies have found that TE accounting, with its additional layer of verification and disclosure of inter-organizational relationships, could help improve transparency in complex financial and supply chain transactions, such as those recorded on blockchains. Machine learning (ML) presents a promising avenue to augment the transparency advantages of TE accounting. By automating some of the data collection and analysis needed for TE bookkeeping, ML techniques have the potential to make this more transparent accounting method scalable for large organizations with complex international supply chains, further enhancing the visibility and trustworthiness of financial reporting. By leveraging ML algorithms, anomalies within distributed ledger data can be swiftly identified, flagging potential instances of fraud or errors. Furthermore, by delving into transaction relationships over time, ML can untangle intricate webs of transactions, shedding light on obscured dealings and adding an investigative dimension. This paper demonstrates how TE and ML interact and how, together, they can raise the transparency of financial reporting.
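A minimal sketch of the anomaly-flagging step on ledger data, using an off-the-shelf isolation forest; the feature set, values, and contamination rate below are invented for illustration and are not drawn from the paper:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Toy ledger features per transaction: [amount, hour-of-day].
# Two implausible entries are injected among routine ones.
normal = rng.normal(loc=[100.0, 12.0], scale=[10.0, 2.0], size=(200, 2))
suspicious = np.array([[950.0, 3.0], [10.0, 23.5]])
X = np.vstack([normal, suspicious])

# Unsupervised detector: no fraud labels needed, which suits
# ledgers where confirmed fraud examples are scarce.
clf = IsolationForest(contamination=0.01, random_state=0).fit(X)
flags = clf.predict(X)  # -1 marks suspected anomalies, 1 marks inliers
```

Flagged entries would then be routed to auditors rather than auto-rejected — the ML layer adds an investigative signal on top of TE's verification, not a verdict.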


Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks

Fourney, Adam, Bansal, Gagan, Mozannar, Hussein, Tan, Cheng, Salinas, Eduardo, Zhu, Erkang, Niedtner, Friederike, Proebsting, Grace, Bassman, Griffin, Gerrits, Jack, Alber, Jacob, Chang, Peter, Loynd, Ricky, West, Robert, Dibia, Victor, Awadallah, Ahmed, Kamar, Ece, Hosn, Rafah, Amershi, Saleema

arXiv.org Artificial Intelligence

Modern AI agents, driven by advances in large foundation models, promise to enhance our productivity and transform our lives by augmenting our knowledge and capabilities. To achieve this vision, AI agents must effectively plan, perform multi-step reasoning and actions, respond to novel observations, and recover from errors, to successfully complete complex tasks across a wide range of scenarios. In this work, we introduce Magentic-One, a high-performing open-source agentic system for solving such tasks. Magentic-One uses a multi-agent architecture where a lead agent, the Orchestrator, plans, tracks progress, and re-plans to recover from errors. Throughout task execution, the Orchestrator directs other specialized agents to perform tasks as needed, such as operating a web browser, navigating local files, or writing and executing Python code. We show that Magentic-One achieves statistically competitive performance to the state-of-the-art on three diverse and challenging agentic benchmarks: GAIA, AssistantBench, and WebArena. Magentic-One achieves these results without modification to core agent capabilities or to how they collaborate, demonstrating progress towards generalist agentic systems. Moreover, Magentic-One's modular design allows agents to be added or removed from the team without additional prompt tuning or training, easing development and making it extensible to future scenarios. We provide an open-source implementation of Magentic-One, and we include AutoGenBench, a standalone tool for agentic evaluation. AutoGenBench provides built-in controls for repetition and isolation to run agentic benchmarks in a rigorous and contained manner -- which is important when agents' actions have side-effects. Magentic-One, AutoGenBench and detailed empirical performance evaluations of Magentic-One, including ablations and error analysis are available at https://aka.ms/magentic-one
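The Orchestrator's dispatch pattern can be caricatured as follows. This is a deliberately simplified sketch: in the real system the agents are LLM-backed, the plan is produced and revised by the Orchestrator itself, and error recovery triggers re-planning, all of which are elided here:

```python
from typing import Callable, Dict, List, Tuple


def orchestrate(plan: List[Tuple[str, str]],
                agents: Dict[str, Callable[[str], str]]) -> List[str]:
    """Walk a fixed plan, dispatch each subtask to the named
    specialist agent, and collect results into a transcript."""
    transcript = []
    for agent_name, subtask in plan:
        result = agents[agent_name](subtask)
        transcript.append(f"{agent_name}: {result}")
    return transcript
```

The modularity claim in the abstract corresponds to the `agents` mapping: specialists can be added or swapped without touching the dispatch loop.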


Confidential Federated Computations

Eichner, Hubert, Ramage, Daniel, Bonawitz, Kallista, Huba, Dzmitry, Santoro, Tiziano, McLarnon, Brett, Van Overveldt, Timon, Fallen, Nova, Kairouz, Peter, Cheu, Albert, Daly, Katharine, Gascon, Adria, Gruteser, Marco, McMahan, Brendan

arXiv.org Artificial Intelligence

Since its introduction in 2017 [48, 42], federated learning (FL) has seen adoption by technology platforms working with private on-device data (cross-device federated learning) or proprietary server-side data (cross-silo federated learning). FL's appeal has been driven by its straightforward privacy advantages: raw data stays in the control of participating entities, with only focused updates sent for immediate aggregation, visible to the service provider. Systems that realize federated learning [18, 35, 51] run at scale today, reducing privacy risks in sensitive applications like mobile keyboards [33, 63, 21, 53] and voice assistants [12, 34]. However, basic federated learning offers an incomplete privacy story [19]: updates sent to the service provider can reveal private data unless updates are aggregated obliviously, and aggregated updates can encode individual data unless trained with a differentially private (DP) learning algorithm [30]. A dishonest service provider might log or inspect unaggregated messages, from which a great deal of information about an individual participant can be learned [15, 57]. This risk has been addressed with oblivious aggregation schemes that guarantee the service provider cannot inspect unaggregated messages, including secure multiparty computation (SMPC) from cohorts of honest devices [17], non-colluding SMPC-based secure aggregators [58], or hardware trusted execution environments (TEEs) [35].
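The mask-cancellation idea behind oblivious aggregation can be sketched for scalar updates. This is a toy, not a full secure-aggregation protocol (no dropout handling, no key agreement); the shared-seed derivation below stands in for a real pairwise Diffie-Hellman exchange:

```python
import itertools
import random


def pairwise_mask(updates: dict) -> dict:
    """Toy oblivious aggregation for scalar updates keyed by client id.
    Each pair of clients derives a shared pseudorandom mask; the
    lower-id client adds it and the higher-id client subtracts it,
    so all masks cancel in the aggregate while each individual
    masked update reveals nothing on its own."""
    masked = dict(updates)
    for i, j in itertools.combinations(sorted(updates), 2):
        # Shared-seed PRG stands in for a pairwise-agreed key.
        mask = random.Random(i * 1_000_003 + j).uniform(-1.0, 1.0)
        masked[i] += mask
        masked[j] -= mask
    return masked
```

The service provider sums the masked values and recovers exactly the aggregate it would have seen anyway — which is the "aggregated obliviously" requirement the abstract refers to.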