Goto

Collaborating Authors

 consensus


The Best of Both Worlds in Network Population Games: Reaching Consensus & Convergence to Equilibrium

Neural Information Processing Systems

Reaching consensus and convergence to equilibrium are two major challenges of multi-agent systems. Although each has attracted significant attention, relatively few studies address both challenges at the same time. This paper examines the connection between the notions of consensus and equilibrium in a multi-agent system where multiple interacting sub-populations coexist. We argue that consensus can be seen as an intricate component of intra-population stability, whereas equilibrium can be seen as encoding inter-population stability. We show that smooth fictitious play, a well-known learning model in game theory, can achieve both consensus and convergence to equilibrium in diverse multi-agent settings. Moreover, we show that the consensus formation process plays a crucial role in the seminal thorny problem of equilibrium selection in multi-agent learning.



In Differential Privacy, There is Truth: On Vote Leakage in Ensemble Private Learning Jiaqi Wang

Neural Information Processing Systems

When learning from sensitive data, care must be taken to ensure that training algorithms address privacy concerns. The canonical Private Aggregation of Teacher Ensembles, or P A TE, computes output labels by aggregating the predictions of a (possibly distributed) collection of teacher models via a voting mechanism. The mechanism adds noise to attain a differential privacy guarantee with respect to the teachers' training data. In this work, we observe that this use of noise, which makes P A TE predictions stochastic, enables new forms of leakage of sensitive information. For a given input, our adversary exploits this stochasticity to extract high-fidelity histograms of the votes submitted by the underlying teachers. From these histograms, the adversary can learn sensitive attributes of the input such as race, gender, or age. Although this attack does not directly violate the differential privacy guarantee, it clearly violates privacy norms and expectations, and would not be possible at all without the noise inserted to obtain differential privacy. In fact, counter-intuitively, the attack becomes easier as we add more noise to provide stronger differential privacy. We hope this encourages future work to consider privacy holistically rather than treat differential privacy as a panacea.


Test-Time Collective Prediction

Neural Information Processing Systems

An increasingly common setting in machine learning involves multiple parties, each with their own data, who want to jointly make predictions on future test points. Agents wish to benefit from the collective expertise of the full set of agents to make better predictions than they would individually, but may not be willing to release labeled data or model parameters.



Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

Neural Information Processing Systems

Despite recent progress made by large language models in code generation, they still struggle with programs that meet complex requirements. Recent work utilizes plan-and-solve decomposition to decrease the complexity and leverage self-tests to refine the generated program. Yet, planning deep-inside requirements in advance can be challenging, and the tests need to be accurate to accomplish self-improvement. To this end, we propose FunCoder, a code generation framework incorporating the divide-and-conquer strategy with functional consensus. Specifically, FunCoder recursively branches off sub-functions as smaller goals during code generation, represented by a tree hierarchy. These sub-functions are then composited to attain more complex objectives. Additionally, we designate functions via a consensus formed by identifying similarities in program behavior, mitigating error propagation.


Distributed Low-rank Matrix Factorization With Exact Consensus

Neural Information Processing Systems

Low-rank matrix factorization is a problem of broad importance, owing to the ubiquity of low-rank models in machine learning contexts. In spite of its non-convexity, this problem has a well-behaved geometric landscape, permitting local search algorithms such as gradient descent to converge to global minimizers. In this paper, we study low-rank matrix factorization in the distributed setting, where local variables at each node encode parts of the overall matrix factors, and consensus is encouraged among certain such variables. We identify conditions under which this new problem also has a well-behaved geometric landscape, and we propose an extension of distributed gradient descent (DGD) to solve this problem. The favorable landscape allows us to prove convergence to global optimality with exact consensus, a stronger result than what is provided by off-the-shelf DGD theory.


TDC-Cache: A Trustworthy Decentralized Cooperative Caching Framework for Web3.0

Chen, Jinyu, Shi, Long, Wang, Taotao, Wang, Jiaheng, Zhang, Wei

arXiv.org Artificial Intelligence

Abstract--The rapid growth of Web3.0 is transforming the Internet from a centralized structure to decentralized, which empowers users with unprecedented self-sovereignty over their own data. However, in the context of decentralized data access within Web3.0, it is imperative to cope with efficiency concerns caused by the replication of redundant data, as well as security vulnerabilities caused by data inconsistency. T o address these challenges, we develop a Trustworthy Decentralized Cooperative Caching (TDC-Cache) framework for Web3.0 to ensure efficient caching and enhance system resilience against adversarial threats. This framework features a two-layer architecture, wherein the Decentralized Oracle Network (DON) layer serves as a trusted intermediary platform for decentralized caching, bridging the contents from decentralized storage and the content requests from users. In light of the complexity of Web3.0 network topologies and data flows, we propose a Deep Reinforcement Learning-Based Decentralized Caching (DRL-DC) for TDC-Cache to dynamically optimize caching strategies of distributed oracles. Furthermore, we develop a Proof of Cooperative Learning (PoCL) consensus to maintain the consistency of decentralized caching decisions within DON. Experimental results show that, compared with existing approaches, the proposed framework reduces average access latency by 20%, increases the cache hit rate by at most 18%, and improves the average success consensus rate by 10%. Overall, this paper serves as a first foray into the investigation of decentralized caching framework and strategy for Web3.0. HE rapid evolution of Web3.0 is driving the transition from traditional centralized systems to decentralized architectures. Leveraging blockchain, decentralized storage, and smart contracts, Web3.0 empowers users with unprecedented self-sovereignty over their own data through Decentralized Applications (DApps) [1].


Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning

Xing, Jingyu, Tang, Chenwei, Liu, Xinyu, Xiong, Deng, Huang, Shudong, Ju, Wei, Lv, Jiancheng, Qiao, Ziyue

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) has emerged as a powerful paradigm for advancing Large Language Models (LLMs), achieving remarkable performance in complex reasoning domains such as mathematics and code generation. However, current RL methods face a fundamental scalability bottleneck due to their heavy reliance on human-curated preference data or labeled datasets for reward modeling. To overcome this limitation, we explore RL on unlabeled data where models learn autonomously from continuous experience streams. The core challenge in this setting lies in reliable reward estimation without ground-truth supervision. Existing approaches like Test-Time RL address this through self-consistent consensus, but risk reinforcing incorrect pseudo-labels derived from majority voting. We introduce COMPASS (Composite Path and Answer Self-Scoring), a novel test-time reward mechanism that operates without external supervision. COMPASS integrates two complementary components: the Dual-Calibration Answer Reward (DCAR), which stabilizes training by establishing trustworthy pseudo-labels through confidence and credibility calibration, and the Decisive Path Reward (DPR), which directly optimizes the reasoning process quality beyond mere outcome supervision. By jointly reinforcing trustworthy consensus answers and highly decisive reasoning chains, the COMPASS systematically enhances the model's analytical capabilities. Extensive experiments show that COMPASS achieves significant and consistent performance gains across diverse reasoning tasks and model architectures, advancing a more scalable direction for LLMs to learn from continuous experience.


Many-to-One Adversarial Consensus: Exposing Multi-Agent Collusion Risks in AI-Based Healthcare

Bashir, Adeela, han, The Anh, Shamszaman, Zia Ush

arXiv.org Artificial Intelligence

Abstract--The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also deployed as multi-agent teams to assist AI doctors by debating, voting, or advising on decisions. However, when multiple assistant agents interact, coordinated adversaries can collude to create false consensus, pushing an AI doctor toward harmful prescriptions. We develop an experimental framework with scripted and unscripted doctor agents, adversarial assistants, and a verifier agent that checks decisions against clinical guidelines. Using 50 representative clinical questions, we find that collusion drives the Attack Success Rate (ASR) and Harmful Recommendation Rates (HRR) up to 100% in unprotected systems. This work provides the first systematic evidence of collusion risk in AI healthcare and demonstrates a practical, lightweight defence that ensures guideline fidelity. Artificial intelligence (AI) is increasingly integrated into healthcare IoT systems, supporting tasks such as remote patient monitoring, diagnosis, and treatment recommendations. In this setting, ensuring the security and trustworthiness of AI decisions is critical, since medical errors caused by unsafe recommendations can severely harm patients [1]. However, AI doctors and LLM-based clinical decision agents face multiple vulnerabilities.