

Collaborative Inference




Reimagining Mutual Information for Enhanced Defense against Data Leakage in Collaborative Inference

Neural Information Processing Systems

Edge-cloud collaborative inference empowers resource-limited IoT devices to support deep learning applications without disclosing their raw data to the cloud server, thus protecting users' data. Nevertheless, prior research has shown that collaborative inference still results in the exposure of inputs and predictions from edge devices. To defend against such data leakage in collaborative inference, we introduce InfoScissors, a defense strategy designed to reduce the mutual information between a model's intermediate outcomes and the device's inputs and predictions. We evaluate our defense on several datasets against diverse attacks. Beyond the empirical comparison, we provide a theoretical analysis of the inadequacies of recent defense strategies that also utilize mutual information, particularly those based on the Variational Information Bottleneck (VIB) approach. We demonstrate the superiority of our method and provide a theoretical analysis of it.
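The abstract does not spell out the InfoScissors objective, but the general shape of a mutual-information defense can be sketched: the edge encoder is trained with a penalty that upper-bounds I(X; Z) between the input and the intermediate representation. The sketch below uses the standard variational (VIB-style) Gaussian KL bound purely as an illustration of such a penalty; the abstract argues this particular bound is inadequate, and the coefficient `lam` is a hypothetical hyperparameter, not something taken from the paper.

```python
import numpy as np

def kl_gauss_to_std_normal(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims
    # and averaged over the batch -- a variational upper bound on I(X; Z).
    return np.mean(0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1))

def private_objective(task_loss, mu, log_var, lam=0.1):
    # Total training loss: utility term plus a weighted information penalty
    # that pushes the intermediate representation to leak less about the input.
    return task_loss + lam * kl_gauss_to_std_normal(mu, log_var)
```

When `mu` is zero and `log_var` is zero the encoder already matches the prior, so the penalty vanishes and only the task loss remains; any informative (non-prior) encoding pays a positive cost.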


Posthoc privacy guarantees for collaborative inference with modified Propose-Test-Release

Neural Information Processing Systems

Cloud-based machine learning inference is an emerging paradigm in which users send their data to a service provider, who runs an ML model on it and returns the answer. Due to increased concerns over data privacy, recent works have proposed Collaborative Inference (CI) to learn a privacy-preserving encoding of sensitive user data before it is shared with an untrusted service provider. Existing works so far evaluate the privacy of these encodings through empirical reconstruction attacks. In this work, we develop a new framework that provides formal privacy guarantees for an arbitrarily trained neural network by linking its local Lipschitz constant with its local sensitivity. To guarantee privacy using local sensitivity, we extend the Propose-Test-Release (PTR) framework to make it tractable for neural network queries. We verify the efficacy of our framework experimentally on real-world datasets and elucidate the role of Adversarial Representation Learning (ARL) in improving the privacy-utility trade-off.
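The classic Propose-Test-Release recipe that this work extends can be sketched generically: propose a bound on local sensitivity, privately test that the data is far from any neighbor where the bound fails, and only then release a noisy answer calibrated to the proposed bound. The sketch below is the textbook scalar version, not the paper's neural-network extension; `dist_to_unstable` is a hypothetical oracle standing in for the (generally hard) distance-to-instability computation that the paper makes tractable via Lipschitz constants.

```python
import numpy as np

rng = np.random.default_rng(0)

def propose_test_release(query, data, beta, dist_to_unstable, eps, delta):
    """Classic PTR for a scalar query.

    beta: proposed bound on the local sensitivity of `query` at `data`.
    dist_to_unstable(data, beta): how many records must change before the
        local sensitivity can exceed beta (assumed given by an oracle).
    Returns a noisy answer, or None ("refuse") if the test fails.
    """
    # Test step: the noisy distance must clear the threshold log(1/delta)/eps.
    d_hat = dist_to_unstable(data, beta) + rng.laplace(scale=1.0 / eps)
    if d_hat <= np.log(1.0 / delta) / eps:
        return None
    # Release step: Laplace noise calibrated to the *proposed* bound beta.
    return query(data) + rng.laplace(scale=beta / eps)
```

If the dataset sits right at an instability (distance near zero), the test almost surely fails and the mechanism refuses to answer, which is exactly what gives PTR its (eps, delta) guarantee.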


Action Deviation-Aware Inference for Low-Latency Wireless Robots

Park, Jeyoung, Lim, Yeonsub, Oh, Seungeun, Park, Jihong, Choi, Jinho, Kim, Seong-Lyun

arXiv.org Artificial Intelligence

To support latency-sensitive AI applications ranging from autonomous driving to industrial robot manipulation, 6G envisions distributed ML with computational resources in mobile, edge, and cloud connected over hyper-reliable low-latency communication (HRLLC). In this setting, speculative decoding can facilitate collaborative inference of distributively deployed models: a lightweight on-device model locally generates drafts while a more capable remote target model on a server verifies and corrects them in parallel with speculative sampling, resulting in lower latency without compromising accuracy. However, unlike autoregressive text generation, behavior cloning policies, typically used for embodied AI applications, cannot parallelize verification and correction across multiple drafts, as each generated action depends on an observation updated by the previous action. To this end, we propose Action Deviation-Aware Hybrid Inference (ADAHI), wherein drafts are selectively transmitted and verified based on action deviation, which correlates strongly with the action's rejection probability under the target model. By invoking server operation only when necessary, communication and computational overhead can be reduced while the accuracy gain from speculative sampling is preserved. Experiments on our testbed show that ADAHI reduces transmission and server operations by approximately 40%, lowers end-to-end latency by 39.2%, and attains up to 97.2% of the task-success rate of the baseline that invokes speculative sampling for every draft embedding vector.
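The gating idea behind ADAHI can be sketched in a few lines: measure how far the on-device draft action deviates from the last committed action, and invoke the remote target model only when that deviation is large enough to suggest the draft would likely be rejected. The deviation metric (Euclidean norm), the `threshold` value, and the `server_verify` callback below are all illustrative stand-ins; the abstract does not specify the exact metric or threshold used.

```python
import numpy as np

def should_verify(draft_action, prev_action, threshold):
    # Gate server verification on the action deviation: a draft that barely
    # deviates from the previously committed action is unlikely to be
    # rejected by the target model, so skip the round-trip.
    deviation = np.linalg.norm(draft_action - prev_action)
    return deviation > threshold

def hybrid_step(draft_action, prev_action, server_verify, threshold=0.5):
    """One control step of selective (hybrid) inference.

    server_verify: hypothetical callback standing in for speculative
    verification/correction by the remote target model; returns the
    corrected action. Returns (action, verified_remotely).
    """
    if should_verify(draft_action, prev_action, threshold):
        return server_verify(draft_action), True   # verified on the server
    return draft_action, False                     # committed locally
```

Because each action depends on the observation produced by the previous one, this per-step gate is what lets the system skip server round-trips entirely on low-deviation steps instead of batching drafts as in text-generation speculative decoding.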




CoFormer: Collaborating with Heterogeneous Edge Devices for Scalable Transformer Inference

Xu, Guanyu, Hao, Zhiwei, Shen, Li, Luo, Yong, Sun, Fuhui, Wang, Xiaoyan, Hu, Han, Wen, Yonggang

arXiv.org Artificial Intelligence

The impressive performance of transformer models has sparked the deployment of intelligent applications on resource-constrained edge devices. However, ensuring high-quality service for real-time edge systems is a significant challenge due to the considerable computational demands and resource requirements of these models. Existing strategies typically either offload transformer computations to other devices or directly deploy compressed models on individual edge devices. To tackle these challenges, we propose a collaborative inference system for general transformer models, termed CoFormer. The central idea behind CoFormer is to exploit the divisibility and integrability of transformers: an off-the-shelf large transformer can be decomposed into multiple smaller models for distributed inference, and their intermediate results are aggregated to generate the final output. We formulate an optimization problem to minimize both inference latency and accuracy degradation under heterogeneous hardware constraints. The DeBo algorithm is proposed to first solve the optimization problem to derive the decomposition policy, and then progressively calibrate the decomposed models to restore performance. We demonstrate the capability to support a wide range of transformer models on heterogeneous edge devices, achieving up to 3.1× inference speedup with large transformer models. Notably, CoFormer enables the efficient inference of GPT2-XL with 1.6 billion parameters on edge devices, reducing memory requirements by 76.3%. CoFormer can also reduce energy consumption by approximately 40% while maintaining satisfactory inference performance. CoFormer significantly outperforms other methods; specifically, it accelerates inference by 3.1× compared to Swin-L [4] with only a 1.7% accuracy drop.

Guanyu Xu, Zhiwei Hao, and Han Hu are with the School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China. Li Shen is with the School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China. Yong Luo is with the School of Computer Science, National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan 430072, China. Fuhui Sun and Xiaoyan Wang are with the Information Technology Service Center of People's Court, Beijing 100745, China. Yonggang Wen is with the College of Computing and Data Science, Nanyang Technological University, Singapore 639798.
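The "divisibility and integrability" claim has a concrete, verifiable instance: multi-head attention is an exact sum of per-head contributions through the output projection, so heads can run on different devices and their partial results can simply be added back together. The abstract does not say CoFormer partitions along heads specifically, so the sketch below is only an illustration of why such a decomposition can be lossless before calibration; the round-robin head assignment is an arbitrary choice.

```python
import numpy as np

def head_output(x, w_q, w_k, w_v, w_o):
    # One attention head followed by its slice of the output projection.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(w_q.shape[1])
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    return (attn @ v) @ w_o

def full_attention(x, heads):
    # Monolithic multi-head attention: sum of per-head contributions.
    return sum(head_output(x, *h) for h in heads)

def distributed_attention(x, heads, n_devices):
    # Decompose: assign heads round-robin to devices, run each partial
    # model independently, then integrate by adding the partial results.
    partials = [sum(head_output(x, *h) for h in heads[d::n_devices])
                for d in range(n_devices)]
    return sum(partials)
```

Because the aggregation is a plain sum, the distributed result matches the monolithic one exactly; in practice, accuracy loss comes from further compressing each partial model, which is what the progressive calibration step would then repair.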


