AITopics | split inference

Collaborating Authors

split inference

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

GAN You See Me? Enhanced Data Reconstruction Attacks against Split Inference

Neural Information Processing SystemsDec-26-2025, 13:02:37 GMT

Split Inference (SI) is an emerging deep learning paradigm that addresses computational constraints on edge devices and preserves data privacy through collaborative edge-cloud approaches. However, SI is vulnerable to Data Reconstruction Attacks (DRA), which aim to reconstruct users' private prediction instances. Existing attack methods suffer from various limitations. Optimization-based DRAs do not leverage public data effectively, while Learning-based DRAs depend heavily on auxiliary data quantity and distribution similarity. Consequently, these approaches yield unsatisfactory attack results and are sensitive to defense mechanisms. To overcome these challenges, we propose a GAN-based LAtent Space Search attack (GLASS) that harnesses abundant prior knowledge from public data using advanced StyleGAN technologies. Additionally, we introduce GLASS++ to enhance reconstruction stability. Our approach represents the first GAN-based DRA against SI, and extensive evaluation across different split points and adversary setups demonstrates its state-of-the-art performance. Moreover, we thoroughly examine seven defense mechanisms, highlighting our method's capability to reveal private information even in the presence of these defenses.

enhanced data reconstruction attack, name change, split inference, (4 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.78)
Information Technology > Security & Privacy (0.60)

Add feedback

Joint Partitioning and Placement of Foundation Models for Real-Time Edge AI

Djuhera, Aladin, Koch, Fernando, Binotto, Alecio

arXiv.org Artificial IntelligenceDec-2-2025

Static partitioning of model layers presumes temporal stability across compute and network resources, which is misaligned with the volatility of real-world deployments. We introduce a framework in which both the spatial placement and internal segmentation of foundation models are elevated to runtime-resolved constructs. The orchestration problem is formalized as a constrained optimization over layer-wise assignments, subject to evolving latency, utilization, and privacy gradients. The framework implements reactive inference composition responsive to infrastructural fluctuations by integrating model-aware capacity profiling with dynamic graph re-partitioning and reallocation. We introduce architectural and algorithmic components, along with a representative use case in 6G multi-access edge computing.

machine learning, natural language, node, (18 more...)

arXiv.org Artificial Intelligence

2512.01039

Country: Europe > Germany (0.28)

Genre: Research Report (0.65)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Natural Language (0.72)
(2 more...)

Add feedback

PrivDFS: Private Inference via Distributed Feature Sharing against Data Reconstruction Attacks

Liu, Zihan, Wen, Jiayi, Wu, Junru, Zou, Xuyang, Tan, Shouhong, Zheng, Zhirun, Huang, Cheng

arXiv.org Artificial IntelligenceNov-17-2025

In this paper, we introduce PrivDFS, a distributed feature-sharing framework for input-private inference in image classification. A single holistic intermediate representation in split inference gives diffusion-based Data Reconstruction Attacks (DRAs) sufficient signal to reconstruct the input with high fidelity. PrivDFS restructures this vulnerability by fragmenting the representation and processing the fragments independently across a majority-honest set of servers. As a result, each branch observes only an incomplete and reconstruction-insufficient view of the input. To realize this, PrivDFS employs learnable binary masks that partition the intermediate representation into sparse and largely non-overlapping feature shares, each processed by a separate server, while a lightweight fusion module aggregates their predictions on the client. This design preserves full task accuracy when all branches are combined, yet sharply limits the reconstructive power available to any individual server. PrivDFS applies seamlessly to both ResNet-based CNNs and Vision Transformers. Across CIFAR-10/100, CelebA, and ImageNet-1K, PrivDFS induces a pronounced collapse in DRA performance, e.g., on CIFAR-10, PSNR drops from 23.25 -> 12.72 and SSIM from 0.963 -> 0.260, while maintaining accuracy within 1% of non-private split inference. These results establish structural feature partitioning as a practical and architecture-agnostic approach to reducing reconstructive leakage in cloud-based vision inference.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2508.04346

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Add feedback

Intelligent Orchestration of Distributed Large Foundation Model Inference at the Edge

Koch, Fernando, Djuhera, Aladin, Binotto, Alecio

arXiv.org Artificial IntelligenceNov-12-2025

Large Foundation Models (LFMs), including multi-modal and generative models, promise to unlock new capabilities for next-generation Edge AI applications. However, performing inference with LFMs in resource-constrained and heterogeneous edge environments, such as Multi-access Edge Computing (MEC), presents significant challenges for workload orchestration due to time-varying network, compute, and storage conditions. In particular, current split inference strategies, which partition LFM layers across nodes, are not designed to adapt to fluctuating workloads, dynamic bandwidth conditions, or evolving privacy constraints in high-utilization MEC environments. In this work, we propose a novel adaptive split inference orchestration framework that elevates both the placement and partitioning of LFM layers to runtime-tunable variables. Specifically, our framework enables real-time, quality-of-service (QoS)-aware management of inference workloads by extending conventional orchestrators with three key services: (1) Capacity-aware workload distribution, which continuously profiles node resources and selects an optimal subset of MEC nodes; (2) Dynamic partition migration, which transparently relocates pre-cut LFM segments in response to changes in utilization or network conditions; (3) Real-time reconfiguration, which dynamically re-splits LFM layers to balance latency, throughput, and privacy. We formalize the joint placement-partitioning problem, outline a reference architecture and algorithmic workflow, and discuss applicability in representative smart city, V2X, and industrial edge scenarios.

machine learning, node, real time system, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.37256/cnc.3220256807

2504.03668

Country: Europe (0.46)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.69)
Transportation (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

GAN You See Me? Enhanced Data Reconstruction Attacks against Split Inference

Neural Information Processing SystemsJan-19-2025, 18:36:36 GMT

data reconstruction attack, enhanced data reconstruction attack, split inference, (3 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.83)
Information Technology > Security & Privacy (0.64)

Add feedback

Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing

Malekzadeh, Mohammad, Kawsar, Fahim

arXiv.org Artificial IntelligenceJan-19-2024

In split inference, a deep neural network (DNN) is partitioned to run the early part of the DNN at the edge and the later part of the DNN in the cloud. This meets two key requirements for on-device machine learning: input privacy and computation efficiency. Still, an open question in split inference is output privacy, given that the outputs of the DNN are observable in the cloud. While encrypted computing can protect output privacy too, homomorphic encryption requires substantial computation and communication resources from both edge and cloud devices. In this paper, we introduce Salted DNNs: a novel approach that enables clients at the edge, who run the early part of the DNN, to control the semantic interpretation of the DNN's outputs at inference time. Our proposed Salted DNNs maintain classification accuracy and computation efficiency very close to the standard DNN counterparts. Experimental evaluations conducted on both images and wearable sensor data demonstrate that Salted DNNs attain classification accuracy very close to standard DNNs, particularly when the Salted Layer is positioned within the early part to meet the requirements of split inference. Our approach is general and can be applied to various types of DNNs. As a benchmark for future studies, we open-source our code.

dnn, inference, split inference, (14 more...)

arXiv.org Artificial Intelligence

2310.13384

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Energy-Efficient Power Control for Multiple-Task Split Inference in UAVs: A Tiny Learning-Based Approach

Zhao, Chenxi, Sheng, Min, Liu, Junyu, Chu, Tianshu, Li, Jiandong

arXiv.org Artificial IntelligenceDec-31-2023

The limited energy and computing resources of unmanned aerial vehicles (UAVs) hinder the application of aerial artificial intelligence. The utilization of split inference in UAVs garners significant attention due to its effectiveness in mitigating computing and energy requirements. However, achieving energy-efficient split inference in UAVs remains complex considering of various crucial parameters such as energy level and delay constraints, especially involving multiple tasks. In this paper, we present a two-timescale approach for energy minimization in split inference, where discrete and continuous variables are segregated into two timescales to reduce the size of action space and computational complexity. This segregation enables the utilization of tiny reinforcement learning (TRL) for selecting discrete transmission modes for sequential tasks. Moreover, optimization programming (OP) is embedded between TRL's output and reward function to optimize the continuous transmit power. Specifically, we replace the optimization of transmit power with that of transmission time to decrease the computational complexity of OP since we reveal that energy consumption monotonically decreases with increasing transmission time. The replacement significantly reduces the feasible region and enables a fast solution according to the closed-form expression for optimal transmit power. Simulation results show that the proposed algorithm can achieve a higher probability of successful task completion with lower energy consumption.

constraint, time slot, transmit power, (15 more...)

arXiv.org Artificial Intelligence

2401.00445

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Renewable > Solar (0.47)
Information Technology > Security & Privacy (0.46)
Law > Statutes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
(2 more...)

Add feedback

Bounding the Invertibility of Privacy-preserving Instance Encoding using Fisher Information

Maeng, Kiwan, Guo, Chuan, Kariyappa, Sanjay, Suh, G. Edward

arXiv.org Artificial IntelligenceMay-6-2023

Privacy-preserving instance encoding aims to encode raw data as feature vectors without revealing their privacy-sensitive information. When designed properly, these encodings can be used for downstream ML applications such as training and inference with limited privacy risk. However, the vast majority of existing instance encoding schemes are based on heuristics and their privacy-preserving properties are only validated empirically against a limited set of attacks. In this paper, we propose a theoretically-principled measure for the privacy of instance encoding based on Fisher information. We show that our privacy measure is intuitive, easily applicable, and can be used to bound the invertibility of encodings both theoretically and empirically.

data mining, dfil, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2305.04146

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(13 more...)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.82)

Add feedback