We thank the reviewers for their careful reading of the manuscript and their constructive suggestions. Chimera supports switching between BFV and TFHE, while Glyph enables switching between BGV and TFHE. Some users may not have such large network bandwidth. In contrast, Glyph first trains a CNN model on a public plaintext dataset. Apart from sending the encrypted input data, the client is not involved in Glyph's training.
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
Wang, Xujia, Qi, Yunjia, Xu, Bin
Parameter-Efficient Fine-Tuning (PEFT) methods, such as LoRA, significantly reduce the number of trainable parameters by introducing low-rank decomposition matrices. However, existing methods perform extensive matrix multiplications in domain specialization tasks, resulting in computational inefficiency and sub-optimal fine-tuning performance. Hence, we propose LoSiA (Low-Resources Subnet Integration Adaptation), an innovative method that dynamically localizes and optimizes critical parameters during the training process. Specifically, it identifies a sub-network using gradient sparsity analysis and optimizes it as the trainable target. This design enables effective high-rank adaptation by updating only the sub-network parameters, reducing additional matrix multiplications. We also present LoSiA-Pro, a faster implementation of LoSiA, which reduces training latency by about $27\%$ compared to LoRA. Extensive evaluations show that our method achieves minimal performance drop compared to full fine-tuning, while requiring the least training time across domain specialization and common-sense reasoning tasks. Further analysis shows that LoSiA also reduces forgetting during continued training. The source code is available at https://github.com/KlozeWang/LoSiA.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
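The LoSiA abstract above contrasts rank-bounded LoRA updates with updates restricted to a gradient-selected sub-network. A minimal NumPy sketch of that contrast, with invented shapes and a simple top-k magnitude rule standing in for LoSiA's actual gradient sparsity analysis:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

# LoRA-style update: product of two low-rank factors, so rank <= r.
A = rng.standard_normal((d_out, r)) * 0.01
B = rng.standard_normal((r, d_in)) * 0.01
lora_update = A @ B

# Subnet-style alternative (toy stand-in for LoSiA): keep only the k
# coordinates with the largest gradient magnitude and update just those.
G = rng.standard_normal((d_out, d_in))   # pretend gradient of the weight
k = 256                                  # trainable-parameter budget
thresh = np.partition(np.abs(G).ravel(), -k)[-k]
mask = np.abs(G) >= thresh               # sparse "critical" subnet
subnet_update = -0.1 * G * mask          # sparse, but not rank-limited

print(np.linalg.matrix_rank(lora_update),
      np.linalg.matrix_rank(subnet_update))
```

The sparse update touches the same number of parameters as a modest-rank LoRA yet is not confined to a rank-r subspace, which is the sense in which subnet adaptation can be "high-rank".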
SALT: A Lightweight Model Adaptation Method for Closed Split Computing Environments
We propose SALT (Split-Adaptive Lightweight Tuning), a lightweight model adaptation framework for Split Computing under closed constraints, where the head and tail networks are proprietary and inaccessible to users. In such closed environments, conventional adaptation methods are infeasible since they require access to model parameters or architectures. SALT addresses this challenge by introducing a compact, trainable adapter on the client side to refine latent features from the head network, enabling user-specific adaptation without modifying the original models or increasing communication overhead. We evaluate SALT on user-specific classification tasks with CIFAR-10 and CIFAR-100, demonstrating improved accuracy with lower training latency compared to fine-tuning methods. With minimal deployment overhead, SALT offers a practical solution for personalized inference in edge AI systems under strict system constraints. The increasing scale of deep learning models deployed in cloud-based AI services has raised concerns regarding server-side computational load and inference latency. To address these challenges, Split Computing has emerged as a promising paradigm that offloads part of a large cloud-based model to the client device [1], [2]. In this architecture, the neural network model is partitioned into a head network executed on the client and a tail network retained on the cloud.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Africa > South Africa > Western Cape > Cape Town (0.04)
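The SALT entry above describes a closed split-computing pipeline where only a client-side adapter is trainable. A toy NumPy sketch of that data flow, with all names, shapes, and the residual adapter form invented for illustration (not SALT's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen, opaque networks: the user cannot modify either of these.
W_head = rng.standard_normal((8, 16))    # proprietary head weights
W_tail = rng.standard_normal((16, 3))    # proprietary tail weights

def head(x):                             # runs on the client device
    return np.tanh(x @ W_head)

def tail(z):                             # runs on the cloud server
    return z @ W_tail

# Client-side adapter: a residual map on the latent features. Input and
# output dimensions match, so the communication payload is unchanged.
A = np.zeros((16, 16))                   # trainable; identity residual at init

def adapter(z):
    return z + z @ A

x = rng.standard_normal((4, 8))          # a batch of client inputs
z = head(x)
y = tail(adapter(z))
print(z.shape, y.shape)                  # latent stays 16-dim end to end
```

Because the adapter preserves the latent dimension and sits entirely on the client, adapting it changes neither the proprietary models nor the bytes sent over the link, which is the constraint the abstract emphasizes.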
HASFL: Heterogeneity-aware Split Federated Learning over Edge Computing Systems
Lin, Zheng, Chen, Zhe, Chen, Xianhao, Ni, Wei, Gao, Yue
Split federated learning (SFL) has emerged as a promising paradigm to democratize machine learning (ML) on edge devices by enabling layer-wise model partitioning. However, existing SFL approaches suffer significantly from the straggler effect due to the heterogeneous capabilities of edge devices. To address this fundamental challenge, we propose adaptively controlling batch sizes (BSs) and model splitting (MS) for edge devices to overcome resource heterogeneity. We first derive a tight convergence bound of SFL that quantifies the impact of varied BSs and MS on learning performance. Based on the convergence bound, we propose HASFL, a heterogeneity-aware SFL framework capable of adaptively controlling BS and MS to balance communication-computing latency and training convergence in heterogeneous edge networks. Extensive experiments with various datasets validate the effectiveness of HASFL and demonstrate its superiority over state-of-the-art benchmarks. Conventional machine learning (ML) frameworks predominantly rely on centralized learning (CL), where raw data is gathered and processed at a central server for model training. However, CL is often impractical due to its high communication latency, increased backbone traffic, and privacy risks [1]-[4]. To address these limitations, federated learning (FL) [5], [6] has emerged as a promising alternative that allows participating devices to collaboratively train a shared model by exchanging model parameters (e.g., gradients) rather than raw data, thereby protecting data privacy and reducing communication costs [7], [8]. Despite its advantages, the on-device training of FL poses a significant challenge for deployment on resource-constrained edge devices as ML models scale up [9], [10].
- Asia > China > Shanghai > Shanghai (0.04)
- Asia > China > Hong Kong (0.04)
- Oceania > Australia > New South Wales > Kensington (0.04)
- Asia > Middle East > Jordan (0.04)
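The HASFL entry above controls per-device batch sizes to blunt the straggler effect. The simplest version of that idea can be sketched in a few lines: assign each device a batch size proportional to its compute speed so per-round computation time is roughly equalized. This is a crude stand-in for HASFL's convergence-aware joint control of batch size and model split, and all numbers here are invented:

```python
# Three heterogeneous edge devices with different compute speeds.
speeds = [4.0, 2.0, 1.0]        # samples per second, per device (invented)
total_batch = 140               # global per-round batch budget

# Proportional allocation: faster devices get larger batches.
share = sum(speeds)
batch_sizes = [round(total_batch * s / share) for s in speeds]

# With proportional batches, per-device compute time is equalized.
compute_times = [b / s for b, s in zip(batch_sizes, speeds)]
print(batch_sizes, compute_times)
```

Equal per-round compute times mean no device idles waiting for a straggler; HASFL's contribution is choosing these knobs (together with the split point) so that the convergence bound, not just raw latency, stays favorable.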
Split Federated Learning Over Heterogeneous Edge Devices: Algorithm and Optimization
Sun, Yunrui, Hu, Gang, Teng, Yinglei, Cai, Dunbo
Split Learning (SL) is a promising collaborative machine learning approach, enabling resource-constrained devices to train models without sharing raw data, while reducing computational load and preserving privacy simultaneously. However, current SL algorithms face limitations in training efficiency and suffer from prolonged latency, particularly in sequential settings, where the slowest device can bottleneck the entire process due to heterogeneous resources and frequent data exchanges between clients and servers. To address these challenges, we propose the Heterogeneous Split Federated Learning (HSFL) framework, which allows resource-constrained clients to train their personalized client-side models in parallel, utilizing different cut layers. Aiming to mitigate the impact of heterogeneous environments and accelerate the training process, we formulate a latency minimization problem that jointly optimizes computational and transmission resources. Additionally, we design a resource allocation algorithm that combines Sample Average Approximation (SAA), Genetic Algorithm (GA), Lagrangian relaxation, and Branch and Bound (B&B) methods to solve this problem efficiently. Simulation results demonstrate that HSFL outperforms other frameworks in both convergence rate and model accuracy on heterogeneous devices with non-IID data, while the proposed optimization algorithm reduces latency more effectively than other baseline methods.
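The per-client cut-layer choice at the heart of the HSFL formulation can be illustrated with a toy cost model: client latency is client-side compute time plus the time to upload the activations produced at the cut. All constants below are invented, and this greedy per-client rule is only a stand-in for the joint SAA/GA/Lagrangian/B&B optimization the paper actually solves:

```python
# Cumulative client-side work and activation payload for each candidate cut.
layer_flops = [10, 20, 40, 80]    # work done on the client per cut (invented)
act_sizes   = [64, 32, 16, 8]     # activation size uploaded per cut (invented)

def best_cut(compute_speed, bandwidth):
    """Pick the cut layer minimizing compute time + transmission time."""
    costs = [f / compute_speed + a / bandwidth
             for f, a in zip(layer_flops, act_sizes)]
    return min(range(len(costs)), key=lambda i: costs[i])

# A fast device on a slow link prefers a deep cut (small upload);
# a slow device on a fast link prefers a shallow cut (little local work).
print(best_cut(compute_speed=100.0, bandwidth=1.0),
      best_cut(compute_speed=1.0, bandwidth=100.0))
```

This captures why heterogeneous clients should get different cut layers: the compute/communication trade-off lands in a different place for each device.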
Glyph: Fast and Accurately Training Deep Neural Networks on Encrypted Data
Because they lack the expertise to benefit from their data on their own, average users have to upload their private data to cloud servers they may not trust. Due to legal or privacy constraints, most users are willing to contribute only their encrypted data, and lack the interest or resources to join deep neural network (DNN) training in the cloud. Fully homomorphic encryption (FHE) makes training directly on encrypted data possible, but prior approaches implement nonlinear activations through table lookups, and such inefficient lookup-table-based activations significantly prolong the private training latency of DNNs. In this paper, we propose Glyph, an FHE-based technique to quickly and accurately train DNNs on encrypted data by switching between the TFHE (Fast Fully Homomorphic Encryption over the Torus) and BGV cryptosystems. Glyph uses logic-operation-friendly TFHE to implement nonlinear activations, while adopting vectorial-arithmetic-friendly BGV to perform multiply-accumulations (MACs). Glyph further applies transfer learning to DNN training to improve test accuracy and to reduce the number of ciphertext-ciphertext MACs in convolutional layers.
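Why switching schemes per operation helps can be shown with a toy cost model. The relative costs below are invented for illustration only (they are not Glyph's measured numbers); they encode the abstract's claim that BGV is cheap for MACs but expensive for nonlinear activations, and TFHE the reverse:

```python
# Invented per-operation costs: BGV is arithmetic-friendly, TFHE is
# logic/lookup-friendly. Real FHE costs depend heavily on parameters.
COST = {
    ("mac", "bgv"): 1.0,  ("mac", "tfhe"): 50.0,
    ("act", "bgv"): 200.0, ("act", "tfhe"): 5.0,
}
layers = ["mac", "act", "mac", "act", "mac"]   # tiny alternating network

def total(scheme_for):
    """Total cost when each op runs under the scheme chosen for it."""
    return sum(COST[(op, scheme_for(op))] for op in layers)

tfhe_only = total(lambda op: "tfhe")                          # single scheme
switched  = total(lambda op: "bgv" if op == "mac" else "tfhe")  # Glyph-style
print(tfhe_only, switched)
```

Even before accounting for the (nontrivial) cost of converting ciphertexts between schemes, letting each layer type use its cheaper cryptosystem dominates a single-scheme baseline in this toy model.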
Heterogeneity-Aware Resource Allocation and Topology Design for Hierarchical Federated Edge Learning
Gao, Zhidong, Zhang, Yu, Gong, Yanmin, Guo, Yuanxiong
Federated Learning (FL) provides a privacy-preserving framework for training machine learning models on mobile edge devices. Traditional FL algorithms, e.g., FedAvg, impose a heavy communication workload on these devices. To mitigate this issue, Hierarchical Federated Edge Learning (HFEL) has been proposed, leveraging edge servers as intermediaries for model aggregation. Despite its effectiveness, HFEL encounters challenges such as a slow convergence rate and high resource consumption, particularly in the presence of system and data heterogeneity. Moreover, existing works mainly focus on improving training efficiency for traditional FL, leaving the efficiency of HFEL largely unexplored. In this paper, we consider a two-tier HFEL system, where edge devices are connected to edge servers and edge servers are interconnected through peer-to-peer (P2P) edge backhauls. Our goal is to enhance the training efficiency of the HFEL system through strategic resource allocation and topology design. Specifically, we formulate an optimization problem to minimize the total training latency by allocating the computation and communication resources, as well as adjusting the P2P connections. To ensure convergence under dynamic topologies, we analyze the convergence error bound and introduce a model consensus constraint into the optimization problem. The proposed problem is then decomposed into several subproblems, enabling us to solve them alternately online. Our method facilitates the efficient implementation of large-scale FL at edge networks under data and system heterogeneity. Comprehensive experimental evaluation on benchmark datasets validates the effectiveness of the proposed method, demonstrating significant reductions in training latency while maintaining model accuracy compared to various baselines.
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)
- Information Technology > Security & Privacy (1.00)
- Telecommunications (0.93)
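The HFEL entry above relies on edge servers reaching model consensus over a P2P topology. The standard mechanism for that is gossip averaging with a doubly stochastic mixing matrix; the sketch below uses a fixed ring of four servers with invented weights (HFEL additionally optimizes which links to use and must respect its consensus constraint under dynamic topologies):

```python
import numpy as np

# Symmetric doubly stochastic mixing matrix for a ring of 4 edge servers:
# each server averages itself (weight 0.5) with its two neighbors (0.25 each).
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
models = np.array([[0.0], [4.0], [8.0], [4.0]])   # one scalar "model" each

for _ in range(20):          # repeated P2P gossip rounds
    models = W @ models      # each server mixes with its neighbors only

print(models.ravel())        # all entries approach the global average, 4.0
```

Because W is doubly stochastic, each round preserves the average and shrinks disagreement geometrically; a denser (but costlier) topology shrinks it faster, which is exactly the latency-versus-consensus trade-off the paper's topology design navigates.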