Goto

Collaborating Authors

 resource budget


Trading Off Resource Budgets For Improved Regret Bounds

Neural Information Processing Systems

In this work we consider a variant of adversarial online learning where in each round one picks $B$ out of $N$ arms and incurs cost equal to the $\textit{minimum}$ of the costs of each arm chosen. We propose an algorithm called Follow the Perturbed Multiple Leaders (FPML) for this problem, which we show (by adapting the techniques of Kalai and Vempala [2005]) achieves expected regret $\mathcal{O}(T^{\frac{1}{B+1}}\ln(N)^{\frac{B}{B+1}})$ over time horizon $T$ relative to the $\textit{single}$ best arm in hindsight. This introduces a trade-off between the budget $B$ and the single-best-arm regret, and we proceed to investigate several applications of this trade-off. First, we observe that algorithms which use standard regret minimizers as subroutines can sometimes be adapted by replacing these subroutines with FPML, and we use this to generalize existing algorithms for Online Submodular Function Maximization [Streeter and Golovin, 2008] in both the full feedback and semi-bandit feedback settings. Next, we empirically evaluate our new algorithms on an online black-box hyperparameter optimization problem. Finally, we show how FPML can lead to new algorithms for Linear Programming which require stronger oracles at the benefit of fewer oracle calls.


FLAME: Towards Federated Fine-Tuning Large Language Models Through Adaptive SMoE

Le, Khiem, Tran, Tuan, Hua, Ting, Chawla, Nitesh V.

arXiv.org Artificial Intelligence

Existing resource-adaptive LoRA federated fine-tuning methods enable clients to fine-tune models using compressed versions of global LoRA matrices, in order to accommodate various compute resources across clients. This compression requirement will lead to suboptimal performance due to information loss. To address this, we propose FLAME, a novel federated learning framework based on the Sparse Mixture-of-Experts (SMoE) architecture. Unlike prior approaches, FLAME retains full (uncompressed) global LoRA matrices and achieves client-side adaptability by varying the number of activated experts per client. However, incorporating SMoE into federated learning introduces unique challenges, specifically, the mismatch in output magnitude from partial expert activation and the imbalance in expert training quality across clients. FLAME tackles these challenges through a lightweight rescaling mechanism and an activation-aware aggregation scheme. Empirical results across diverse computational settings demonstrate that FLAME consistently outperforms existing methods, providing a robust and effective solution for resource-adaptive federated learning.


Resource Constrained Pathfinding with Enhanced Bidirectional A* Search

Ahmadi, Saman, Raith, Andrea, Tack, Guido, Jalili, Mahdi

arXiv.org Artificial Intelligence

The classic Resource Constrained Shortest Path (RCSP) problem aims to find a cost optimal path between a pair of nodes in a network such that the resources used in the path are within a given limit. Having been studied for over a decade, RCSP has seen recent solutions that utilize heuristic-guided search to solve the constrained problem faster. Building upon the bidirectional A* search paradigm, this research introduces a novel constrained search framework that uses efficient pruning strategies to allow for accelerated and effective RCSP search in large-scale networks. Results show that, compared to the state of the art, our enhanced framework can significantly reduce the constrained search time, achieving speed-ups of over to two orders of magnitude.


Trading Off Resource Budgets For Improved Regret Bounds

Neural Information Processing Systems

In this work we consider a variant of adversarial online learning where in each round one picks B out of N arms and incurs cost equal to the \textit{minimum} of the costs of each arm chosen. We propose an algorithm called Follow the Perturbed Multiple Leaders (FPML) for this problem, which we show (by adapting the techniques of Kalai and Vempala [2005]) achieves expected regret \mathcal{O}(T {\frac{1}{B 1}}\ln(N) {\frac{B}{B 1}}) over time horizon T relative to the \textit{single} best arm in hindsight. This introduces a trade-off between the budget B and the single-best-arm regret, and we proceed to investigate several applications of this trade-off. First, we observe that algorithms which use standard regret minimizers as subroutines can sometimes be adapted by replacing these subroutines with FPML, and we use this to generalize existing algorithms for Online Submodular Function Maximization [Streeter and Golovin, 2008] in both the full feedback and semi-bandit feedback settings. Next, we empirically evaluate our new algorithms on an online black-box hyperparameter optimization problem.


RHFedMTL: Resource-Aware Hierarchical Federated Multi-Task Learning

Yi, Xingfu, Li, Rongpeng, Peng, Chenghui, Wang, Fei, Wu, Jianjun, Zhao, Zhifeng

arXiv.org Artificial Intelligence

The rapid development of artificial intelligence (AI) over massive applications including Internet-of-things on cellular network raises the concern of technical challenges such as privacy, heterogeneity and resource efficiency. Federated learning is an effective way to enable AI over massive distributed nodes with security. However, conventional works mostly focus on learning a single global model for a unique task across the network, and are generally less competent to handle multi-task learning (MTL) scenarios with stragglers at the expense of acceptable computation and communication cost. Meanwhile, it is challenging to ensure the privacy while maintain a coupled multi-task learning across multiple base stations (BSs) and terminals. In this paper, inspired by the natural cloud-BS-terminal hierarchy of cellular works, we provide a viable resource-aware hierarchical federated MTL (RHFedMTL) solution to meet the heterogeneity of tasks, by solving different tasks within the BSs and aggregating the multi-task result in the cloud without compromising the privacy. Specifically, a primal-dual method has been leveraged to effectively transform the coupled MTL into some local optimization sub-problems within BSs. Furthermore, compared with existing methods to reduce resource cost by simply changing the aggregation frequency, we dive into the intricate relationship between resource consumption and learning accuracy, and develop a resource-aware learning strategy for local terminals and BSs to meet the resource budget. Extensive simulation results demonstrate the effectiveness and superiority of RHFedMTL in terms of improving the learning accuracy and boosting the convergence rate.


Strategically revealing capabilities in General Lotto games

Paarporn, Keith, Brown, Philip N.

arXiv.org Artificial Intelligence

In this paper, we address this question in the context of General Lotto games, a class of two-player competitive resource allocation models. We consider an asymmetric information setting where the opponent is uncertain about the resource budget of the other player, and holds a prior belief on its value. We assume the other player, called the signaler, is able to send a noisy signal about its budget to the opponent. With its updated belief, the opponent then must decide to invest in costly resources that it will deploy against the signaler's resource budget in a General Lotto game. We derive the subgame perfect equilibrium to this extensive-form game. In particular, we identify necessary and sufficient conditions for which a signaling policy improves the signaler's resulting performance in comparison to the scenario where it does not send any signal. Moreover, we provide the optimal signaling policy when these conditions are met. Notably we find that for some scenarios, the signaler can effectively double its performance.


A Feature-map Discriminant Perspective for Pruning Deep Neural Networks

Hou, Zejiang, Kung, Sun-Yuan

arXiv.org Machine Learning

Network pruning has become the de facto tool to accelerate deep neural networks for mobile and edge applications. Recently, feature-map discriminant based channel pruning has shown promising results, as it aligns well with the CNN objective of differentiating multiple classes and offers better interpretability of the pruning decision. However, existing discriminant-based methods are challenged by computation inefficiency, as there is a lack of theoretical guidance on quantifying the feature-map discriminant power. In this paper, we present a new mathematical formulation to accurately and efficiently quantify the feature-map discriminativeness, which gives rise to a novel criterion,Discriminant Information(DI). We analyze the theoretical property of DI, specifically the non-decreasing property, that makes DI a valid selection criterion. DI-based pruning removes channels with minimum influence to DI value, as they contain little information regarding to the discriminant power. The versatility of DI criterion also enables an intra-layer mixed precision quantization to further compress the network. Moreover, we propose a DI-based greedy pruning algorithm and structure distillation technique to automatically decide the pruned structure that satisfies certain resource budget, which is a common requirement in reality. Extensive experiments demonstratethe effectiveness of our method: our pruned ResNet50 on ImageNet achieves 44% FLOPs reduction without any Top-1 accuracy loss compared to unpruned model


Differentially Private Federated Learning for Resource-Constrained Internet of Things

Hu, Rui, Guo, Yuanxiong, Ratazzi, E. Paul., Gong, Yanmin

arXiv.org Machine Learning

With the proliferation of smart devices having built-in sensors, Internet connectivity, and programmable computation capability in the era of Internet of things (IoT), tremendous data is being generated at the network edge. Federated learning is capable of analyzing the large amount of data from a distributed set of smart devices without requiring them to upload their data to a central place. However, the commonly-used federated learning algorithm is based on stochastic gradient descent (SGD) and not suitable for resource-constrained IoT environments due to its high communication resource requirement. Moreover, the privacy of sensitive data on smart devices has become a key concern and needs to be protected rigorously. This paper proposes a novel federated learning framework called DP-PASGD for training a machine learning model efficiently from the data stored across resource-constrained smart devices in IoT while guaranteeing differential privacy. The optimal schematic design of DP-PASGD that maximizes the learning performance while satisfying the limits on resource cost and privacy loss is formulated as an optimization problem, and an approximate solution method based on the convergence analysis of DP-PASGD is developed to solve the optimization problem efficiently. Numerical results based on real-world datasets verify the effectiveness of the proposed DP-PASGD scheme.


CascadeCNN: Pushing the performance limits of quantisation

Kouris, Alexandros, Venieris, Stylianos I., Bouganis, Christos-Savvas

arXiv.org Artificial Intelligence

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, to perform high-throughput inference by exploiting the computation time-accuracy trade-off. Without the need for retraining, a two-stage architecture tailored for any given FPGA device is generated, consisting of a low- and a high-precision unit. A confidence evaluation unit is employed between them to identify misclassified cases at run time and forward them to the high-precision unit or terminate computation. Experiments demonstrate that CascadeCNN achieves a performance boost of up to 55% for VGG-16 and 48% for AlexNet over the baseline design for the same resource budget and accuracy.