Gursoy, M. Cenk
Feature-based Federated Transfer Learning: Communication Efficiency, Robustness and Privacy
Wang, Feng, Gursoy, M. Cenk, Velipasalar, Senem
In this paper, we propose feature-based federated transfer learning as a novel approach to improve communication efficiency by reducing the uplink payload by multiple orders of magnitude compared to that of existing approaches in federated learning and federated transfer learning. Specifically, in the proposed feature-based federated transfer learning, we design the extracted features and outputs to be uploaded instead of parameter updates. For this distributed learning model, we determine the required payload and provide comparisons with the existing schemes. Subsequently, we analyze the robustness of feature-based federated transfer learning against packet loss, data insufficiency, and quantization. Finally, we address privacy considerations by defining and analyzing label privacy leakage and feature privacy leakage, and investigating mitigating approaches. For all aforementioned analyses, we evaluate the performance of the proposed learning scheme via experiments on an image classification task and a natural language processing task to demonstrate its effectiveness.
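As a rough illustration of why uploading features instead of parameter updates can shrink the uplink payload, the following back-of-the-envelope sketch compares the two per client. The function names, model size, feature dimension, sample count, round count, and bit width are all hypothetical values chosen for illustration, not figures from the paper, and the formulas are a simplification rather than the paper's payload analysis.

```python
def param_update_payload_bits(num_params, bits_per_value, num_rounds):
    """Uplink bits when a client uploads a full parameter update every round."""
    return num_params * bits_per_value * num_rounds


def feature_payload_bits(num_samples, feature_dim, output_dim, bits_per_value):
    """Uplink bits when a client uploads one feature vector plus one output
    vector per local sample, once (an assumed simplification)."""
    return num_samples * (feature_dim + output_dim) * bits_per_value


# Hypothetical setting: a 25M-parameter model trained for 100 rounds,
# versus 500 local samples with 2048-dimensional features and 10 outputs.
update_bits = param_update_payload_bits(25_000_000, 32, 100)
feature_bits = feature_payload_bits(500, 2048, 10, 32)
ratio = update_bits / feature_bits
print(f"parameter updates: {update_bits} bits")
print(f"features + outputs: {feature_bits} bits")
print(f"reduction factor: {ratio:.0f}x")
```

Under these assumed numbers, the gap comes from the feature payload scaling with the number of local samples rather than with the model size and the number of training rounds.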
QMGeo: Differentially Private Federated Learning via Stochastic Quantization with Mixed Truncated Geometric Distribution
Wang, Zixi, Gursoy, M. Cenk
Federated learning (FL) is a framework that allows multiple users to jointly train a global machine learning (ML) model by transmitting only model updates under the coordination of a parameter server, while keeping their datasets local. One key motivation of such distributed frameworks is to provide privacy guarantees to the users. However, keeping the users' datasets local has been shown to be insufficient for privacy. Several differential privacy (DP) mechanisms have been proposed to provide provable privacy guarantees by introducing randomness into the framework, and the majority of these mechanisms rely on injecting additive noise. FL frameworks also face the challenge of communication efficiency, especially as machine learning models grow in complexity and size. Quantization is a commonly utilized method that reduces the communication cost by transmitting a compressed representation of the underlying information. Although there have been several studies on DP and quantization in FL, the potential contribution of the quantization method alone in providing privacy guarantees has not yet been extensively analyzed. In this paper, we present a novel stochastic quantization method that utilizes a mixed geometric distribution to introduce the randomness needed to provide DP, without any additive noise. We provide convergence analysis for our framework and empirically study its performance.
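For readers unfamiliar with stochastic quantization, the sketch below shows the generic unbiased variant (stochastic rounding to a uniform grid). It is not the QMGeo mechanism: it does not use a mixed truncated geometric distribution and does not by itself provide a DP guarantee. The function name `stochastic_quantize`, the grid spacing `step`, and the sample values are assumptions made for illustration.

```python
import math
import random


def stochastic_quantize(x, step, rng=random):
    """Round x to a multiple of `step`, rounding up with probability equal
    to x's fractional position between the two neighboring grid points,
    so that the expected output equals x."""
    scaled = x / step
    lower = math.floor(scaled)
    p_up = scaled - lower  # in [0, 1)
    level = lower + (1 if rng.random() < p_up else 0)
    return level * step


# Empirical check of unbiasedness with a seeded generator.
rng = random.Random(0)
samples = [stochastic_quantize(0.3, 0.25, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
print(sorted(set(samples)), mean)
```

Each output lands on one of the two grid points surrounding the input, and the randomness is in which one is chosen. The abstract's proposal is to draw that randomness from a different distribution so that the quantizer itself supplies the privacy guarantee.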
Anomaly Detection via Learning-Based Sequential Controlled Sensing
Joseph, Geethu, Zhong, Chen, Gursoy, M. Cenk, Velipasalar, Senem, Varshney, Pramod K.
In this paper, we address the problem of detecting anomalies among a given set of binary processes via learning-based controlled sensing. Each process is parameterized by a binary random variable indicating whether the process is anomalous. To identify the anomalies, the decision-making agent is allowed to observe a subset of the processes at each time instant. Also, probing each process has an associated cost. Our objective is to design a sequential selection policy that dynamically determines which processes to observe at each time with the goal of minimizing the delay in making the decision and the total sensing cost. We cast this problem as a sequential hypothesis testing problem within the framework of Markov decision processes. This formulation utilizes both a Bayesian log-likelihood ratio-based reward and an entropy-based reward. The problem is then solved using two approaches: 1) a deep reinforcement learning-based approach where we design both deep Q-learning and policy gradient actor-critic algorithms; and 2) a deep active inference-based approach. Using numerical experiments, we demonstrate the efficacy of our algorithms and show that our algorithms adapt to any unknown statistical dependence pattern of the processes.
Robust Network Slicing: Multi-Agent Policies, Adversarial Attacks, and Defensive Strategies
Wang, Feng, Gursoy, M. Cenk, Velipasalar, Senem
In this paper, we present a multi-agent deep reinforcement learning (deep RL) framework for network slicing in a dynamic environment with multiple base stations and multiple users. In particular, we propose a novel deep RL framework with multiple actors and a centralized critic (MACC) in which actors are implemented as pointer networks to fit the varying dimension of input. We evaluate the performance of the proposed deep RL algorithm via simulations to demonstrate its effectiveness. Subsequently, we develop a deep RL-based jammer with limited prior information and a limited power budget. The goal of the jammer is to minimize the transmission rates achieved with network slicing and thus degrade the network slicing agents' performance. We design a jammer with both listening and jamming phases and address jamming location optimization as well as jamming channel optimization via deep RL. We evaluate the jammer at the optimized location, generating interference attacks in the optimized set of channels by switching between the jamming phase and listening phase. We show that the proposed jammer can significantly reduce the victims' performance without direct feedback or prior knowledge of the network slicing policies. Finally, we devise a Nash-equilibrium-supervised policy ensemble mixed strategy profile for network slicing (as a defensive measure) and jamming. We evaluate the performance of the proposed policy ensemble algorithm by applying it to the network slicing agents and the jammer agent in simulations to show its effectiveness.
Maximum Knowledge Orthogonality Reconstruction with Gradients in Federated Learning
Wang, Feng, Velipasalar, Senem, Gursoy, M. Cenk
Federated learning (FL) aims at keeping client data local to preserve privacy. Instead of gathering the data itself, the server only collects aggregated gradient updates from clients. Following the popularity of FL, there has been a considerable amount of work revealing the vulnerability of FL approaches by reconstructing the input data from gradient updates. Yet, most existing works assume an FL setting with an unrealistically small batch size, and yield poor image quality when the batch size is large. Other works modify the neural network architectures or parameters to the point of being suspicious, and thus can be detected by clients. Moreover, most of them can only reconstruct one sample input from a large batch. To address these limitations, we propose a novel and completely analytical approach, referred to as the maximum knowledge orthogonality reconstruction (MKOR), to reconstruct clients' input data. Our proposed method reconstructs a mathematically proven high-quality image from large batches. MKOR only requires the server to send secretly modified parameters to clients and can efficiently and inconspicuously reconstruct the input images from clients' gradient updates. We evaluate MKOR's performance on the MNIST, CIFAR-100, and ImageNet datasets and compare it with the state-of-the-art works. The results show that MKOR outperforms the existing approaches, and draw attention to a pressing need for further research on the privacy protection of FL so that comprehensive defense approaches can be developed.
Temporal Detection of Anomalies via Actor-Critic Based Controlled Sensing
Joseph, Geethu, Gursoy, M. Cenk, Varshney, Pramod K.
We address the problem of monitoring a set of binary stochastic processes and generating an alert when the number of anomalies among them exceeds a threshold. For this, the decision-maker selects and probes a subset of the processes to obtain noisy estimates of their states (normal or anomalous). Based on the received observations, the decision-maker first determines whether to declare that the number of anomalies has exceeded the threshold or to continue taking observations. When the decision is to continue, it then decides whether to collect observations at the next time instant or defer it to a later time. If it chooses to collect observations, it further determines the subset of processes to be probed. To devise this three-step sequential decision-making process, we use a Bayesian formulation wherein we learn the posterior probability on the states of the processes. Using the posterior probability, we construct a Markov decision process and solve it using deep actor-critic reinforcement learning. Via numerical experiments, we demonstrate the superior performance of our algorithm compared to the traditional model-based algorithms.
Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing
Joseph, Geethu, Zhong, Chen, Gursoy, M. Cenk, Velipasalar, Senem, Varshney, Pramod K.
We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a noisy binary indicator of whether or not the corresponding process is anomalous. In this setting, we develop an anomaly detection algorithm that chooses the processes to be observed at a given time instant, decides when to stop taking observations, and declares the decision on anomalous processes. The objective of the detection algorithm is to identify the anomalies with an accuracy exceeding the desired value while minimizing the delay in decision making. We devise a centralized algorithm where the processes are jointly selected by a common agent as well as a decentralized algorithm where the decision of whether to select a process is made independently for each process. Our algorithms rely on a Markov decision process defined using the marginal probability of each process being normal or anomalous, conditioned on the observations. We implement the detection algorithms using the deep actor-critic reinforcement learning framework. Unlike prior work on this topic that has exponential complexity in the number of processes, our algorithms have computational and memory requirements that are both polynomial in the number of processes. We demonstrate the efficacy of these algorithms using numerical experiments by comparing them with state-of-the-art methods.
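To make the notion of a per-process probability "conditioned on the observations" concrete, the sketch below applies Bayes' rule to a single process after one noisy binary indicator. It assumes the processes are independent and that the indicator is flipped with a known probability `flip_prob`. Both are simplifying assumptions for illustration, and the function name is hypothetical; this is not the paper's algorithm, which feeds such probabilities into a Markov decision process solved with actor-critic learning.

```python
def update_anomaly_belief(prior, observation, flip_prob):
    """Posterior probability that a process is anomalous after one noisy
    binary observation (1 = indicator says anomalous, 0 = says normal).

    Assumes the indicator reports the true state with probability
    1 - flip_prob and the wrong state with probability flip_prob.
    """
    if observation == 1:
        like_anomalous, like_normal = 1 - flip_prob, flip_prob
    else:
        like_anomalous, like_normal = flip_prob, 1 - flip_prob
    numerator = like_anomalous * prior
    return numerator / (numerator + like_normal * (1 - prior))


# Two consecutive "anomalous" indicators sharpen an uninformative prior.
belief = 0.5
for obs in (1, 1):
    belief = update_anomaly_belief(belief, obs, flip_prob=0.2)
print(belief)
```

Tracking one such number per process, rather than a joint distribution over all anomaly patterns, is one way to see how memory requirements could stay polynomial in the number of processes.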
Anomaly Detection and Sampling Cost Control via Hierarchical GANs
Zhong, Chen, Gursoy, M. Cenk, Velipasalar, Senem
Anomaly detection incurs certain sampling and sensing costs and therefore it is of great importance to strike a balance between the detection accuracy and these costs. In this work, we study anomaly detection by considering the detection of threshold crossings in a stochastic time series without the knowledge of its statistics. To reduce the sampling cost in this detection process, we propose the use of hierarchical generative adversarial networks (GANs) to perform nonuniform sampling. In order to improve the detection accuracy and reduce the delay in detection, we introduce a buffer zone in the operation of the proposed GAN-based detector. In the experiments, we analyze the performance of the proposed hierarchical GAN detector considering the metrics of detection delay, miss rates, average cost of error, and sampling ratio. We identify the tradeoffs in the performance as the buffer zone sizes and the number of GAN levels in the hierarchy vary. We also compare the performance with that of a sampling policy that approximately minimizes the sum of average costs of sampling and error given the parameters of the stochastic process. We demonstrate that the proposed GAN-based detector can have significant performance improvements in terms of detection delay and average cost of error with a larger buffer zone but at the cost of increased sampling rates.
Adversarial jamming attacks and defense strategies via adaptive deep reinforcement learning
Wang, Feng, Zhong, Chen, Gursoy, M. Cenk, Velipasalar, Senem
As the applications of deep reinforcement learning (DRL) in wireless communications grow, the sensitivity of DRL-based wireless communication strategies to adversarial attacks has started to draw increasing attention. In order to address such sensitivity and alleviate the resulting security concerns, in this paper we consider a victim user that performs DRL-based dynamic channel access, and an attacker that executes DRL-based jamming attacks to disrupt the victim. Hence, both the victim and attacker are DRL agents and can interact with each other, retrain their models, and adapt to opponents' policies. In this setting, we initially develop an adversarial jamming attack policy that aims at minimizing the accuracy of the victim's decision making on dynamic channel access. Subsequently, we propose three defense strategies against such an attacker, namely diversified defense with proportional-integral-derivative (PID) control, diversified defense with an imitation attacker, and defense via orthogonal policies. We design these strategies to maximize the attacked victim's accuracy and evaluate their performance.
A Deep Actor-Critic Reinforcement Learning Framework for Dynamic Multichannel Access
Zhong, Chen, Lu, Ziyang, Gursoy, M. Cenk, Velipasalar, Senem
To make efficient use of limited spectral resources, in this work we propose a deep actor-critic reinforcement learning-based framework for dynamic multichannel access. We consider both a single-user case and a scenario in which multiple users attempt to access channels simultaneously. We employ the proposed framework as a single agent in the single-user case, and extend it to a decentralized multi-agent framework in the multi-user scenario. In both cases, we develop algorithms for the actor-critic deep reinforcement learning and evaluate the proposed learning policies via experiments and numerical results. In the single-user model, in order to evaluate the performance of the proposed channel access policy and the framework's tolerance against uncertainty, we explore different channel switching patterns and different switching probabilities. In the case of multiple users, we analyze the probabilities of each user accessing channels with favorable channel conditions and the probability of collision. We also address a time-varying environment to identify the adaptive ability of the proposed framework. Additionally, we provide comparisons (in terms of both the average reward and time efficiency) between the proposed actor-critic deep reinforcement learning framework, a deep Q-network (DQN) based approach, random access, and the optimal policy when the channel dynamics are known.