Goto

Collaborating Authors

 Telecommunications


Efficient Bayesian Optimization with Deep Kernel Learning and Transformer Pre-trained on Multiple Heterogeneous Datasets

arXiv.org Artificial Intelligence

Bayesian optimization (BO) is widely adopted in black-box optimization problems and it relies on a surrogate model to approximate the black-box response function. With the increasing number of black-box optimization tasks solved and even more to solve, the ability to learn from multiple prior tasks to jointly pre-train a surrogate model is long-awaited to further boost optimization efficiency. In this paper, we propose a simple approach to pre-train a surrogate, which is a Gaussian process (GP) with a kernel defined on deep features learned from a Transformerbased encoder, using datasets from prior tasks with possibly heterogeneous input spaces. In addition, we provide a simple yet effective mix-up initialization strategy for input tokens corresponding to unseen input variables and therefore accelerate new tasks' convergence. Experiments on both synthetic and real benchmark problems demonstrate the effectiveness of our proposed pre-training and transfer BO strategy over existing methods. In black-box optimization problems, one could only observe outputs of the function being optimized based on some given inputs, and can hardly access the explicit form of the function. These kinds of optimization problems are ubiquitous in practice (e.g., (Mahapatra et al., 2015; Korovina et al., 2020; Griffiths & Lobato, 2020)). Among black-box optimization problems, some are particularly challenging since their function evaluations are expensive, in the sense that the evaluation either takes a substantial amount of time or requires a considerable monetary cost.


A Survey on Decentralized Federated Learning

arXiv.org Artificial Intelligence

In recent years, federated learning (FL) has become a very popular paradigm for training distributed, large-scale, and privacy-preserving machine learning (ML) systems. In contrast to standard ML, where data must be collected at the exact location where training is performed, FL takes advantage of the computational capabilities of millions of edge devices to collaboratively train a shared, global model without disclosing their local private data. Specifically, in a typical FL system, the central server acts only as an orchestrator; it iteratively gathers and aggregates all the local models trained by each client on its private data until convergence. Although FL undoubtedly has several benefits over traditional ML (e.g., it protects private data ownership by design), it suffers from several weaknesses. One of the most critical challenges is to overcome the centralized orchestration of the classical FL client-server architecture, which is known to be vulnerable to single-point-of-failure risks and man-in-the-middle attacks, among others. To mitigate such exposure, decentralized FL solutions have emerged where all FL clients cooperate and communicate without a central server. This survey comprehensively summarizes and reviews existing decentralized FL approaches proposed in the literature. Furthermore, it identifies emerging challenges and suggests promising research directions in this under-explored domain.


Generalization bound for estimating causal effects from observational network data

arXiv.org Artificial Intelligence

Estimating causal effects from observational network data is a significant but challenging problem. Existing works in causal inference for observational network data lack an analysis of the generalization bound, which can theoretically provide support for alleviating the complex confounding bias and practically guide the design of learning objectives in a principled manner. To fill this gap, we derive a generalization bound for causal effect estimation in network scenarios by exploiting 1) the reweighting schema based on joint propensity score and 2) the representation learning schema based on Integral Probability Metric (IPM). We provide two perspectives on the generalization bound in terms of reweighting and representation learning, respectively. Motivated by the analysis of the bound, we propose a weighting regression method based on the joint propensity score augmented with representation learning. Extensive experimental studies on two real-world networks with semi-synthetic data demonstrate the effectiveness of our algorithm.


Cooperative Multi-Type Multi-Agent Deep Reinforcement Learning for Resource Management in Space-Air-Ground Integrated Networks

arXiv.org Artificial Intelligence

The Space-Air-Ground Integrated Network (SAGIN), integrating heterogeneous devices including low earth orbit (LEO) satellites, unmanned aerial vehicles (UAVs), and ground users (GUs), holds significant promise for advancing smart city applications. However, resource management of the SAGIN is a challenge requiring urgent study in that inappropriate resource management will cause poor data transmission, and hence affect the services in smart cities. In this paper, we develop a comprehensive SAGIN system that encompasses five distinct communication links and propose an efficient cooperative multi-type multi-agent deep reinforcement learning (CMT-MARL) method to address the resource management issue. The experimental results highlight the efficacy of the proposed CMT-MARL, as evidenced by key performance indicators such as the overall transmission rate and transmission success rate. These results underscore the potential value and feasibility of future implementation of the SAGIN.


An AI-Enabled Framework to Defend Ingenious MDT-based Attacks on the Emerging Zero Touch Cellular Networks

arXiv.org Artificial Intelligence

Deep automation provided by self-organizing network (SON) features and their emerging variants such as zero touch automation solutions is a key enabler for increasingly dense wireless networks and pervasive Internet of Things (IoT). To realize their objectives, most automation functionalities rely on the Minimization of Drive Test (MDT) reports. The MDT reports are used to generate inferences about network state and performance, thus dynamically change network parameters accordingly. However, the collection of MDT reports from commodity user devices, particularly low cost IoT devices, make them a vulnerable entry point to launch an adversarial attack on emerging deeply automated wireless networks. This adds a new dimension to the security threats in the IoT and cellular networks. Existing literature on IoT, SON, or zero touch automation does not address this important problem. In this paper, we investigate an impactful, first of its kind adversarial attack that can be launched by exploiting the malicious MDT reports from the compromised user equipment (UE). We highlight the detrimental repercussions of this attack on the performance of common network automation functions. We also propose a novel Malicious MDT Reports Identification framework (MRIF) as a countermeasure to detect and eliminate the malicious MDT reports using Machine Learning and verify it through a use-case. Thus, the defense mechanism can provide the resilience and robustness for zero touch automation SON engines against the adversarial MDT attacks


Successive Pose Estimation and Beam Tracking for mmWave Vehicular Communication Systems

arXiv.org Artificial Intelligence

The millimeter wave (mmWave) radar sensing-aided communications in vehicular mobile communication systems is investigated. To alleviate the beam training overhead under high mobility scenarios, a successive pose estimation and beam tracking (SPEBT) scheme is proposed to facilitate mmWave communications with the assistance of mmWave radar sensing. The proposed SPEBT scheme first resorts to a Fast Conservative Filtering for Efficient and Accurate Radar odometry (Fast-CFEAR) approach to estimate the vehicle pose consisting of 2-dimensional position and yaw from radar point clouds collected by mmWave radar sensor. Then, the pose estimation information is fed into an extend Kalman filter to perform beam tracking for the line-of-sight channel. Owing to the intrinsic robustness of mmWave radar sensing, the proposed SPEBT scheme is capable of operating reliably under extreme weather/illumination conditions and large-scale global navigation satellite systems (GNSS)-denied environments. The practical deployment of the SPEBT scheme is verified through rigorous testing on a real-world sensing dataset. Simulation results demonstrate that the proposed SPEBT scheme is capable of providing precise pose estimation information and accurate beam tracking output, while reducing the proportion of beam training overhead to less than 5% averagely.


Universal Approximation of Linear Time-Invariant (LTI) Systems through RNNs: Power of Randomness in Reservoir Computing

arXiv.org Artificial Intelligence

Recurrent neural networks (RNNs) are known to be universal approximators of dynamic systems under fairly mild and general assumptions, making them good tools to process temporal information. However, RNNs usually suffer from the issues of vanishing and exploding gradients in the standard RNN training. Reservoir computing (RC), a special RNN where the recurrent weights are randomized and left untrained, has been introduced to overcome these issues and has demonstrated superior empirical performance in fields as diverse as natural language processing and wireless communications especially in scenarios where training samples are extremely limited. On the contrary, the theoretical grounding to support this observed performance has not been fully developed at the same pace. In this work, we show that RNNs can provide universal approximation of linear time-invariant (LTI) systems. Specifically, we show that RC can universally approximate a general LTI system. We present a clear signal processing interpretation of RC and utilize this understanding in the problem of simulating a generic LTI system through RC. Under this setup, we analytically characterize the optimal probability distribution function for generating the recurrent weights of the underlying RNN of the RC. We provide extensive numerical evaluations to validate the optimality of the derived optimum distribution of the recurrent weights of the RC for the LTI system simulation problem. Our work results in clear signal processing-based model interpretability of RC and provides theoretical explanation for the power of randomness in setting instead of training RC's recurrent weights. It further provides a complete optimum analytical characterization for the untrained recurrent weights, marking an important step towards explainable machine learning (XML) which is extremely important for applications where training samples are limited.


Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data

arXiv.org Machine Learning

Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental limitations. In particular, we lack a rigorous statistical test for whether two high-dimensional single-cell datasets are alignable (and therefore should even be aligned). Moreover, popular methods can substantially distort the data during alignment, making the aligned data and downstream analysis difficult to interpret. To overcome these limitations, we present a spectral manifold alignment and inference (SMAI) framework, which enables principled and interpretable alignability testing and structure-preserving integration of single-cell data. SMAI provides a statistical test to robustly determine the alignability between datasets to avoid misleading inference, and is justified by high-dimensional statistical theory. On a diverse range of real and simulated benchmark datasets, it outperforms commonly used alignment methods. Moreover, we show that SMAI improves various downstream analyses such as identification of differentially expressed genes and imputation of single-cell spatial transcriptomics, providing further biological insights. SMAI's interpretability also enables quantification and a deeper understanding of the sources of technical confounders in single-cell data.


CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

arXiv.org Artificial Intelligence

We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters. CASSINI introduces a novel geometric abstraction to consider the communication pattern of different jobs while placing them on network links. To do so, CASSINI uses an affinity graph that finds a series of time-shift values to adjust the communication phases of a subset of jobs, such that the communication patterns of jobs sharing the same network link are interleaved with each other. Experiments with 13 common ML models on a 24-server testbed demonstrate that compared to the state-of-the-art ML schedulers, CASSINI improves the average and tail completion time of jobs by up to 1.6x and 2.5x, respectively. Moreover, we show that CASSINI reduces the number of ECN marked packets in the cluster by up to 33x.


Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

arXiv.org Artificial Intelligence

Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques and machine learning (ML) approaches in general, hypothesize that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, is not applicable in some real-world artificial intelligence (AI) applications. Moreover, there are situations where gathering real data is challenging, expensive, or rarely occurring, which can not meet the data requirements of DL models. deep transfer learning (DTL) has been introduced to overcome these issues, which helps develop high-performing models using real datasets that are small or slightly different but related to the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and helps academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to inform the state-of-the-art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Moving on, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research.