Zhu, Jingge
Non-Asymptotic Bounds for Closed-Loop Identification of Unstable Nonlinear Stochastic Systems
Siriya, Seth, Zhu, Jingge, Nešić, Dragan, Pu, Ye
We consider the problem of least squares parameter estimation from single-trajectory data for discrete-time, unstable, closed-loop nonlinear stochastic systems, with linearly parameterised uncertainty. Assuming a region of the state space produces informative data, and the system is sub-exponentially unstable, we establish non-asymptotic guarantees on the estimation error at times where the state trajectory evolves in this region. If the whole state space is informative, high probability guarantees on the error hold for all times. Examples are provided where our results are useful for analysis, but existing results are not.
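As a rough illustration of the estimation setup (not the paper's specific system class, controller, or guarantees), here is a minimal sketch of least-squares parameter estimation from a single closed-loop trajectory for a hypothetical scalar system $x_{t+1} = \theta^\top \varphi(x_t, u_t) + w_t$ with a known feature map $\varphi$:

```python
import numpy as np

# Hypothetical linearly parameterised system x_{t+1} = theta^T phi(x_t, u_t) + w_t,
# with a known feature map phi and i.i.d. Gaussian noise; the feedback below is
# purely illustrative and not the controller analysed in the paper.
rng = np.random.default_rng(0)
theta_true = np.array([0.8, 0.3])                    # unknown parameters to be estimated
phi = lambda x, u: np.array([x, np.tanh(x) + u])     # known nonlinear feature map

T = 200
x = 0.0
feats, targets = [], []
for t in range(T):
    u = -0.5 * x                                     # simple feedback (illustrative)
    f = phi(x, u)
    x = theta_true @ f + 0.1 * rng.standard_normal() # next state
    feats.append(f)
    targets.append(x)

# Least-squares estimate from the single trajectory
Phi = np.vstack(feats)                               # (T x 2) regressor matrix
theta_hat, *_ = np.linalg.lstsq(Phi, np.array(targets), rcond=None)
print("estimation error:", np.linalg.norm(theta_hat - theta_true))
```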
Graph Neural Networks for Power Allocation in Wireless Networks with Full Duplex Nodes
Chen, Lili, Zhu, Jingge, Evans, Jamie
Due to mutual interference between users, power allocation problems in wireless networks are often non-convex and computationally challenging. Graph neural networks (GNNs) have recently emerged as a promising approach to tackling these problems, one that exploits the underlying topology of wireless networks. In this paper, we propose a novel graph representation method for wireless networks that include full-duplex (FD) nodes. We then design a corresponding FD Graph Neural Network (F-GNN) with the aim of allocating transmit powers to maximise the network throughput. Our results show that the F-GNN achieves state-of-the-art performance with significantly less computation time. Moreover, the F-GNN offers an excellent trade-off between performance and complexity compared with classical approaches. We further refine this trade-off by introducing a distance-based threshold for the inclusion or exclusion of edges in the network. We show that an appropriately chosen threshold reduces the required training time by roughly 20% with a relatively minor loss in performance.
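For intuition only, the sketch below shows a generic one-step message-passing computation that maps channel gains to per-pair transmit powers; the node features, weights, and aggregation here are hypothetical placeholders and do not reproduce the F-GNN architecture or its training.

```python
import numpy as np

# Toy message-passing layer for power allocation on a wireless interference graph.
# Nodes are transceiver pairs; H[i, j] is a (hypothetical) channel gain from
# transmitter j to receiver i. Weights are random and untrained: illustration only.
rng = np.random.default_rng(1)
n = 6
H = np.abs(rng.normal(size=(n, n)))
np.fill_diagonal(H, np.abs(rng.normal(loc=2.0, size=n)))   # direct links are stronger

W_self = rng.normal(size=(1, 8))    # weights applied to each node's own feature
W_nbr = rng.normal(size=(1, 8))     # weights applied to aggregated neighbour messages
w_out = rng.normal(size=(8,))       # readout producing a scalar power per node

def gnn_power(H):
    direct = np.diag(H)[:, None]                                   # node feature: direct gain
    interf = (H - np.diag(np.diag(H))).sum(axis=1, keepdims=True)  # aggregated interference
    h = np.tanh(direct @ W_self + interf @ W_nbr)                  # one update step
    return 1.0 / (1.0 + np.exp(-h @ w_out))                        # transmit power in (0, 1)

print("allocated powers:", gnn_power(H))
```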
Accelerating Graph Neural Networks via Edge Pruning for Power Allocation in Wireless Networks
Chen, Lili, Zhu, Jingge, Evans, Jamie
Graph Neural Networks (GNNs) have recently emerged as a promising approach to tackling power allocation problems in wireless networks. Since unpaired transmitters and receivers are often spatially distant, a distance-based threshold has been proposed to reduce computation time by determining which channel state information is included in the GNN. In this paper, we are the first to introduce a neighbour-based threshold approach for GNNs to reduce the time complexity. Furthermore, we conduct a comprehensive analysis of both distance-based and neighbour-based thresholds and provide recommendations for selecting appropriate values in different communication channel scenarios. We design the corresponding distance-based and neighbour-based Graph Neural Networks with the aim of allocating transmit powers to maximise the network throughput. Our results show that the proposed GNNs offer significant advantages in terms of reduced time complexity while preserving strong performance. Moreover, we show that by choosing a suitable threshold, the time complexity is reduced from O(|V|^2) to O(|V|), where |V| is the total number of transceiver pairs.
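To illustrate the two pruning rules (with hypothetical transceiver locations; the paper's exact construction may differ), the sketch below builds edge lists under a distance-based threshold and a neighbour-based threshold. Keeping only each node's $k$ nearest neighbours bounds the edge count by $k|V|$, which is the source of the O(|V|^2) to O(|V|) reduction.

```python
import numpy as np

# Edge pruning for a GNN over transceiver pairs (illustrative, not the paper's code).
rng = np.random.default_rng(2)
n = 8
pos = rng.uniform(0, 100, size=(n, 2))              # hypothetical pair locations
dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)

def distance_threshold_edges(dist, d_max):
    """Keep edge (i, j) only if the two pairs are closer than d_max."""
    keep = (dist < d_max) & ~np.eye(len(dist), dtype=bool)
    return np.argwhere(keep)

def neighbour_threshold_edges(dist, k):
    """Keep edges only to each node's k nearest neighbours, so at most k*|V| edges."""
    order = np.argsort(dist, axis=1)[:, 1:k + 1]     # column 0 is the node itself
    return np.array([(i, j) for i in range(len(dist)) for j in order[i]])

print(len(distance_threshold_edges(dist, 40.0)), "edges after distance-based pruning")
print(len(neighbour_threshold_edges(dist, 3)), "edges after neighbour-based pruning")
```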
Stability Bounds for Learning-Based Adaptive Control of Discrete-Time Multi-Dimensional Stochastic Linear Systems with Input Constraints
Siriya, Seth, Zhu, Jingge, Nešić, Dragan, Pu, Ye
We consider the problem of adaptive stabilization for discrete-time, multi-dimensional linear systems with bounded control input constraints and unbounded stochastic disturbances, where the parameters of the true system are unknown. To address this challenge, we propose a certainty-equivalent control scheme which combines online parameter estimation with saturated linear control. We establish the existence of a high probability stability bound on the closed-loop system, under additional assumptions on the system and noise processes. Finally, numerical examples are presented to illustrate our results. Adaptive control (AC) is concerned with the design of controllers for dynamical systems whose model parameters are unknown.
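A minimal sketch of such a certainty-equivalent scheme, assuming a hypothetical scalar system $x_{t+1} = a x_t + b u_t + w_t$ with recursive least-squares estimation and a saturated linear control law, is given below; the paper treats multi-dimensional systems and supplies the formal high-probability stability analysis.

```python
import numpy as np

# Certainty-equivalent adaptive control sketch for a hypothetical scalar system
# x_{t+1} = a x_t + b u_t + w_t with unknown (a, b) and saturated input |u| <= u_max.
rng = np.random.default_rng(3)
a_true, b_true, u_max = 1.2, 1.0, 2.0
theta_hat = np.zeros(2)                    # running estimate of (a, b)
P = 10.0 * np.eye(2)                       # recursive least-squares "covariance"
x = 0.0
for t in range(500):
    a_hat, b_hat = theta_hat
    k = a_hat / b_hat if abs(b_hat) > 1e-3 else 0.0   # certainty-equivalent gain
    u = float(np.clip(-k * x, -u_max, u_max))         # saturated linear control
    x_next = a_true * x + b_true * u + 0.1 * rng.standard_normal()
    phi = np.array([x, u])                             # regressor
    denom = 1.0 + phi @ P @ phi                        # RLS update of estimate and P
    theta_hat = theta_hat + P @ phi * (x_next - phi @ theta_hat) / denom
    P = P - np.outer(P @ phi, phi @ P) / denom
    x = x_next
print("final estimate (a, b):", theta_hat, " final state:", x)
```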
On the tightness of information-theoretic bounds on generalization error of learning algorithms
Wu, Xuetong, Manton, Jonathan H., Aickelin, Uwe, Zhu, Jingge
A recent line of works, initiated by [1] and [2], has shown that the generalization error of a learning algorithm can be upper bounded by information measures. In most of the relevant works, the convergence rate of the expected generalization error is of the form $O(\sqrt{\lambda/n})$, where $\lambda$ is some information-theoretic quantity such as the mutual information or conditional mutual information between the data and the learned hypothesis. However, such a learning rate is typically considered "slow" compared to a "fast rate" of $O(\lambda/n)$ in many learning scenarios. In this work, we first show that the square root does not necessarily imply a slow rate, and that a fast rate result can still be obtained using this bound under appropriate assumptions. Furthermore, we identify the critical conditions needed for a fast rate generalization error, which we call the $(\eta, c)$-central condition. Under this condition, we give information-theoretic bounds on the generalization error and excess risk, with a fast convergence rate for specific learning algorithms such as empirical risk minimization and its regularized version. Finally, several analytical examples are given to show the effectiveness of the bounds. The generalization error of a learning algorithm lies at the core of statistical learning theory, and its estimation is crucially important.
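For context, the canonical slow-rate bound in this line of work (for a $\sigma$-sub-Gaussian loss, in the style of Xu and Raginsky) takes the $O(\sqrt{\lambda/n})$ form below with $\lambda = I(S;W)$; the fast-rate results remove the square root under additional conditions such as the $(\eta, c)$-central condition. The display is illustrative and is not the paper's exact statement.

```latex
% Slow-rate form: for a sigma-sub-Gaussian loss, with S the training sample and W the
% learned hypothesis (illustrative; not the paper's exact theorem),
\[
  \bigl|\mathbb{E}\bigl[\mathrm{gen}(\mu, P_{W\mid S})\bigr]\bigr|
  \;\le\; \sqrt{\frac{2\sigma^{2}\, I(S;W)}{n}} \;=\; O\!\left(\sqrt{\tfrac{\lambda}{n}}\right),
  \qquad \lambda = I(S;W),
\]
% whereas a fast-rate result is of order O(lambda/n) under suitable conditions:
\[
  \mathbb{E}\bigl[\mathrm{gen}(\mu, P_{W\mid S})\bigr] \;=\; O\!\left(\frac{I(S;W)}{n}\right).
\]
```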
On the Value of Stochastic Side Information in Online Learning
Jia, Junzhang, Wu, Xuetong, Zhu, Jingge, Evans, Jamie
We study the effectiveness of stochastic side information in deterministic online learning scenarios. We propose a forecaster to predict a deterministic sequence whose performance is evaluated against an expert class. We assume that certain stochastic side information is available to the forecaster but not to the experts. We define the minimax expected regret for evaluating the forecaster's performance, for which we obtain both upper and lower bounds. Consequently, our results characterize the improvement in the regret due to the stochastic side information. As is common in practice, the forecaster may have access to additional resources, which we call side information, that provide useful knowledge about the sequence of interest. Cover and Ordentlich [10] first studied a portfolio investment problem where the sequence of interest is the stock vectors, which may depend on some finite-valued states (as side information), and their proposed forecaster achieves the same wealth as the best side-information-dependent investment strategy. Xie and Barron [11] studied the case when the sequence of interest is generated according to a pair-wise parametric distribution conditioned on the side information, and derived a logarithmic upper bound on the minimax regret.
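For concreteness, a standard formulation of the minimax expected regret against an expert class $\mathcal{E}$ is sketched below; the notation, in particular the side information $Z^t$ available only to the forecaster, is illustrative and the paper's exact definition may differ.

```latex
% Minimax expected regret with stochastic side information Z^n available to the
% forecaster but not to the experts (a generic formulation, not necessarily the paper's).
\[
  R_n \;=\; \inf_{\hat{x}^n}\; \sup_{x^n}\;
  \mathbb{E}\!\left[\, \sum_{t=1}^{n} \ell\bigl(\hat{x}_t(x^{t-1}, Z^{t}),\, x_t\bigr)
  \;-\; \min_{e \in \mathcal{E}} \sum_{t=1}^{n} \ell\bigl(f_{e,t}(x^{t-1}),\, x_t\bigr) \right],
\]
% where the expectation is taken over the stochastic side information Z^n.
```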
Semi-Supervised Learning: the Case When Unlabeled Data is Equally Useful
Zhu, Jingge
Semi-supervised learning algorithms attempt to take advantage of relatively inexpensive unlabeled data to improve learning performance. In this work, we consider statistical models where the data distributions can be characterized by continuous parameters. We show that under certain conditions on the distribution, unlabeled data is equally useful as labeled data in terms of learning rate. Specifically, let $n, m$ be the number of labeled and unlabeled data, respectively. It is shown that the learning rate of semi-supervised learning scales as $O(1/n)$ if $m\sim n$, and scales as $O(1/n^{1+\gamma})$ if $m\sim n^{1+\gamma}$ for some $\gamma>0$, whereas the learning rate of supervised learning scales as $O(1/n)$.
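As a quick worked illustration of these rates (with hypothetical sample sizes), take $n = 10^3$ labeled samples and $\gamma = 1$, so $m \sim n^{2} = 10^{6}$ unlabeled samples; up to constants, the rates compare as follows.

```latex
% Hypothetical numbers: n = 10^3, gamma = 1, m ~ n^2 = 10^6 (up to constants).
\[
  \underbrace{\frac{1}{n^{1+\gamma}} = \frac{1}{n^{2}} = 10^{-6}}_{\text{semi-supervised}}
  \qquad \text{versus} \qquad
  \underbrace{\frac{1}{n} = 10^{-3}}_{\text{supervised}}.
\]
```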
Information-theoretic analysis for transfer learning
Wu, Xuetong, Manton, Jonathan H., Aickelin, Uwe, Zhu, Jingge
Transfer learning, or domain adaptation, is concerned with machine learning problems in which training and testing data come from possibly different distributions (denoted as $\mu$ and $\mu'$, respectively). In this work, we give an information-theoretic analysis of the generalization error and the excess risk of transfer learning algorithms, following a line of work initiated by Russo and Zou. Our results suggest, perhaps as expected, that the Kullback-Leibler (KL) divergence $D(\mu\|\mu')$ plays an important role in characterizing the generalization error in the domain adaptation setting. Specifically, we provide generalization error upper bounds for general transfer learning algorithms and extend the results to a specific empirical risk minimization (ERM) algorithm where data from both distributions are available in the training phase. We further apply the method to iterative, noisy gradient descent algorithms and obtain upper bounds that can be easily calculated, using only parameters from the learning algorithms. A few illustrative examples are provided to demonstrate the usefulness of the results. In particular, our bound is tighter than the bound derived using Rademacher complexity in specific classification problems.
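Since the KL divergence $D(\mu\|\mu')$ is the central quantity in these bounds, the short snippet below evaluates it for two hypothetical univariate Gaussian source and target distributions; it illustrates the quantity itself rather than any bound from the paper.

```python
import numpy as np

# KL divergence D(mu || mu') between two univariate Gaussians, used here only to
# illustrate the quantity appearing in the transfer-learning bounds above.
def kl_gaussian(m0, s0, m1, s1):
    """D( N(m0, s0^2) || N(m1, s1^2) ) in nats."""
    return np.log(s1 / s0) + (s0**2 + (m0 - m1)**2) / (2 * s1**2) - 0.5

# Hypothetical source (training) distribution mu and target (test) distribution mu'
print("D(mu || mu') =", kl_gaussian(0.0, 1.0, 0.5, 1.2))
```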