
Collaborating Authors

 Yang, Chenyang


Deep Reinforcement Learning Aided Packet-Routing For Aeronautical Ad-Hoc Networks Formed by Passenger Planes

arXiv.org Artificial Intelligence

Data packet routing in aeronautical ad-hoc networks (AANETs) is challenging due to their highly dynamic topology. In this paper, we invoke deep reinforcement learning for routing in AANETs, aiming to minimize the end-to-end (E2E) delay. Specifically, a deep Q-network (DQN) is conceived for capturing the relationship between the optimal routing decision and the local geographic information observed by the forwarding node. The DQN is trained offline on historical flight data and then stored by each airplane to assist its routing decisions during flight. To boost the learning efficiency and the online adaptability of the proposed DQN-routing, we further exploit knowledge of the system's dynamics by using a deep value network (DVN) conceived with a feedback mechanism. Our simulation results show that both DQN-routing and DVN-routing achieve lower E2E delay than the benchmark protocol, and DVN-routing performs similarly to the optimal routing that relies on perfect global information.
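
The idea of learning next-hop decisions that minimize E2E delay can be illustrated with a minimal sketch. It is not the paper's DQN: it swaps in a tabular Q-routing update on a tiny static graph, and the graph, link delays, and the names `EXAMPLE_DELAYS`, `train_q_routing`, and `route` are all hypothetical. Each table entry estimates the delay to the destination when forwarding through a given neighbor, and the forwarding node greedily picks the neighbor with the smallest estimate.

```python
# Hypothetical toy topology: EXAMPLE_DELAYS[n][m] is the link delay from n to m.
# Node 3 is the destination; every other node is assumed to have a next hop.
EXAMPLE_DELAYS = {
    0: {1: 1.0, 2: 4.0},
    1: {2: 1.0, 3: 5.0},
    2: {3: 1.0},
    3: {},
}


def train_q_routing(delays, dest, alpha=0.5, sweeps=200):
    """Tabular stand-in for a DQN: q[n][m] estimates the delay from n to dest via m."""
    q = {n: {m: 0.0 for m in nbrs} for n, nbrs in delays.items()}
    for _ in range(sweeps):
        for n, nbrs in delays.items():
            for m, link_delay in nbrs.items():
                # Remaining delay after reaching m, according to m's best estimate.
                remaining = 0.0 if m == dest else min(q[m].values())
                q[n][m] += alpha * (link_delay + remaining - q[n][m])
    return q


def route(q, src, dest):
    """Forward greedily: each node picks the neighbor with the lowest estimate."""
    path = [src]
    while path[-1] != dest:
        if len(path) > len(q):
            raise RuntimeError("routing loop: Q-table has not converged")
        node = path[-1]
        path.append(min(q[node], key=q[node].get))
    return path


q_table = train_q_routing(EXAMPLE_DELAYS, dest=3)
print(route(q_table, 0, 3))
```

In the paper's setting the table would be replaced by a neural network fed with local geographic information, since the topology changes continuously during flight.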


Model-Free Unsupervised Learning for Optimization Problems with Constraints

arXiv.org Machine Learning

In many optimization problems in wireless communications, the expressions of the objective function or constraints are hard or even impossible to derive, which makes the solutions difficult to find. In this paper, we propose a model-free learning framework to solve constrained optimization problems without the supervision of the optimal solution. Neural networks are used, respectively, for parameterizing the function to be optimized, parameterizing the Lagrange multiplier associated with instantaneous constraints, and approximating the unknown objective function or constraints. We provide learning algorithms to train all the neural networks simultaneously, and reveal the connections of the proposed framework with reinforcement learning. Numerical and simulation results validate the proposed framework and demonstrate the efficiency of model-free learning, taking a power control problem as an example. INTRODUCTION: Various resource allocation and transceiver policies in wireless networks, such as power allocation, beamforming, and caching policies, can be designed by solving optimization problems with constraints, for example those imposed by the maximal transmit power, the cache size, and the minimal data rate requirement [1, 2]. Depending on the application, the objective function, the constraints, and the policy to be optimized may vary on different timescales.
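
The role of the Lagrange multiplier in such a framework can be sketched with a scalar primal-dual iteration. This is only an illustration under stated assumptions, not the paper's algorithm: the toy problem (minimize (x - 3)^2 subject to x <= 1), the finite-difference gradient used in place of a learned approximation of the unknown functions, and the names `numerical_grad` and `primal_dual` are all hypothetical. The objective and constraint are only ever evaluated, never differentiated analytically.

```python
def numerical_grad(func, x, eps=1e-4):
    """Central finite difference: uses function evaluations only."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


def primal_dual(objective, constraint, x0=0.0, lr=0.05, steps=2000):
    """Descend on x and ascend on the multiplier of L(x, lam) = f(x) + lam * g(x).

    The constraint is assumed to be written as g(x) <= 0.
    """
    x, lam = x0, 0.0
    for _ in range(steps):
        grad_x = numerical_grad(objective, x) + lam * numerical_grad(constraint, x)
        violation = constraint(x)
        x -= lr * grad_x                      # primal step: reduce the Lagrangian
        lam = max(0.0, lam + lr * violation)  # dual step, projected onto lam >= 0
    return x, lam


# Hypothetical toy problem: minimize (x - 3)^2 subject to x - 1 <= 0.
x_opt, lam_opt = primal_dual(lambda x: (x - 3.0) ** 2, lambda x: x - 1.0)
print(round(x_opt, 3), round(lam_opt, 3))
```

For this toy problem the iteration settles at the constrained optimum x = 1 with multiplier 4, which satisfies the stationarity condition 2(x - 3) + lam = 0.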


Unsupervised Deep Learning for Ultra-reliable and Low-latency Communications

arXiv.org Machine Learning

In this paper, we study how to use unsupervised deep learning to solve resource allocation problems in ultra-reliable and low-latency communications, which often yield functional optimization problems with quality-of-service (QoS) constraints. We take a joint power and bandwidth allocation problem as an example, which minimizes the total bandwidth required to guarantee the QoS of each user in terms of the delay bound and overall packet loss probability. The global optimal solution is found in a symmetric scenario. A neural network is introduced to find an approximately optimal solution in general scenarios, where the QoS is ensured by using the property that the optimal solution should satisfy as the "supervision signal". Simulation results show that the learning-based solution performs the same as the optimal solution in the symmetric scenario, and can save around 40% of the bandwidth with respect to the state-of-the-art policy.


Learning to Optimize with Unsupervised Learning: Training Deep Neural Networks for URLLC

arXiv.org Machine Learning

Learning the optimized solution as a function of environmental parameters is effective in solving numerical optimization in real time for time-sensitive applications. Existing works on learning to optimize train deep neural networks (DNNs) with labels, and the learnt solutions are inaccurate, so they cannot be employed to ensure a stringent quality of service. In this paper, we propose a framework to learn the latent function with unsupervised deep learning, where the property that the optimal solution should satisfy is used implicitly as the "supervision signal". The framework is applicable to both functional and variable optimization problems with constraints. We take a variable optimization problem in ultra-reliable and low-latency communications as an example, which demonstrates that ultra-high reliability can be supported by the DNN without supervision labels.
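
Learning a solution as a function of an environmental parameter without labels can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's framework: a one-weight linear model x = w * a stands in for the DNN, the unconstrained objective (x - 2a)^2 + x^2 is hypothetical, and the loss is simply the objective itself evaluated at the model's output. No optimal solutions are ever supplied; the training signal is the gradient of the objective.

```python
def train_policy(samples, lr=0.05, epochs=200):
    """Fit x(a) = w * a by descending the objective itself (no labelled optima).

    Hypothetical objective: f(x; a) = (x - 2a)^2 + x^2, whose minimizer is x = a.
    """
    w = 0.0
    for _ in range(epochs):
        for a in samples:
            x = w * a
            grad_x = 2.0 * (x - 2.0 * a) + 2.0 * x  # df/dx at the model output
            w -= lr * grad_x * a                    # chain rule: dx/dw = a
    return w


# Hypothetical environmental parameters seen during training.
weight = train_policy([0.5, 1.0, 1.5, 2.0])
print(round(weight, 6))
```

The learned weight approaches 1, so the model reproduces the minimizer x = a for each parameter value, even though that minimizer was never given as a label.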