 Reinforcement Learning


Cellular-Connected UAVs over 5G: Deep Reinforcement Learning for Interference Management

arXiv.org Artificial Intelligence

In this paper, an interference-aware path planning scheme for a network of cellular-connected unmanned aerial vehicles (UAVs) is proposed. In particular, each UAV aims at achieving a tradeoff between maximizing energy efficiency and minimizing both wireless latency and the interference level caused on the ground network along its path. The problem is cast as a dynamic game among UAVs. To solve this game, a deep reinforcement learning algorithm, based on echo state network (ESN) cells, is proposed. The introduced deep ESN architecture is trained to allow each UAV to map each observation of the network state to an action, with the goal of minimizing a sequence of time-dependent utility functions. Each UAV uses ESN to learn its optimal path, transmission power level, and cell association vector at different locations along its path. The proposed algorithm is shown to reach a subgame perfect Nash equilibrium (SPNE) upon convergence. Moreover, upper and lower bounds for the altitude of the UAVs are derived, thus reducing the computational complexity of the proposed algorithm. Simulation results show that the proposed scheme achieves better wireless latency per UAV and rate per ground user (UE) while requiring a number of steps that is comparable to a heuristic baseline that considers moving via the shortest distance towards the corresponding destinations. The results also show that the optimal altitude of the UAVs varies based on the ground network density and the UE data rate requirements and plays a vital role in minimizing the interference level on the ground UEs as well as the wireless transmission delay of the UAV.


Salesforce research

#artificialintelligence

Deep reinforcement learning (deep RL) is a popular and successful family of methods for teaching computers tasks ranging from playing Go and Atari games to controlling industrial robots. But it is difficult to use a single neural network and conventional RL techniques to learn many different skills at once. Existing approaches usually treat the tasks independently or attempt to transfer knowledge between a pair of tasks, but this prevents full exploration of the underlying relationships between different tasks. When humans learn new skills, we take advantage of our existing skills and build new capabilities by composing and combining simpler ones. For instance, learning multi-digit multiplication relies on knowledge of single-digit multiplication, while knowing how to properly prepare individual ingredients facilitates cooking dishes with complex recipes.


Global Convergence of Policy Gradient Methods for Linearized Control Problems

arXiv.org Machine Learning

Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model; 2) they are an "end-to-end" approach, directly optimizing the performance metric of interest; and 3) they inherently allow for richly parameterized policies. A notable drawback is that even in the most basic continuous control problem (that of linear quadratic regulators), these methods must solve a non-convex optimization problem, where little is understood about their efficiency from both computational and statistical perspectives. In contrast, system identification and model-based planning in optimal control theory have a much more solid theoretical footing, where much is known with regard to their computational and statistical properties. This work bridges this gap, showing that (model-free) policy gradient methods globally converge to the optimal solution and are efficient (polynomially so in relevant problem-dependent quantities) with regard to their sample and computational complexities.
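To make the linear quadratic regulator setting concrete, here is a minimal Python sketch of a model-free gradient method on a scalar LQR problem. Everything in it is an illustrative assumption rather than the paper's algorithm: the system x' = a*x + b*u with a = b = 1, the cost weights q = r = 1, the linear policy u = -k*x, the two-point finite-difference gradient estimate, and the names rollout_cost and policy_gradient_lqr. The learner only sees rollout costs, never a, b, q, or r directly.

```python
def rollout_cost(k, a=1.0, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=200):
    """Simulate the (assumed) scalar system under u = -k*x and sum the quadratic cost."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost


def policy_gradient_lqr(k=0.5, step_size=0.1, eps=1e-3, iterations=100):
    """Gradient descent on the gain k using only rollout costs.

    The gradient is estimated with a central finite difference, which is a
    stand-in for the sampled estimators used in practice.
    The initial gain must be stabilizing (|a - b*k| < 1).
    """
    for _ in range(iterations):
        grad = (rollout_cost(k + eps) - rollout_cost(k - eps)) / (2 * eps)
        k -= step_size * grad
    return k


if __name__ == "__main__":
    # For a = b = q = r = 1 the Riccati equation gives k* = (sqrt(5) - 1) / 2.
    print("learned gain:", policy_gradient_lqr())
    print("optimal gain:", (5 ** 0.5 - 1) / 2)
```

In this toy case the cost is non-convex in k over the stabilizing region, yet descent from a stabilizing gain still approaches the Riccati solution, which is the kind of behavior the abstract describes.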


Monte Carlo Prediction

#artificialintelligence

Siraj Raval programs a virtual robot to do some house cleaning using a technique called Monte Carlo Prediction. In typical Siraj fashion he explains what it is, how it works and how to use it for reinforcement learning.
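As a rough illustration of Monte Carlo prediction, the sketch below estimates state values by averaging the returns observed after the first visit to each state. It does not reproduce the video's house-cleaning robot: the five-state random walk, the reward of 1 for exiting on the right, and the function names are all assumptions chosen to keep the example small.

```python
import random
from collections import defaultdict


def generate_episode(rng, n_states=5, start=2):
    """Random walk on states 0..n_states-1 (hypothetical task).

    Stepping off the right end pays 1, stepping off the left end pays 0.
    Returns a list of (state, reward) pairs.
    """
    state, episode = start, []
    while True:
        next_state = state + rng.choice((-1, 1))
        if next_state < 0:
            episode.append((state, 0.0))
            return episode
        if next_state >= n_states:
            episode.append((state, 1.0))
            return episode
        episode.append((state, 0.0))
        state = next_state


def first_visit_mc_prediction(n_episodes=5000, gamma=1.0, seed=0):
    """Estimate V(s) under the random policy by averaging first-visit returns."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    for _ in range(n_episodes):
        episode = generate_episode(rng)
        g = 0.0
        first_visit_return = {}
        # Walk backwards so g is the return from each step onward; the last
        # value written for a state belongs to its earliest (first) visit.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_visit_return[state] = g
        for state, ret in first_visit_return.items():
            returns_sum[state] += ret
            returns_count[state] += 1
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


if __name__ == "__main__":
    for state, value in first_visit_mc_prediction().items():
        print(state, round(value, 3))
```

For this walk the true values are (s + 1) / 6, so the estimates should rise from left to right and sit near 0.5 for the middle state.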


7 Books About Machine Learning, Statistics, and Python

#artificialintelligence

Complex statistics in Machine Learning worry a lot of developers. Knowing statistics helps you build strong Machine Learning models that are optimized for a given problem statement. This book will teach you all it takes to perform complex statistical computations required for Machine Learning. You will gain information on statistics behind supervised learning, unsupervised learning, reinforcement learning, and more. Understand the real-world examples that discuss the statistical side of Machine Learning and familiarize yourself with it.


Introduction to Various Reinforcement Learning Algorithms. Part I (Q-Learning, SARSA, DQN, DDPG)

@machinelearnbot

Typically, an RL setup is composed of two components, an agent and an environment. The environment refers to the object that the agent is acting on (e.g. the game itself in an Atari game), while the agent represents the RL algorithm. The environment starts by sending a state to the agent, which then takes an action in response to that state based on its knowledge. After that, the environment sends the next state and a reward back to the agent. The agent then updates its knowledge with the reward returned by the environment to evaluate its last action.
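The interaction loop described above can be sketched in a few lines of Python. The CorridorEnv class, its reward of 1 at the right end, the reset/step/act/update method names, and the hyperparameters are all hypothetical choices for illustration. The agent uses a tabular Q-learning update, one of the algorithms named in the title, as its "knowledge".

```python
import random


class CorridorEnv:
    """Hypothetical environment: positions 0..4, reaching position 4 pays 1 and ends the episode."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the position
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        return self.position, (1.0 if done else 0.0), done


class QLearningAgent:
    """Tabular agent whose knowledge is a table of action values."""

    def __init__(self, n_states, n_actions=2, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        best = max(self.q[state])
        # break ties at random so an untrained agent still explores
        return self.rng.choice([a for a, v in enumerate(self.q[state]) if v == best])

    def update(self, state, action, reward, next_state, done):
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def run_episodes(n_episodes=500):
    env = CorridorEnv()
    agent = QLearningAgent(env.length)
    for _ in range(n_episodes):
        state = env.reset()                                    # environment sends a state
        done = False
        while not done:
            action = agent.act(state)                          # agent responds with an action
            next_state, reward, done = env.step(action)        # environment returns next state and reward
            agent.update(state, action, reward, next_state, done)  # agent updates its knowledge
            state = next_state
    return agent


if __name__ == "__main__":
    for state, values in enumerate(run_episodes().q):
        print(state, [round(v, 3) for v in values])
```

After training, the value of moving right should exceed the value of moving left in every non-terminal position.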


[D] Introduction to Various Reinforcement Learning Algorithms. Part I (Q-Learning, SARSA, DQN, DDPG) • r/MachineLearning

@machinelearnbot

I should have mentioned that model-based learning allows the agent to plan ahead. For that statement, I am talking about the transition probability T(s', s, a). You are going from the current state s to the next state s' after taking action a, and you have to store all the combinations. I would really appreciate it if you could point out the typo lol.
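To show what storing all the combinations can look like, here is a small Python sketch of a tabular transition model estimated from observed transitions. The TransitionModel class, its record and prob methods, and the dense_table_size helper are hypothetical names, and counting is only one assumed way to obtain T(s', s, a).

```python
from collections import defaultdict


class TransitionModel:
    """Count-based estimate of T(s', s, a) (illustrative, not from the article)."""

    def __init__(self):
        # counts[(s, a)][s_next] = number of times s_next followed (s, a)
        self.counts = defaultdict(lambda: defaultdict(int))

    def record(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1

    def prob(self, s_next, s, a):
        outcomes = self.counts.get((s, a))
        if not outcomes:
            return 0.0  # no data yet for this state-action pair
        return outcomes.get(s_next, 0) / sum(outcomes.values())


def dense_table_size(n_states, n_actions):
    """Entries needed to store every (s', s, a) combination explicitly."""
    return n_states * n_states * n_actions


if __name__ == "__main__":
    model = TransitionModel()
    for s_next in ("s1", "s1", "s1", "s0"):
        model.record("s0", "right", s_next)
    print(model.prob("s1", "s0", "right"))
    print(dense_table_size(n_states=10, n_actions=4))
```

The dense table grows with the square of the number of states, which is why storing every combination becomes expensive for large state spaces.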


Predictions for Artificial Intelligence in 2018 Positive reinforcement Reinf...

#artificialintelligence

Reinforcement learning takes inspiration from the ways that animals learn how certain behaviors tend to result in a positive or negative outcome. Using this approach, a computer can, say, figure out how to navigate a maze by trial and error and then associate the positive outcome (exiting the maze) with the actions that led up to it. This lets a machine learn without instruction or even explicit examples. The idea has been around for decades, but combining it with large (or deep) neural networks provides the power needed to make it work on really complex problems (like the game of Go). Through relentless experimentation, as well as analysis of previous games, AlphaGo figured out for itself how to play the game at an expert level.


Types of Machine Learning Algorithms

#artificialintelligence

Reinforcement learning sits somewhere in between supervised and unsupervised learning. You know part of the truth or output, but not the whole truth. Based on that, you teach a computer algorithm to perform some action. If right, the action is rewarded; if wrong, the action is punished. Based on this reward system, the computer learns whether what it did was right or wrong.


Deep learning to generate revenue for airlines

#artificialintelligence

Deep reinforcement learning (RL) is used to help airlines improve their business. Revenue management (RM) aims to maximize revenue for airlines. RM initially relied on forecasting traffic flows (customer volumes and willingness to pay) and on an optimisation procedure that prioritises among customers by selecting optimal availabilities or prices. However, RM makes many unrealistic assumptions. Deep RL is an area of deep learning focused on learning from feedback in order to optimize its predictions.