 rl technique


MushroomRL: Simplifying Reinforcement Learning Research

arXiv.org Machine Learning

MushroomRL is an open-source Python library developed to simplify the process of implementing and running Reinforcement Learning (RL) experiments. Compared to other available libraries, MushroomRL has been created with the purpose of providing a comprehensive and flexible framework to minimize the effort in implementing and testing novel RL methodologies. Indeed, the architecture of MushroomRL is built in such a way that every component of an RL problem is already provided, so most of the time users can focus solely on implementing their own algorithms and experiments. The result is a library from which RL researchers can significantly benefit in the critical phase of the empirical analysis of their work. The stable MushroomRL code, tutorials, and documentation can be found at https://github.com/MushroomRL/mushroom-rl.


Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar

arXiv.org Machine Learning

Abstract: In this work, we first describe a framework for the application of Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on Commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz spectrum with an uncooperative communications system. We examine policy iteration, which solves an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which utilize a form of Q-Learning to approximate a parameterized function that is used by the radar to select optimal actions. We show that RL techniques are beneficial over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective. The Third Generation Partnership Project (3GPP) has recently received FCC approval to support 5G New Radio (NR) operation in sub-6 GHz frequency bands that are heavily utilized by radar systems [1], [2]. Thus, there is a significant need for radar systems capable of dynamic spectrum sharing.
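
For readers unfamiliar with policy iteration, the following is a minimal textbook sketch on a toy two-state MDP, not the radar formulation from the paper. The transition tensor `P`, reward matrix `R`, and discount factor are made-up values for illustration, and the sketch returns a deterministic greedy policy rather than the stochastic state-to-waveform mapping the abstract describes.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Tabular policy iteration.

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a).
    R: array of shape (S, A), expected reward for taking action a in state s.
    Returns a deterministic policy (one action index per state) and its values.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, states]          # shape (S, S)
        R_pi = R[states, policy]          # shape (S,)
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to Q.
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break                         # policy is stable, so it is optimal
        policy = new_policy
    return policy, V

# Hypothetical toy problem: action 0 stays in place, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch
])
R = np.array([
    [0.0, 1.0],                 # state 0: reward for stay, switch
    [2.0, 0.0],                 # state 1: reward for stay, switch
])

policy, V = policy_iteration(P, R, gamma=0.9)
print(policy, V)
```

In this toy problem the agent learns to move to state 1 and remain there, since staying in state 1 yields the highest recurring reward.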


A deep learning approach to coordinate defensive escort teams

#artificialintelligence

Advancements in robotics and artificial intelligence (AI) are enabling the development of artificial agents designed to assist humans in a variety of everyday settings. One of the many possible uses for these systems could be to escort humans or valuable goods that are being transferred from one location to another, defending them from threats or attacks. Fascinated by this idea, a team of researchers at the University of New Mexico has recently introduced a new end-to-end solution for coordinating robotic escort teams that are protecting high-value payloads or goods. The technique they proposed, presented in a paper pre-published on arXiv, is based on deep reinforcement learning (RL), which entails training algorithms to make effective predictions by analyzing data. "I first came up with the idea behind this study when thinking about lugging my suitcase through a crowded airport," Lydia Tapia, the lead researcher on the study, told TechXplore.


Harnessing Structures for Value-Based Planning and Reinforcement Learning

arXiv.org Machine Learning

Value-based methods constitute a fundamental methodology in planning and deep reinforcement learning (RL). In this paper, we propose to exploit the underlying structures of the state-action value function, i.e., Q function, for both planning and deep RL. In particular, if the underlying system dynamics lead to some global structures of the Q function, one should be capable of inferring the function better by leveraging such structures. Specifically, we investigate the low-rank structure, which widely exists for big data matrices. We verify empirically the existence of low-rank Q functions in the context of control and deep RL tasks (Atari games). As our key contribution, by leveraging Matrix Estimation (ME) techniques, we propose a general framework to exploit the underlying low-rank structure in Q functions, leading to a more efficient planning procedure for classical control, and additionally, a simple scheme that can be applied to any value-based RL technique to consistently achieve better performance on "low-rank" tasks. Extensive experiments on control tasks and Atari games confirm the efficacy of our approach.
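
To illustrate why low-rank structure can help, here is a small sketch under stated assumptions. It uses a plain truncated SVD as a stand-in for the Matrix Estimation techniques named in the abstract, and the Q matrix is synthetic (random rank-3 factors plus Gaussian noise), not data from the paper. All names and sizes are hypothetical.

```python
import numpy as np

def low_rank_approx(Q, rank):
    """Best rank-`rank` approximation of Q in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(Q, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)
n_states, n_actions, true_rank = 50, 10, 3

# Synthetic "true" Q matrix with exact rank 3 (states x actions).
Q_true = rng.normal(size=(n_states, true_rank)) @ rng.normal(size=(true_rank, n_actions))

# Noisy estimate, e.g. what an imperfect value estimator might produce.
Q_noisy = Q_true + 0.1 * rng.normal(size=Q_true.shape)

# Project the noisy estimate back onto rank-3 matrices.
Q_hat = low_rank_approx(Q_noisy, rank=true_rank)

err_noisy = np.linalg.norm(Q_noisy - Q_true)
err_hat = np.linalg.norm(Q_hat - Q_true)
print(f"error before: {err_noisy:.3f}, after low-rank projection: {err_hat:.3f}")
```

When the true Q matrix really is low-rank, the projection discards the part of the noise that lies outside the dominant subspace, which is the intuition behind exploiting this structure.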


RL will disrupt OR

#artificialintelligence

Operations research (OR) is in the initial stages of a revolution driven by reinforcement learning (RL). When I was at eHarmony years ago, we used classical OR techniques to drive the matchmaking process. Machine learning played a critical role, but was limited to specifying parameters to the OR solver. In essence, machine learning was used to estimate the value function, and the OR solver then produced a policy. OR has historically focused on highly tractable specializations of convex optimization.


A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system. However, a complete cloud resource allocation framework exhibits high dimensions in state and action spaces, which limit the usefulness of traditional RL techniques. In addition, high power consumption has become one of the critical concerns in the design and control of cloud computing systems, as it degrades system reliability and increases cooling cost. An effective dynamic power management (DPM) policy should minimize power consumption while maintaining performance degradation within an acceptable level. Thus, a joint virtual machine (VM) resource allocation and power management framework is critical to the overall cloud computing system. Moreover, a novel solution framework is necessary to address the even higher dimensions in state and action spaces. In this paper, we propose a novel hierarchical framework for solving the overall resource allocation and power management problem in cloud computing systems. The proposed hierarchical framework comprises a global tier for VM resource allocation to the servers and a local tier for distributed power management of local servers. The emerging deep reinforcement learning (DRL) technique, which can deal with complicated control problems with large state space, is adopted to solve the global tier problem. Furthermore, an autoencoder and a novel weight sharing structure are adopted to handle the high-dimensional state space and accelerate the convergence speed. On the other hand, the local tier of distributed server power management comprises an LSTM-based workload predictor and a model-free RL-based power manager, operating in a distributed manner.


Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

arXiv.org Artificial Intelligence

The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and feasibility in real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.


Learning Reinforcement Learning (with Code, Exercises and Solutions)

#artificialintelligence

Skip all the talk and go directly to the GitHub Repo with code and exercises. Reinforcement Learning is one of the fields I'm most excited about. Over the past few years, amazing results like learning to play Atari Games from raw pixels and Mastering the Game of Go have gotten a lot of attention, but RL is also widely used in Robotics, Image Processing and Natural Language Processing. Combining Reinforcement Learning and Deep Learning techniques works extremely well. Both fields heavily influence each other.