 Reinforcement Learning


Reinforcement Learning and AI

@machinelearnbot

Summary: Reinforcement Learning is at the core of modern AI, particularly robotics and sequential tasks. Although RL has been around for many years, it has become the third leg of the machine learning stool, and it is increasingly important for data scientists to know when and how to implement it. If you polled a group of data scientists just a few years back about how many machine learning problem types there are, you would almost certainly have gotten a binary response: problem types were clearly divided into supervised and unsupervised. While Reinforcement Learning (RL) has been around since at least the '80s, and before that in the behavioral sciences, its introduction as a major player in machine learning reflects its rising importance in AI. What problems fit this description?


Exploiting generalization in the subspaces for faster model-based learning

arXiv.org Machine Learning

Due to insufficient generalization in the state-space, common methods in Reinforcement Learning (RL) suffer from slow learning, especially in the early learning trials. This paper introduces a model-based method in discrete state-spaces for increasing learning speed in terms of required experience (but not required computational time) by exploiting generalization in the experiences of the subspaces. A subspace is formed by choosing a subset of features in the original state representation (full-space). Generalization and faster learning in a subspace are due to the many-to-one mapping of experiences from the full-space to each state in the subspace. Nevertheless, due to inherent perceptual aliasing in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called Model Based Learning with Subspaces (MoBLeS), calculates confidence intervals of the estimated Q-values in the full-space and in the subspaces. These confidence intervals are used in decision making, such that the agent benefits the most from the possible generalization while avoiding the detriment of perceptual aliasing in the subspaces. Convergence of MoBLeS to the optimal policy is theoretically investigated. Additionally, we show through several experiments that MoBLeS improves the learning speed in the early trials.


[P] Verification of Reinforcement Learning • r/MachineLearning

@machinelearnbot

I am currently taking a course in the verification of cyber-physical systems. When I say that, think formal and probabilistic verification of state machines for safety. It's a graduate course and the professor wants us all to do a large project. Anything that somewhat relates to the course material is fair game. I thought about mixing it together with machine learning.


Insulin Regimen ML-based control for T2DM patients

arXiv.org Machine Learning

We model an individual T2DM patient's blood glucose level (BGL) as a stochastic process with a discrete number of states, governed mainly but not solely by the medication regimen (e.g. insulin injections). BGL states otherwise change according to various physiological triggers, which give the process a stochastic, statistically unknown, yet assumed quasi-stationary nature. To express the incentive for being in a desired healthy BGL, we heuristically define a reward function that returns positive values for desirable BG levels and negative values for undesirable BG levels. The state space consists of a sufficient number of states to allow for the memoryless assumption. This, in turn, allows us to formulate a Markov Decision Process (MDP) with the objective of maximizing the total reward, summed over a long run. The probability law is found by model-based reinforcement learning (RL), and the optimal insulin treatment policy is retrieved from the MDP solution.
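As a rough illustration of retrieving a policy from an MDP solution, the sketch below runs value iteration on a toy three-state model. Everything in it is an assumption made for illustration: the state names, the two actions, the transition probabilities, the +1/-1 reward, and the discount factor are hypothetical and are not taken from the paper, which estimates the probability law from data instead of fixing it by hand. The numbers carry no clinical meaning.

```python
# Toy MDP: all states, actions, probabilities and rewards are hypothetical.
STATES = ["low", "healthy", "high"]
ACTIONS = ["no_dose", "dose"]

# Heuristic reward: positive for the desirable level, negative otherwise.
REWARD = {"low": -1.0, "healthy": 1.0, "high": -1.0}

# P[state][action] maps each reachable next state to its probability.
P = {
    "low": {
        "no_dose": {"low": 0.2, "healthy": 0.8},
        "dose": {"low": 0.9, "healthy": 0.1},
    },
    "healthy": {
        "no_dose": {"healthy": 0.8, "high": 0.2},
        "dose": {"low": 0.4, "healthy": 0.6},
    },
    "high": {
        "no_dose": {"healthy": 0.1, "high": 0.9},
        "dose": {"healthy": 0.7, "high": 0.3},
    },
}


def action_value(values, state, action, gamma):
    """Expected reward of the next state plus its discounted value."""
    return sum(
        prob * (REWARD[nxt] + gamma * values[nxt])
        for nxt, prob in P[state][action].items()
    )


def value_iteration(gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(action_value(values, s, a, gamma) for a in ACTIONS)
            for s in STATES
        }
        converged = max(abs(new_values[s] - values[s]) for s in STATES) < tol
        values = new_values
        if converged:
            break
    # The greedy policy picks the best action under the converged values.
    policy = {
        s: max(ACTIONS, key=lambda a: action_value(values, s, a, gamma))
        for s in STATES
    }
    return values, policy


values, policy = value_iteration()
```

In this toy model the greedy policy doses only in the "high" state. A model-based RL approach would replace the hand-written table `P` with transition frequencies estimated from observed data before solving the MDP.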


Academic, Research Positions in Big Data, Data Mining, Data Science

@machinelearnbot

Samuel Kaski) - One of the core questions in machine learning at the moment is how to interact with humans. We turn this question into a probabilistic modelling problem, and model both the user and the task to drive the interaction. The solutions need combinations of probabilistic modelling, reinforcement learning and approximate Bayesian computation. We are looking for a postdoc who already masters some of these and offer an opportunity to learn the rest and work with us on this exciting bleeding-edge problem. Antti Oulasvirta) - The position offers an exciting opportunity to learn about and work on applications of machine learning methods and computational models of cognition, perception, and behavior in interactive systems.


Can Machine Learning Be Applied To The Problem Of Trading?

International Business Times

A new academic paper, Machine Learning for Trading, is the first conclusive study that shows success from having a machine learning-based trading strategy. The author, Gordon Ritter, Adjunct Professor in the Mathematics in Finance Program, New York University, constructed an artificial system which he knew would admit a profitable strategy, to see if a machine would find it. Newsweek is hosting an AI and Data Science in Capital Markets conference in NYC, Dec. 6-7. Training a machine learning algorithm to behave as a rational risk-averse investor required appropriate reinforcement learning, specifically a mathematical technique called Q-learning (playing some sort of game where you are trying to maximise a reward function that may pay out at several periods in the future). In a simulated-market proof of concept, the machine learning agent found and exploited arbitrage opportunities in the presence of transaction costs.
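For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update rule, not a reconstruction of Ritter's system. The state labels, the three action names, the reward values, and the `alpha` and `gamma` settings are hypothetical placeholders.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new Q(s, a)."""
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical example: unseen state-action pairs default to a value of 0.0.
q = defaultdict(float)
actions = ["buy", "hold", "sell"]
first = q_update(q, "s0", "buy", 1.0, "s1", actions)

# Suppose a later state is already known to be valuable.
q[("s1", "hold")] = 2.0
second = q_update(q, "s0", "buy", 0.0, "s1", actions)
```

The second update shows how reward that arrives in a later period propagates backward: the estimate for buying in `s0` rises even though the immediate reward is zero, because the next state is already valued.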



#artificialintelligence

We are looking for a Machine Learning Researcher with a specialised focus on Reinforcement and Active Learning. The candidate will have a sound understanding of modern machine learning, deep learning, and probabilistic modelling techniques, and expertise in Reinforcement and Active Learning and their applications to real-world problems. You will have the opportunity to contribute to this high-performing team, who seek to apply their knowledge in the high-impact field of improving humans' capability in drug discovery. If this challenge and opportunity excite you, please email your CV and a covering letter to careers@benevolent.ai


If machine learning can be applied to trading, what will it mean for humans?

#artificialintelligence

A new academic paper, Machine Learning for Trading, is the first conclusive study that shows success in having a machine learning-based trading strategy. The author, Gordon Ritter, Adjunct Professor in the Mathematics in Finance Program, New York University, constructed an artificial system which he knew would admit a profitable strategy, to see if a machine would find it. Training a machine-learning algorithm to behave as a rational risk-averse investor required appropriate reinforcement learning, specifically a mathematical technique called Q-learning (playing some sort of game where you are trying to maximise a reward function that may pay out at several periods in the future). In a simulated-market proof of concept, the machine learning agent found and exploited arbitrage opportunities in the presence of transaction costs. Ritter explained: "I was really trying to answer the question, does machine learning have any application to trading at all, or no application; sort of a binary question. Can machine learning be applied to the problem of trading? I reasoned that in a system that I know admits a profitable trading strategy, because I constructed it that way, can the machine find it."


The Effects of Memory Replay in Reinforcement Learning

arXiv.org Machine Learning

Experience replay is a key technique behind many recent advances in deep reinforcement learning. Allowing the agent to learn from earlier memories can speed up learning and break undesirable temporal correlations. Despite its widespread application, very little is understood about the properties of experience replay. How does the amount of memory kept affect learning dynamics? Does it help to prioritize certain experiences? In this paper, we address these questions by formulating a dynamical systems ODE model of Q-learning with experience replay. We derive analytic solutions of the ODE for a simple setting. We show that even in this very simple setting, the amount of memory kept can substantially affect the agent's performance. Both too much and too little memory slow down learning. Moreover, we characterize regimes where prioritized replay harms the agent's learning. We show that our analytic solutions have excellent agreement with experiments. Finally, we propose a simple algorithm for adaptively changing the memory buffer size which achieves consistently good empirical performance.
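To make the "amount of memory kept" concrete, here is a minimal sketch of a fixed-capacity replay buffer with uniform sampling. It is a generic illustration, not the paper's adaptive algorithm; the class name `ReplayBuffer`, its methods, and the example transitions are assumptions made for this example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size memory of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item when it is full.
        self.memory = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw transitions uniformly at random, without replacement."""
        size = min(batch_size, len(self.memory))
        return rng.sample(list(self.memory), size)

    def __len__(self):
        return len(self.memory)


# Hypothetical usage: five transitions pushed into a buffer of capacity 3.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(step, 0, 1.0, step + 1)
batch = buffer.sample(2)
```

Here `capacity` is the memory-size knob discussed in the abstract. Sampling at random instead of replaying transitions in order is what breaks the temporal correlation between consecutive updates.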


Fintech: Can machine learning be applied to trading?

@machinelearnbot

A new academic paper, Machine Learning for Trading, is the first conclusive study that shows success in having a machine learning-based trading strategy. The author, Gordon Ritter, Adjunct Professor in the Mathematics in Finance Program, New York University, constructed an artificial system which he knew would admit a profitable strategy, to see if a machine would find it. Training a machine-learning algorithm to behave as a rational risk-averse investor required appropriate reinforcement learning, specifically a mathematical technique called Q-learning (playing some sort of game where you are trying to maximise a reward function that may pay out at several periods in the future). In a simulated-market proof of concept, the machine learning agent found and exploited arbitrage opportunities in the presence of transaction costs. Ritter explained: "I was really trying to answer the question, does machine learning have any application to trading at all, or no application; sort of a binary question. Can machine learning be applied to the problem of trading? I reasoned that in a system that I know admits a profitable trading strategy, because I constructed it that way, can the machine find it."