Reinforcement Learning for Everybody
Like many other machine learning (or, more generally, AI) problems, RL can be intimidating if one starts directly from the full problem and its formal mathematical definitions. So let us start by loosely defining RL as a collection of both problems and solution methods: there are RL problems, and there are RL methods for solving that class of problems. More concretely, when we work on a reinforcement learning problem, we are trying to map specific situations to an action or a set of actions. Each of those actions has a consequence, or "reward", which can be positive, neutral, or negative; in fact, it can simply be a real number.

For example, let's say that we have a pet monkey called Marcel who has a set of toys that he loves to play with, and that we want to teach Marcel to pee in the toilet rather than on the floor. To incentivize Marcel to choose the right action, we'll give him a new toy every time he pees in the toilet (+1 toy) and remove a toy from his collection (-1 toy) every time he pees on the floor. In this case, hopefully, Marcel (we can call him the "agent") will learn to select an "action" (pee on the floor vs. pee in the toilet) whenever he finds himself in a given situation or "state" (when he feels the need to pee) so as to maximize the number of toys, namely the rewards, by choosing the right action in that state.

Now, I want to emphasize that while this example does a decent job of describing the general idea of a reinforcement learning problem, many elements are still missing from a full description of the RL problem.
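To make the agent, state, action, and reward vocabulary a bit more tangible, here is a minimal sketch of the Marcel example in Python. Everything in it is an assumption made for illustration and not part of the original description: the names `ACTIONS`, `REWARDS`, and `train`, the single implicit state ("needs to pee"), and the learning rule itself (an epsilon-greedy choice with a simple incremental value update, which is just one of many possible RL methods).

```python
import random

# The two actions and their rewards, taken from the example (+1 toy / -1 toy).
ACTIONS = ["toilet", "floor"]
REWARDS = {"toilet": 1.0, "floor": -1.0}


def train(episodes=200, epsilon=0.1, step_size=0.1, seed=0):
    """Hypothetical learner: estimate how good each action is in the one state."""
    rng = random.Random(seed)
    # One value estimate per action for the single state "needs to pee".
    values = {action: 0.0 for action in ACTIONS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.choice(ACTIONS)
        else:
            # Exploit: pick the action with the highest estimate (ties at random).
            best = max(values.values())
            action = rng.choice([a for a in ACTIONS if values[a] == best])
        reward = REWARDS[action]
        # Nudge the estimate for the chosen action toward the observed reward.
        values[action] += step_size * (reward - values[action])
    return values


values = train()
best_action = max(values, key=values.get)
print(values, best_action)
```

In this toy setting the estimate for "toilet" can only move toward +1 and the estimate for "floor" can only move toward -1, so the learned preference ends up being "toilet". Real RL problems involve many states, delayed rewards, and uncertainty, which is part of what this simple picture leaves out.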
Dec-6-2021, 10:05:07 GMT