r/MachineLearning - [D] Reinforcement learning measuring ground truth

@machinelearnbot 

How hard it is to analyze the performance of a "ground truth" agent depends on the task: whether the environment is stochastic or deterministic, how obvious the reward function is (as simple as distance traveled, or harder to reason about, like the score in Tetris), and so on. A "ground truth" agent implies perfect performance, which is extremely difficult to obtain for all but the simplest environments. If you are mainly interested in modeling an agent's behavior, then the performance of the "ground truth" agent may not matter. But if it does matter (say it is part of an evolutionary process or something similar), then you could record your own actions at the task, if that's doable, and use them as a reference, or compare the agent's score to that of some baseline. Can you share more details about your project, like the environment you're using, what exactly you're trying to get out of it, and what the project is for? That will help you get a better answer.
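In case it helps, here is a rough sketch of the "compare against a baseline" idea. Everything in it is made up for illustration: the toy task (`run_episode`), the policies, and the `normalized_score` helper are hypothetical, not from any library. The idea is just to average returns over many episodes (which matters when the environment is stochastic) and then place your agent between a random baseline (0) and whatever reference you have (1), such as your own recorded play.

```python
import random
from statistics import mean

N_STEPS = 20  # episode length of the toy task (assumption)

def run_episode(policy, rng, n_steps=N_STEPS):
    """Hypothetical toy task: each step shows a random bit, reward 1 if the action matches it."""
    total = 0
    for _ in range(n_steps):
        obs = rng.randint(0, 1)          # stochastic observation
        action = policy(obs, rng)
        total += 1 if action == obs else 0
    return total

def evaluate(policy, n_episodes=200, seed=0):
    """Average return over many episodes, with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return mean(run_episode(policy, rng) for _ in range(n_episodes))

def random_policy(obs, rng):
    """Baseline: ignores the observation entirely."""
    return rng.randint(0, 1)

def reference_policy(obs, rng):
    """Stand-in for a reference (e.g. recorded human play); here it happens to be perfect."""
    return obs

def noisy_agent(obs, rng):
    """Stand-in for the agent under test: copies the observation about 80% of the time."""
    return obs if rng.random() < 0.8 else 1 - obs

def normalized_score(agent_return, baseline_return, reference_return):
    """0.0 means baseline-level, 1.0 means reference-level performance."""
    span = reference_return - baseline_return
    if span == 0:
        raise ValueError("reference and baseline returns are identical")
    return (agent_return - baseline_return) / span

if __name__ == "__main__":
    baseline = evaluate(random_policy)
    reference = evaluate(reference_policy)
    agent = evaluate(noisy_agent)
    print("normalized score:", normalized_score(agent, baseline, reference))
```

In a real project the reference would rarely be perfect, so a normalized score above 1.0 is possible and only means the agent beat that particular reference, not that it is optimal.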