5d2c2cee8ab0b9a36bd1ed7196bd6c4a-Paper.pdf

Neural Information Processing Systems 

We study theregretincurred bytheagent, firstwhen sheknowsherrewardfunction but does not know the distribution of the task duration, and then when she does not knowher reward function, either.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found