LowerBound

Neural Information Processing Systems 

Then, we consider sufficient assumptions under which learning good policies requires polynomial number of episodes.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found