58a799d16fb0c1f2014e98f4ba972b25-Paper-Conference.pdf

Neural Information Processing Systems 

RL that utilize function approximation to generalize observational data to unknown states/actions. The goal of this paper is to study the sample complexity of policy-based RL, which is arguably the simplest setting for RL with function approximation (Kearns et al., 1999; Kakade, 2003).
