Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping
