Provably Correct Optimization and Exploration with Non-linear Policies
