Provably Efficient Long-Horizon Exploration in Monte Carlo Tree Search through State Occupancy Regularization
