Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric
