Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric