Preference-based Reinforcement Learning with Finite-Time Guarantees