Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation