Active Advantage-Aligned Online Reinforcement Learning with Offline Data