Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning