Soft Diffusion Actor-Critic: Efficient Online Reinforcement Learning for Diffusion Policy