Maximum Entropy Reinforcement Learning with Diffusion Policy