Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning