Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement