Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation