Speech Recognition Model Improves Text-to-Speech Synthesis using Fine-Grained Reward