Re-Thinking the Automatic Evaluation of Image-Text Alignment in Text-to-Image Models