Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives