Leveraging Large Language Models for NLG Evaluation: A Survey