Is GPT-4 a reliable rater? Evaluating Consistency in GPT-4 Text Ratings