Evaluating Large Language Models as Expert Annotators