THiNK: Can Large Language Models Think-aloud?