Evaluating Logit-Based GOP Scores for Mispronunciation Detection