Accuracy is Not Agreement: Expert-Aligned Evaluation of Crash Narrative Classification Models