Beyond Keywords: Evaluating Large Language Model Classification of Nuanced Ableism