Automatic Generation of Behavioral Test Cases for Natural Language Processing Using Clustering and Prompting