Ordered Semantically Diverse Sampling for Textual Data