A Fast Randomized Algorithm for Massive Text Normalization