Intentional Control of Type I Error over Unconscious Data Distortion: a Neyman-Pearson Approach to Text Classification