Revisiting Classifier Two-Sample Tests