Gzip versus bag-of-words for text classification