Hierarchical Transformers for Long Document Classification