An Exploration of Hierarchical Attention Transformers for Efficient Long Document Classification
