Data-Informed Global Sparseness in Attention Mechanisms for Deep Neural Networks
