Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling
