Not All Attention Is Needed: Gated Attention Network for Sequence Data