Layer-Wise Coordination between Encoder and Decoder for Neural Machine Translation

Tianyu He, Xu Tan, Yingce Xia, Di He, Tao Qin, Zhibo Chen, Tie-Yan Liu

Neural Information Processing Systems

Neural Machine Translation (NMT) has achieved remarkable progress with the quick evolvement of model structures. In this paper, we propose the concept of layer-wise coordination for NMT, which explicitly coordinates the learning of hidden representations of the encoder and decoder together layer by layer, gradually from low level to high level.


Sub-Linear Memory: How to Make Performers SLiM

Neural Information Processing Systems

Recent works proposed various linear self-attention mechanisms, scaling only as O(L) for serial computation. We conduct a thorough complexity analysis of Performers, a class which includes most recent linear Transformer mechanisms.