ConvBERT: Improving BERT with Span-based Dynamic Convolution

Neural Information Processing Systems 

Specifically, we propose to integrate convolution into self-attention to form a mixed attention mechanism that combines the advantages of the two operations.
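
As a rough illustration of what "integrating convolution into self-attention" could look like, the sketch below runs a single self-attention branch and a convolution branch side by side on the same input and concatenates their outputs. It is a simplification under stated assumptions, not the paper's implementation: the convolution branch uses a fixed depthwise kernel instead of the span-based dynamic kernel named in the title, there is only one head per branch, and all function names, weight names, and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # Single-head scaled dot-product attention: every token attends to all tokens.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def depthwise_conv(x, kernel):
    # x: (seq_len, d), kernel: (width, d) with odd width; zero "same" padding.
    # Each channel is convolved independently over a local window of tokens.
    # ASSUMPTION: a static kernel stands in for the paper's dynamic one.
    width = kernel.shape[0]
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        out[t] = (xp[t:t + width] * kernel).sum(axis=0)
    return out

def mixed_attention(x, wq, wk, wv, wc, kernel):
    # Global branch (attention) and local branch (convolution), concatenated
    # along the feature axis. Concatenation is an assumption of this sketch.
    attn_out = self_attention(x, wq, wk, wv)
    conv_out = depthwise_conv(x @ wc, kernel)
    return np.concatenate([attn_out, conv_out], axis=-1)

rng = np.random.default_rng(0)
seq_len, d_model, d_branch = 6, 8, 4  # hypothetical sizes
x = rng.normal(size=(seq_len, d_model))
wq, wk, wv, wc = (rng.normal(size=(d_model, d_branch)) for _ in range(4))
kernel = rng.normal(size=(3, d_branch))

out = mixed_attention(x, wq, wk, wv, wc, kernel)
print(out.shape)  # (6, 8)
```

The split reflects the intuition in the sentence above: the attention branch captures dependencies across the whole sequence, while the convolution branch only looks at a small window around each token, so each branch can be kept narrower than a full attention layer.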
