Incorporating BERT into Parallel Sequence Decoding with Adapters

Neural Information Processing Systems 

While large-scale pre-trained language models such as BERT [5] have achieved great success on various natural language understanding tasks, how to efficiently and effectively incorporate them into sequence-to-sequence models and the corresponding text generation tasks remains a non-trivial problem.
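The title points to adapter modules as the integration mechanism. As a rough illustration of the general adapter idea only (not the paper's specific architecture), a bottleneck adapter is a small trainable module inserted into an otherwise frozen pre-trained layer; the hidden and bottleneck sizes below are illustrative assumptions:

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Minimal sketch of a generic bottleneck adapter: a small trainable
    module added on top of a frozen pre-trained sub-layer's output.
    Sizes are illustrative defaults, not values from the paper."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)  # project down
        self.up = nn.Linear(bottleneck_size, hidden_size)    # project back up
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: the adapter starts near-identity, so the
        # frozen model's behavior is preserved before fine-tuning.
        return x + self.up(self.act(self.down(x)))
```

In this setup only the adapter parameters are trained while the pre-trained BERT weights stay frozen, which is what makes adapter-based integration parameter-efficient compared with full fine-tuning.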
