Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Neural Information Processing Systems 

With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost.
