Sub-Linear Memory: How to Make Performers SLiM

Neural Information Processing Systems 

Recent works have proposed various linear self-attention mechanisms, scaling only as O(L) for serial computation. We conduct a thorough complexity analysis of Performers, a class which includes most recent linear Transformer mechanisms.
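The O(L) scaling comes from rewriting attention with a kernel feature map so that keys and values are summarized once instead of forming the L-by-L attention matrix. Below is a minimal non-causal sketch of this idea; the feature map `phi` here is a simplified positive map chosen for illustration, not the random-feature map used in actual Performers.

```python
import numpy as np

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelized attention sketch: O(L) in sequence length.

    Q, K: (L, r) queries/keys; V: (L, d) values.
    phi: a positive feature map (here a toy stand-in for
    Performer-style random features).
    """
    Qp, Kp = phi(Q), phi(K)          # (L, r) feature-mapped queries/keys
    KV = Kp.T @ V                    # (r, d) summary, computed once
    Z = Qp @ Kp.sum(axis=0)          # (L,) per-query normalizer
    return (Qp @ KV) / Z[:, None]    # (L, d) attention output

L, r, d = 8, 4, 5
rng = np.random.default_rng(0)
Q = rng.normal(size=(L, r))
K = rng.normal(size=(L, r))
V = rng.normal(size=(L, d))
out = linear_attention(Q, K, V)
print(out.shape)  # (8, 5)
```

Because `KV` and the normalizer are fixed-size summaries (independent of L once computed), memory and serial time grow linearly in sequence length rather than quadratically.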
