MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers

Neural Information Processing Systems 

The model size and corresponding computational complexity are constantly scaled up in pursuit of higher performance.
