MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
