MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
