Efficient In-Memory Acceleration of Sparse Block Diagonal LLMs
