FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
