Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
