UniAttn: Reducing Inference Costs via Softmax Unification for Post-Training LLMs
