OrthoRank: Token Selection via Sink Token Orthogonality for Efficient LLM inference
