FTP: A Fine-grained Token-wise Pruner for Large Language Models via Token Routing
