Federated Attention: A Distributed Paradigm for Collaborative LLM Inference over Edge Networks
