LongHeads: Multi-Head Attention is Secretly a Long Context Processor
