Selective Attention: Enhancing Transformer through Principled Context Control
Neural Information Processing Systems
The attention mechanism within the transformer architecture enables the model to weigh and combine tokens based on their relevance to the query.
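As a rough illustration of what "weigh and combine tokens based on their relevance to the query" means, the sketch below implements standard scaled dot-product attention for a single query in plain Python. It is the generic softmax attention baseline, not the Selective Attention method the title refers to; the function names (`softmax`, `attend`) and the toy vectors are assumptions made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-query scaled dot-product attention.

    Returns (weights, output): one weight per token, and the
    weighted sum of the value vectors under those weights.
    """
    d = len(query)
    # Relevance of each token to the query: scaled dot product.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    # Normalize the scores into weights that sum to 1.
    weights = softmax(scores)
    # Combine the tokens: weighted sum of value vectors.
    dim = len(values[0])
    output = [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(dim)
    ]
    return weights, output


# Hypothetical toy inputs: three tokens with 2-dimensional keys and values.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

weights, output = attend(query, keys, values)
```

In this toy setup the first key is aligned with the query, the second is orthogonal to it, and the third points the opposite way, so the weights decrease from the first token to the third and the output is dominated by the first value vector.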
Feb-8-2026, 08:15:09 GMT