SMYRF - Efficient Attention using Asymmetric Clustering
Neural Information Processing Systems
We propose a novel type of balanced clustering algorithm to approximate attention. Attention complexity is reduced from O(N^2) to O(N log N), where N is the sequence length. Our algorithm, SMYRF, uses Locality Sensitive Hashing (LSH) in a novel way by defining new asymmetric transformations and an adaptive scheme that produces balanced clusters. The biggest advantage of SMYRF is that it can be used as a drop-in replacement for dense attention layers without any retraining. In contrast, prior fast-attention methods impose constraints (e.g., shared query-key representations) and require training from scratch.
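The core idea, clustered attention via LSH, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the authors' implementation: it hashes queries and keys with random hyperplanes, sorts by hash code so that similar vectors land near each other, and forms fixed-size (balanced) clusters that attend only within themselves, reducing the work from O(N^2) to O(N * cluster_size).

```python
import numpy as np

def lsh_clustered_attention(Q, K, V, n_hashes=4, cluster_size=16, seed=0):
    """Hedged sketch of LSH-clustered attention (illustrative only).

    Q, K, V: arrays of shape (N, d). Queries and keys are hashed with
    random hyperplanes; sorting by hash code groups likely-similar
    vectors, and fixed-size chunks of the sorted order give balanced
    clusters, so every query attends to exactly `cluster_size` keys.
    """
    rng = np.random.default_rng(seed)
    N, d = Q.shape
    # Random hyperplane LSH: sign pattern -> integer bucket code.
    planes = rng.normal(size=(d, n_hashes))

    def codes(X):
        bits = ((X @ planes) > 0).astype(int)
        return bits @ (1 << np.arange(n_hashes))

    # Sort queries and keys by bucket code; nearby rows are likely similar.
    q_order = np.argsort(codes(Q), kind="stable")
    k_order = np.argsort(codes(K), kind="stable")

    out = np.zeros_like(V)
    # Balanced clusters: fixed-size contiguous chunks of the sorted order.
    for start in range(0, N, cluster_size):
        qi = q_order[start:start + cluster_size]
        ki = k_order[start:start + cluster_size]
        scores = Q[qi] @ K[ki].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)  # softmax within the cluster
        out[qi] = w @ V[ki]
    return out
```

Note that the paper's asymmetric transformations (which let queries and keys with different norms share hash buckets) and its adaptive clustering scheme are omitted here; this sketch only conveys the sort-then-chunk balanced-clustering structure.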