Accelerating Transformers with Spectrum-Preserving Token Merging

Neural Information Processing Systems 

However, these methods have significant drawbacks, such as sensitivity to token-splitting strategies and damage to informative tokens in later layers.
