Information Entropy Invariance: Enhancing Length Extrapolation in Attention Mechanisms
