Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
