Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis
