Length-Induced Embedding Collapse in Transformer-based Models
