Gradual Forgetting: Logarithmic Compression for Extending Transformer Context Windows
