Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
