


Unlimiformer: Long-Range Transformers with Unlimited Length Input

Neural Information Processing Systems

Since the proposal of transformers (Vaswani et al., 2017), these models have been limited to bounded input lengths, because of their need to attend to every token in the input. In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, while the returned kNN distances are the attention dot-product scores. This kNN index can be kept on either the GPU or CPU memory and queried in sub-linear time; this way, we can index practically unlimited input sequences, while every attention head in every decoder layer retrieves its top-k keys, instead of attending to every key. We evaluate Unlimiformer on several long-document and book-summarization benchmarks, showing that it can process even 500k token-long inputs from the BookSum dataset, without any input truncation at test time. We demonstrate that Unlimiformer improves pretrained models such as BART (Lewis et al., 2020a) and Longformer (Beltagy et al., 2020) by extending them to unlimited inputs without additional learned weights and without modifying their code. Our code and models are publicly available, and support LLaMA-2 as well.
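The central idea in the abstract, attending to only the top-k keys by dot product rather than to every key, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dot`, `softmax`, `full_attention`, `topk_attention`) are hypothetical, it covers a single query and a single head, and an exhaustive scan stands in for the kNN index, so it shows what is retrieved but not the sub-linear lookup.

```python
import heapq
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(query, keys, values):
    """Standard attention: score the query against every key."""
    weights = softmax([dot(query, key) for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_attention(query, keys, values, k):
    """Attend only to the k keys with the largest dot product.

    Assumption: the exhaustive scan below stands in for a kNN index
    lookup. A real index would return the same kind of
    (score, position) pairs without scoring every key.
    """
    scored = ((dot(query, key), i) for i, key in enumerate(keys))
    top = heapq.nlargest(k, scored)  # k best (score, index) pairs
    weights = softmax([score for score, _ in top])
    dim = len(values[0])
    return [
        sum(w * values[i][d] for w, (_, i) in zip(weights, top))
        for d in range(dim)
    ]


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[10.0, 0.0], [0.0, 1.0], [-10.0, 0.0]]
    values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print("full :", full_attention(query, keys, values))
    print("top-1:", topk_attention(query, keys, values, k=1))
```

When k equals the number of keys, the two functions agree up to floating-point rounding. For smaller k, the softmax is renormalized over the retrieved keys only, so the result approximates full attention to the extent that the omitted keys would have received negligible weight.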



Unlimiformer: Long-Range Transformers with Unlimited Length Input

arXiv.org Artificial Intelligence

Since the proposal of transformers, these models have been limited to bounded input lengths, because of their need to attend to every token in the input. In this work, we propose Unlimiformer: a general approach that wraps any existing pretrained encoder-decoder transformer, and offloads the cross-attention computation to a single k-nearest-neighbor (kNN) index, while the returned kNN distances are the attention dot-product scores. This kNN index can be kept on either the GPU or CPU memory and queried in sub-linear time; this way, we can index practically unlimited input sequences, while every attention head in every decoder layer retrieves its top-k keys, instead of attending to every key. We evaluate Unlimiformer on several long-document and book-summarization benchmarks, showing that it can process even 500k token-long inputs from the BookSum dataset, without any input truncation at test time. We demonstrate that Unlimiformer improves pretrained models such as BART and Longformer by extending them to unlimited inputs without additional learned weights and without modifying their code. We make our code and models publicly available at https://github.com/abertsch72/unlimiformer.


Russian spacebot put through its paces on Earth before being blasted to the ISS

Daily Mail - Science & tech

Meet Fyodor: Russia's spacebot put through its paces on Earth before being blasted to the ISS, where it will carry out spacewalks before helping build a moon base. 'Cyber cosmonaut' Fyodor will be sent to the International Space Station. Putin wants space chiefs to make the first landing on the moon within 15 years. A key task for Fyodor will be to 'assist in construction and use of bases'. Putin's deputy premier Dmitry Rogozin said: 'This thing can work without a space suit, live not only in a crew vehicle, but even outside it.' It can even perform press-ups and drive. Operators will use VR headsets to control its movements from Earth, unless the AI is in control.