RecurrentMemoryTransformer
Neural Information Processing Systems
Results of experiments show that RMT performs on par with Transformer-XL on language modeling for smaller memory sizes and outperforms it on tasks that require longer sequence processing. We also show that adding memory tokens to Transformer-XL improves its performance.