Long-range Language Modeling with Self-retrieval