Memory Augmented Language Models through Mixture of Word Experts