Continual Learning for Encoder-only Language Models via a Discrete Key-Value Bottleneck