On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code