Exploring Pretraining via Active Forgetting to Improve Cross-Lingual Transfer in Decoder Language Models