On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting