Model Merging to Maintain Language-Only Performance in Developmentally Plausible Multimodal Models