Visually Grounded Continual Language Learning with Selective Specialization