Transferring Textual Preferences to Vision-Language Understanding through Model Merging

Open in new window