Patching open-vocabulary models by interpolating weights

Jan-18-2025, 17:39:46 GMT–Neural Information Processing Systems

Open-vocabulary models like CLIP achieve high accuracy across many image classification tasks. However, there are still settings where their zero-shot performance is far from optimal. We study model patching, where the goal is to improve accuracy on specific tasks without degrading accuracy on tasks where performance is already adequate. Towards this goal, we introduce PAINT, a patching method that uses interpolations between the weights of a model before fine-tuning and the weights after fine-tuning on a task to be patched. On nine tasks where zero-shot CLIP performs poorly, PAINT increases accuracy by 15 to 60 percentage points while preserving accuracy on ImageNet within one percentage point of the zero-shot model.

accuracy, increase accuracy, patching open-vocabulary model, (1 more...)

Neural Information Processing Systems

Jan-18-2025, 17:39:46 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (0.77)
  - Machine Learning (0.42)