Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model

Zihan Zhong, Zhiqiang Tang, Tong He, Haoyang Fang, Chun Yuan

Jan-31-2024, arXiv.org Artificial Intelligence

The Segment Anything Model (SAM) stands as a foundational framework for image segmentation. While it exhibits remarkable zero-shot generalization in typical scenarios, its advantage diminishes when applied to specialized domains like medical imagery and remote sensing. To address this limitation, this paper introduces Conv-LoRA, a simple yet effective parameter-efficient fine-tuning approach. By integrating ultra-lightweight convolutional parameters into Low-Rank Adaptation (LoRA), Conv-LoRA can inject image-related inductive biases into the plain ViT encoder, further reinforcing SAM's local prior assumption. Notably, Conv-LoRA not only preserves SAM's extensive segmentation knowledge but also revives its capacity to learn high-level image semantics, which is constrained by SAM's foreground-background segmentation pretraining. Comprehensive experiments across diverse benchmarks spanning multiple domains underscore Conv-LoRA's superiority in adapting SAM to real-world semantic segmentation tasks.
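To make the idea of adding convolutional parameters to LoRA concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the architecture, so the sketch assumes a single depthwise 3x3 convolution with a residual connection, applied to the rank-r features after reshaping the ViT tokens back onto their patch grid. The class name `ConvLoRALinear`, the helper `depthwise_conv3x3`, and all sizes are hypothetical.

```python
import numpy as np


def depthwise_conv3x3(x, kernels):
    """Zero-padded 'same' depthwise 3x3 convolution (cross-correlation).

    x: (H, W, r) feature map; kernels: (3, 3, r), one 3x3 filter per channel.
    """
    H, W, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + H, dx:dx + W, :] * kernels[dy, dx, :]
    return out


class ConvLoRALinear:
    """Frozen linear layer plus a low-rank update with a conv in the bottleneck.

    Assumed layout (illustrative only): tokens -> A (down-projection) ->
    reshape to patch grid -> residual depthwise conv -> B (up-projection).
    """

    def __init__(self, d_in, d_out, rank, grid_hw, seed=0):
        rng = np.random.default_rng(seed)
        self.W0 = rng.normal(0.0, 0.02, (d_in, d_out))  # pretrained weight, frozen
        self.A = rng.normal(0.0, 0.02, (d_in, rank))    # trainable down-projection
        self.B = np.zeros((rank, d_out))                # trainable, zero-initialized
        self.K = np.zeros((3, 3, rank))                 # trainable conv kernels
        self.grid_hw = grid_hw

    def num_trainable(self):
        return self.A.size + self.B.size + self.K.size

    def __call__(self, tokens):
        H, W = self.grid_hw
        z = tokens @ self.A                         # (H*W, rank)
        z_img = z.reshape(H, W, -1)                 # restore 2D patch layout
        z_img = z_img + depthwise_conv3x3(z_img, self.K)  # local mixing (assumed residual)
        delta = z_img.reshape(H * W, -1) @ self.B   # back to (H*W, d_out)
        return tokens @ self.W0 + delta


# Usage: a 4x4 patch grid of 16-dimensional tokens.
layer = ConvLoRALinear(d_in=16, d_out=16, rank=2, grid_hw=(4, 4))
tokens = np.random.default_rng(1).normal(size=(16, 16))
out = layer(tokens)
```

Because `B` starts at zero, the layer initially reproduces the frozen projection exactly, which is the usual LoRA property of starting fine-tuning from the pretrained behavior. The convolution is the only part that sees spatial neighbors, and since it acts on `rank` channels rather than the full embedding width, it adds very few parameters.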

Country:
- North America
  - United States > Massachusetts (0.04)
  - Canada > Ontario
    - Toronto (0.14)
- Europe
  - Romania > Sud - Muntenia Development Region
    - Giurgiu County > Giurgiu (0.04)
  - Netherlands > North Holland
    - Amsterdam (0.04)
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - France > Grand Est
    - Bas-Rhin > Strasbourg (0.04)
- Asia
  - South Korea > Daejeon
    - Daejeon (0.04)
  - China
    - Guangdong Province > Shenzhen (0.04)
    - Beijing > Beijing (0.04)

Genre:
- Research Report (0.63)

Industry:
- Health & Medicine
  - Diagnostic Medicine > Imaging (0.94)
  - Therapeutic Area > Oncology (0.93)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)