Calibrated and Robust Foundation Models for Vision-Language and Medical Image Tasks Under Distribution Shift

Khan, Behraj, Syed, Tahir Qasim, Durrani, Nouman M., Naseem, Bilal, Ahmad, Shabir, Qureshi, Rizwan

Jul-22-2025–arXiv.org Artificial Intelligence

Foundation models like CLIP and SAM have advanced computer vision and medical imaging via low-shot transfer learning, aiding CADD with limited data. However, their deployment faces two key challenges. \textit{distribution shift} where pre-training and post-training data distributions differ (e.g., due to inter-center image acquisition) and \textit{confidence misalignment}, which leads to overconfident errors. These issues surface differently, vision-language models (e.g., CLIP) suffer from 2D embedding shift (image-text misalignment), while medical models (e.g., SAM) encounter 3D domain shifts (e.g., scanner variation) and voxel-wise calibration need. Existing solutions are domain-specific. We propose \textbf{StaRFM}, a fusion of Fisher information penalty (FIP) and confidence misalignment penalty (CMP) tackling both challenges. It applies FIP, extended to 3D via patch-wise regularization, to reduce embedding shift, and CMP, reformulated for voxel-level predictions, to calibrate segmentation uncertainty. We derive PAC-Bayes bounds. FIP controls generalization via the Fisher-Rao norm, and CMP reduces calibration error via Brier score minimization. StaRFM surpasses baselines by \texttt{+}3.5\% accuracy and 28\% lower ECE on 19 vision datasets (e.g., ImageNet, Office-Home), achieves +4.2\% DSC over SAM-FT and 4.8mm HD95 on medical benchmarks (e.g., BraTS, ATLAS), and reduces cross-domain gaps by up to 20\%. The framework is plug-and-play, requiring minimal architectural changes. Code and models are available at: \href{https://anonymous.4open.science/r/StaRFM-C0CD/}{\textcolor{blue}{\underline{StaRFM}}}

artificial intelligence, calibration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

Jul-22-2025

arXiv.org PDF

Add feedback

Country:
- Asia
  - China > Guangdong Province
    - Shenzhen (0.04)
  - Pakistan > Sindh
    - Karachi Division > Karachi (0.04)
- Europe
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - Switzerland > Zürich
    - Zürich (0.14)
- North America
  - Canada > Quebec
    - Capitale-Nationale Region
      - Quebec City (0.04)
      - Québec (0.04)
  - United States > North Carolina
    - Watauga County > Boone (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning > Neural Networks
      - Deep Learning (0.46)
    - Vision (1.00)
  - Sensing and Signal Processing > Image Processing (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found