Dual-Stage Value-Guided Inference with Margin-Based Reward Adjustment for Fast and Faithful VLM Captioning
