Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
