Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
Promptable segmentation typically requires instance-specific manual prompts to guide the segmentation of each desired object. To minimize this need, task-generic promptable segmentation has been introduced, which employs a single task-generic prompt to segment various images of different objects belonging to the same task. Current methods use Multimodal Large Language Models (MLLMs) to reason detailed instance-specific prompts from a task-generic prompt, improving segmentation accuracy. The effectiveness of this segmentation heavily depends on the precision of these derived prompts. However, MLLMs often suffer from hallucinations during reasoning, resulting in inaccurate prompting. While existing methods focus on eliminating hallucinations to improve model reliability, we argue that MLLM hallucinations can reveal valuable contextual insights when leveraged correctly, as they represent pre-trained, large-scale knowledge beyond what individual images contain.
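The pipeline described above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: `query_mllm` and `segment` are hypothetical callables standing in for an MLLM client and a promptable segmenter such as SAM, and only the control flow from one task-generic prompt to per-image instance-specific prompts is shown.

```python
# Hedged sketch of task-generic promptable segmentation.
# `query_mllm(image_path, question) -> str` and
# `segment(image_path, instance_prompt) -> mask` are assumed interfaces,
# not APIs from the paper or any specific library.

from typing import Callable


def derive_instance_prompts(
    image_path: str,
    task_prompt: str,
    query_mllm: Callable[[str, str], str],
) -> list[str]:
    """Ask the MLLM to expand one task-generic prompt into
    instance-specific prompts for a single image."""
    question = (
        f"Task: {task_prompt}. "
        "List each matching object in this image as a short noun phrase, "
        "one per line."
    )
    answer = query_mllm(image_path, question)
    # One candidate prompt per line. Hallucinated items may appear here;
    # the paper argues such items can still carry useful contextual
    # knowledge rather than being discarded outright.
    return [line.strip() for line in answer.splitlines() if line.strip()]


def segment_task(
    image_paths: list[str],
    task_prompt: str,
    query_mllm: Callable[[str, str], str],
    segment: Callable[[str, str], object],
) -> dict[str, list[object]]:
    """Segment every image in a task from a single task-generic prompt,
    with no instance-specific manual prompting."""
    masks: dict[str, list[object]] = {}
    for path in image_paths:
        prompts = derive_instance_prompts(path, task_prompt, query_mllm)
        masks[path] = [segment(path, p) for p in prompts]
    return masks
```

The key design point this sketch captures is that the human supplies only `task_prompt` once per task; all per-instance prompting is delegated to the MLLM, which is why the precision of the derived prompts, hallucinations included, dominates segmentation quality.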
Neural Information Processing Systems