Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation
Promptable segmentation typically requires instance-specific manual prompts to guide the segmentation of each desired object. To minimize this need, task-generic promptable segmentation has been introduced, which employs a single task-generic prompt to segment various images of different objects belonging to the same task. Current methods use Multimodal Large Language Models (MLLMs) to reason detailed instance-specific prompts from a task-generic prompt, improving segmentation accuracy. The effectiveness of this segmentation heavily depends on the precision of these derived prompts. However, MLLMs often suffer from hallucinations during reasoning, resulting in inaccurate prompting. While existing methods focus on eliminating hallucinations to improve model reliability, we argue that MLLM hallucinations can reveal valuable contextual insights when leveraged correctly, as they represent pre-trained, large-scale knowledge beyond what individual images contain.
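The pipeline described above can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: `query_mllm` and `segment` are hypothetical callables standing in for an MLLM client and a promptable segmenter such as SAM, and only the control flow from one task-generic prompt to per-image instance-specific prompts is shown.

```python
# Hedged sketch of task-generic promptable segmentation.
# `query_mllm(image_path, question) -> str` and
# `segment(image_path, instance_prompt) -> mask` are assumed interfaces,
# not APIs from the paper or any specific library.

from typing import Callable


def derive_instance_prompts(
    image_path: str,
    task_prompt: str,
    query_mllm: Callable[[str, str], str],
) -> list[str]:
    """Ask the MLLM to expand one task-generic prompt into
    instance-specific prompts for a single image."""
    question = (
        f"Task: {task_prompt}. "
        "List each matching object in this image as a short noun phrase, "
        "one per line."
    )
    answer = query_mllm(image_path, question)
    # One candidate prompt per line. Hallucinated items may appear here;
    # the paper argues such items can still carry useful contextual
    # knowledge rather than being discarded outright.
    return [line.strip() for line in answer.splitlines() if line.strip()]


def segment_task(
    image_paths: list[str],
    task_prompt: str,
    query_mllm: Callable[[str, str], str],
    segment: Callable[[str, str], object],
) -> dict[str, list[object]]:
    """Segment every image in a task from a single task-generic prompt,
    with no instance-specific manual prompting."""
    masks: dict[str, list[object]] = {}
    for path in image_paths:
        prompts = derive_instance_prompts(path, task_prompt, query_mllm)
        masks[path] = [segment(path, p) for p in prompts]
    return masks
```

The key design point this sketch captures is that the human supplies only `task_prompt` once per task; all per-instance prompting is delegated to the MLLM, which is why the precision of the derived prompts, hallucinations included, dominates segmentation quality.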
Neural Information Processing Systems