Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
Lin Li
Neural Information Processing Systems
Then, it leverages large language models (LLMs) to generate description-based prompts (or visual cues) for each component.
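As a rough illustration of the step described above, the following sketch builds one prompt per component and collects the LLM's description for each. It is not the paper's implementation: the component names (`subject`, `object`, `spatial`), the prompt wording, and the helper names `build_cue_prompt` and `generate_visual_cues` are all assumptions made for illustration, and `llm` stands in for whatever text-generation function is actually used.

```python
from typing import Callable, Dict

# Assumed component names; the text above only says "each component".
COMPONENTS = ("subject", "object", "spatial")


def build_cue_prompt(predicate: str, component: str) -> str:
    """Build a prompt asking for a description of one component (hypothetical wording)."""
    return (
        f"Describe the visual features of the {component} component "
        f"for the relation '{predicate}'."
    )


def generate_visual_cues(predicate: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Query the supplied LLM callable once per component and return its descriptions."""
    return {c: llm(build_cue_prompt(predicate, c)) for c in COMPONENTS}
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client; a stub such as `lambda prompt: "..."` is enough to exercise it.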
Oct-9-2025, 02:44:05 GMT