Zero-shot Visual Relation Detection via Composite Visual Cues from Large Language Models
Supplementary Document
