Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection