RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection
Neural Information Processing Systems
To address this gap, we propose Relational Language-Image Pre-training (RLIP), a strategy for contrastive pre-training that leverages both entity and relation descriptions.