Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
Fuwen Tan, Paola Cascante-Bonilla, Xiaoxiao Guo, Hui Wu, Song Feng, Vicente Ordonez
Neural Information Processing Systems
Our work explores interactive image retrieval using natural language queries in the context of complex scenes containing multiple objects.
Oct-2-2025, 15:57:09 GMT