Find Everything: A General Vision Language Model Approach to Multi-Object Search

Open in new window