DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
