Multi-modal reward for visual relationships-based image captioning

Open in new window