Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
