LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation
