Reviews: Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
–Neural Information Processing Systems
This paper describes an improved training procedure for visual dialogue models. Rather than maximizing the likelihood of a collection of training captions, this approach first trains a discriminator model to rank captions in a given context by embedding them in a common space, then uses scores from this discriminator as an extra component in the loss function for a generative sequence prediction model. This improved training procedure produces modest improvements on an established visual dialogue benchmark over both previous generative approaches and adversarial training. I think this is a pretty good paper, though there are a few places in which the presentation could be improved.

SPECIFIC COMMENTS

The introduction claims that the discriminator "has access to more information than the generator".
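
For readers who want a concrete picture of the procedure summarized in this review, here is a minimal sketch, not the paper's implementation. It assumes the discriminator scores a caption by the dot product between a context embedding and a caption embedding, and that the generator's loss is its negative log-likelihood minus a weighted discriminator score. All function names, the dot-product scoring, and the `weight` parameter are hypothetical choices made for illustration; the review does not specify them.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors (assumed similarity measure)."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidates(context_vec, candidate_vecs):
    """Score each candidate against the context in the common embedding space.

    Returns (order, scores), where order lists candidate indices from the
    highest score to the lowest.
    """
    scores = [dot(context_vec, c) for c in candidate_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


def negative_log_likelihood(token_probs):
    """Standard maximum-likelihood loss over the reference tokens."""
    return -sum(math.log(p) for p in token_probs)


def combined_loss(token_probs, context_vec, generated_vec, weight=0.5):
    """Likelihood loss plus a discriminator-derived term (hypothetical form).

    A higher discriminator score for the generated caption lowers the loss,
    so the generator is nudged toward captions the discriminator ranks well.
    """
    nll = negative_log_likelihood(token_probs)
    score = dot(context_vec, generated_vec)
    return nll - weight * score


if __name__ == "__main__":
    context = [1.0, 0.0]
    candidates = [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]]
    order, scores = rank_candidates(context, candidates)
    print("ranking:", order)
    print("loss:", combined_loss([0.5, 0.5], context, candidates[order[0]]))
```

In this toy form, setting `weight` to zero recovers plain maximum-likelihood training, which makes it easy to see the discriminator term as an add-on to the usual objective.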
Oct-7-2024, 13:08:16 GMT