Adaptively Aligned Image Captioning via Adaptive Attention Time

Lun Huang, Wenmin Wang, Yaxian Xia, Jie Chen

Neural Information Processing Systems 

AAT allows the framework to learn how many attention steps to take to output a caption word at each decoding step. With AAT, an image region can be mapped to an arbitrary number of caption words, while a caption word can also attend to an arbitrary number of image regions. AAT is deterministic and differentiable, and doesn't introduce any noise to the parameter gradients.
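
The abstract does not spell out the mechanism, so the following is only a minimal sketch of how a variable number of attention steps per decoding step might be arranged. It assumes a halting scheme in the spirit of adaptive computation time: each step emits a confidence, the steps are weighted by those confidences, and the loop stops once the leftover weight falls below a threshold. The names `adaptive_attention`, `halt_weights`, `halt_bias`, `epsilon`, and `max_steps`, as well as the toy `tanh` state update, are illustrative assumptions and not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attend(query, regions):
    """One soft-attention step: dot-product scores, softmax, weighted sum."""
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    alphas = softmax(scores)
    dim = len(regions[0])
    return [sum(a * region[d] for a, region in zip(alphas, regions))
            for d in range(dim)]


def adaptive_attention(query, regions, halt_weights, halt_bias=0.0,
                       epsilon=0.01, max_steps=4):
    """Run a variable number of attention steps for one decoding step.

    Returns (output, weights): the confidence-weighted mix of the per-step
    states and the normalized weight given to each step taken.
    All parameters here are hypothetical, chosen for illustration.
    """
    remaining = 1.0          # weight not yet assigned to any step
    weights, states = [], []
    state = list(query)
    for _ in range(max_steps):
        context = attend(state, regions)
        # Toy state update (assumption): fold the attended context into the state.
        state = [math.tanh(s + c) for s, c in zip(state, context)]
        # Halting confidence for this step, a value in (0, 1).
        p = sigmoid(sum(w * s for w, s in zip(halt_weights, state)) + halt_bias)
        weights.append(remaining * p)
        states.append(state)
        remaining *= 1.0 - p
        if remaining < epsilon:   # confident enough, stop attending
            break
    total = sum(weights)
    weights = [w / total for w in weights]   # normalize to sum to 1
    dim = len(query)
    output = [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]
    return output, weights


# Three toy "image regions" with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
output, weights = adaptive_attention([0.2, -0.1], regions, [1.0, 1.0])
print(len(weights), "attention step(s) taken")
```

No sampling is involved in this sketch, so repeated calls with the same inputs give the same result, and the step weights are smooth functions of the confidences. A large positive `halt_bias` makes the loop stop after a single step, while a large negative one pushes it to `max_steps`.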
