The Devil is in the EOS: Sequence Training for Detailed Image Captioning