Self-critical Sequence Training for Image Captioning