Bounding and Filling: A Fast and Flexible Framework for Image Captioning