Neural Fashion Image Captioning: Accounting for Data Diversity