Choosing Linguistics over Vision to Describe Images