Learning the meanings of function words from grounded language using a visual question answering model