SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency