An Empirical Study on the Language Modal in Visual Question Answering