Relation-aware Hierarchical Attention Framework for Video Question Answering