Cross-modal Retrieval for Knowledge-based Visual Question Answering