How to Configure Good In-Context Sequence for Visual Question Answering