Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering