Video Question Answering Using CLIP-Guided Visual-Text Attention