Enhancing the Learning Experience: Using Vision-Language Models to Generate Questions for Educational Videos

Markos Stamatakis, Joshua Berger, Christian Wartena, Ralph Ewerth, Anett Hoppe

arXiv.org Artificial Intelligence, May 6, 2025

Web-based educational videos offer flexible learning opportunities and are becoming increasingly popular. However, improving user engagement and knowledge retention remains a challenge. Automatically generated questions can activate learners and support their knowledge acquisition. Further, they can help teachers and learners assess their understanding. While large language and vision-language models have been employed in various tasks, their application to question generation for educational videos remains underexplored. In this paper, we investigate the capabilities of current vision-language models for generating learning-oriented questions for educational video content. We assess (1) out-of-the-box models' performance; (2) fine-tuning effects on content-specific question generation; (3) the impact of different video modalities on question quality; and (4) in a qualitative study, question relevance, answerability, and difficulty levels of generated questions. Our findings delineate the capabilities of current vision-language models, highlighting the need for fine-tuning and addressing challenges in question diversity and relevance. We identify requirements for future multimodal datasets and outline promising research directions.

large language model, machine learning, natural language

Country:
- North America > United States
  - California (0.28)
- Europe > Germany
  - Lower Saxony (0.28)
- Asia > Middle East
  - UAE (0.28)

Genre:
- Instructional Material (0.94)
- Research Report > New Finding (0.48)

Industry:
- Education
  - Educational Technology > Audio & Video (1.00)
  - Educational Setting (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.46)
