Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos

Biswas, Dipayan, Shah, Shishir, Subhlok, Jaspal

Jun-18-2025–arXiv.org Artificial Intelligence

We introduce the Lecture Video Visual Objects (LVVO) dataset, a new benchmark for visual object detection in educational video content. The dataset consists of 4,000 frames extracted from 245 lecture videos spanning biology, computer science, and geosciences. A subset of 1,000 frames, referred to as LVVO_1k, has been manually annotated with bounding boxes for four visual categories: Table, Chart-Graph, Photographic-image, and Visual-illustration. Each frame was labeled independently by two annotators, resulting in an inter-annotator F1 score of 83.41%, indicating strong agreement. To ensure high-quality consensus annotations, a third expert reviewed and resolved all cases of disagreement through a conflict resolution process. To expand the dataset, a semi-supervised approach was employed to automatically annotate the remaining 3,000 frames, forming LVVO_3k. The complete dataset offers a valuable resource for developing and evaluating both supervised and semi-supervised methods for visual content detection in educational videos. The LVVO dataset is publicly available to support further research in this domain.

annotation, artificial intelligence, dataset, (13 more...)

arXiv.org Artificial Intelligence

Jun-18-2025

arXiv.org PDF

Add feedback

Genre:
- Instructional Material (0.47)
- Research Report (0.40)

Industry:
- Education > Educational Technology > Audio & Video (0.82)

Technology:
- Information Technology > Artificial Intelligence > Vision (0.62)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found