DeepQAMVS: Query-Aware Hierarchical Pointer Networks for Multi-Video Summarization

Messaoud, Safa; Lourentzou, Ismini; Boughoula, Assma; Zehni, Mona; Zhao, Zhizhen; Zhai, Chengxiang; Schwing, Alexander G.

May-13-2021, arXiv.org Artificial Intelligence

The recent growth of web video sharing platforms has increased the demand for systems that can efficiently browse, retrieve and summarize video content. Query-aware multi-video summarization is a promising technique that caters to this demand. In this work, we introduce a novel Query-Aware Hierarchical Pointer Network for Multi-Video Summarization, termed DeepQAMVS, that jointly optimizes multiple criteria: (1) conciseness, (2) representativeness of important query-relevant events and (3) chronological soundness. We design a hierarchical attention model that factorizes over three distributions, each collecting evidence from a different modality, followed by a pointer network that selects frames to include in the summary. DeepQAMVS is trained with reinforcement learning, incorporating rewards that capture representativeness, diversity, query-adaptability and temporal coherence. We achieve state-of-the-art results on the MVS1K dataset, with inference time scaling linearly with the number of input video frames.
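The four reward signals named in the abstract can be illustrated with a small sketch. The abstract does not give the actual formulas, so everything below is an assumption: the function names `cosine` and `summary_reward`, the cosine-similarity definitions of each term, and the unweighted sum are hypothetical stand-ins rather than the authors' rewards.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summary_reward(frames, timestamps, query, selected):
    """Score a candidate summary (illustrative definitions, not the paper's).

    frames:     list of per-frame feature vectors
    timestamps: per-frame upload/capture times, same length as frames
    query:      query embedding in the same space as the frame features
    selected:   indices of the chosen frames, in summary order
    """
    if not selected:
        raise ValueError("summary must contain at least one frame")
    chosen = [frames[i] for i in selected]

    # Representativeness: every input frame should be close to some summary frame.
    rep = sum(max(cosine(f, c) for c in chosen) for f in frames) / len(frames)

    # Diversity: mean pairwise dissimilarity among the chosen frames.
    pairs = list(combinations(chosen, 2))
    div = sum(1.0 - cosine(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

    # Query-adaptability: mean similarity of chosen frames to the query.
    qry = sum(cosine(c, query) for c in chosen) / len(chosen)

    # Temporal coherence: fraction of consecutive picks in chronological order.
    steps = list(zip(selected, selected[1:]))
    coh = (
        sum(timestamps[a] <= timestamps[b] for a, b in steps) / len(steps)
        if steps
        else 1.0
    )

    # Assumption: equal weights; a real system would likely tune these.
    total = rep + div + qry + coh
    return {"rep": rep, "div": div, "query": qry, "coherence": coh, "total": total}


if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(summary_reward(frames, [0, 1, 2], [1.0, 0.0], [0, 1]))
```

In a reinforcement-learning setup such as the one the abstract describes, a scalar like `total` would be computed for each sampled summary and used to weight the policy-gradient update of the frame-selection network.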

deep learning, neural network, summarization

Country:
- North America > United States (1.00)

Genre:
- Research Report > Promising Solution (0.34)

Industry:
- Government > Regional Government (0.46)

Technology:
- Information Technology
  - Artificial Intelligence
    - Machine Learning
      - Neural Networks > Deep Learning (0.30)
      - Statistical Learning (0.68)
    - Natural Language (1.00)
    - Representation & Reasoning (1.00)
  - Communications > Social Media (0.95)
