AITopics | video summarization

video summarization

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

7f880e3a325b06e3601af1384a653038-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-15-2026, 12:42:43 GMT

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > South Korea > Seoul > Seoul (0.05)

Industry:

Law (0.93)
Information Technology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Detection and Summarization

Neural Information Processing Systems | Feb-15-2026, 12:42:39 GMT

Video highlight detection is the task of automatically selecting the most engaging moments from a long video.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)

Sequence-to-Segment Networks for Segment Detection

Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras

Neural Information Processing Systems | Feb-12-2026, 21:38:28 GMT

Neural Information Processing Systems http://nips.cc/

conference, proceedings, video, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)

Deep Supervised Summarization: Algorithm and Application to Learning Instructions

Chengguang Xu, Ehsan Elhamifar

Neural Information Processing Systems | Feb-11-2026, 22:43:00 GMT

Neural Information Processing Systems http://nips.cc/

computer vision, ground-truth representative, summarization, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Vision (0.69)
(2 more...)

7503cfacd12053d309b6bed5c89de212-Paper.pdf

Neural Information Processing Systems | Feb-9-2026, 09:34:21 GMT

summarization, video, video summarization, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)

Sequence-to-Segment Networks for Segment Detection

Zijun Wei, Boyu Wang, Minh Hoai Nguyen, Jianming Zhang, Zhe Lin, Xiaohui Shen, Radomir Mech, Dimitris Samaras

Neural Information Processing Systems | Nov-20-2025, 16:38:59 GMT

It then employs a novel decoding architecture, called Segment Detection Unit (SDU), that integrates the decoder state and encoder hidden states to detect segments sequentially.

computer vision, proceedings, video, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.96)

Context-Aware Pseudo-Label Scoring for Zero-Shot Video Summarization

Wu, Yuanli, Zhang, Long, Du, Yue, Li, Bin

arXiv.org Artificial Intelligence | Oct-23-2025

We propose a rubric-guided, pseudo-labeled, and prompt-driven zero-shot video summarization framework that bridges large language models with structured semantic reasoning. A small subset of human annotations is converted into high-confidence pseudo labels and organized into dataset-adaptive rubrics defining clear evaluation dimensions such as thematic relevance, action detail, and narrative progression. During inference, boundary scenes, including the opening and closing segments, are scored independently based on their own descriptions, while intermediate scenes incorporate concise summaries of adjacent segments to assess narrative continuity and redundancy. This design enables the language model to balance local salience with global coherence without any parameter tuning. Across three benchmarks, the proposed method achieves stable and competitive results, with F1 scores of 57.58 on SumMe, 63.05 on TVSum, and 53.79 on QFVS, surpassing zero-shot baselines by +0.85, +0.84, and +0.37, respectively. These outcomes demonstrate that rubric-guided pseudo labeling combined with contextual prompting effectively stabilizes LLM-based scoring and establishes a general, interpretable, and training-free paradigm for both generic and query-focused video summarization.
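
The context rule described in this abstract (boundary scenes scored on their own descriptions, intermediate scenes also given summaries of their neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `build_scoring_prompts`, the prompt wording, and the choice to treat only the first and last scene as boundary scenes are all assumptions made for illustration, and no language model is actually called.

```python
from typing import List


def build_scoring_prompts(
    descriptions: List[str], summaries: List[str], rubric: str
) -> List[str]:
    """Build one scoring prompt per scene (hypothetical prompt layout).

    Assumption: only the first and last scenes count as boundary scenes.
    Boundary scenes see their own description only; every other scene
    also sees short summaries of the previous and next scenes.
    """
    if len(descriptions) != len(summaries):
        raise ValueError("descriptions and summaries must have the same length")

    last = len(descriptions) - 1
    prompts = []
    for i, description in enumerate(descriptions):
        parts = [f"Rubric: {rubric}", f"Scene: {description}"]
        if 0 < i < last:
            # Intermediate scene: add neighbor context so continuity
            # and redundancy can be judged.
            parts.append(f"Previous scene summary: {summaries[i - 1]}")
            parts.append(f"Next scene summary: {summaries[i + 1]}")
        prompts.append("\n".join(parts))
    return prompts
```

Each returned string would then be sent to whatever language model is in use; how the rubric is derived from pseudo labels is outside the scope of this sketch.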

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.17501

Country: Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (0.67)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SummDiff: Generative Modeling of Video Summarization with Diffusion

Kim, Kwanseok, Hahm, Jaehoon, Kim, Sumin, Sul, Jinhwan, Kim, Byunghak, Lee, Joonseok

arXiv.org Artificial Intelligence | Oct-10-2025

Video summarization is the task of shortening a video by choosing a subset of frames while preserving its essential moments. Although the task is innately subjective, previous works have deterministically regressed to a frame score averaged over multiple raters, ignoring disagreement over what constitutes a "good" summary. We propose a novel problem formulation that frames video summarization as a conditional generation task, allowing a model to learn the distribution of good summaries and to generate multiple plausible summaries that better reflect varying human perspectives. Adopting diffusion models for the first time in video summarization, our proposed method, SummDiff, dynamically adapts to visual contexts and generates multiple candidate summaries conditioned on the input video. Extensive experiments demonstrate that SummDiff not only achieves state-of-the-art performance on various benchmarks but also produces summaries that closely align with individual annotator preferences. Moreover, we provide deeper insight with novel metrics from an analysis of the knapsack, which is an important last step of generating summaries but has been overlooked in evaluation.
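
The knapsack step mentioned at the end of this abstract can be sketched as a standard 0/1 knapsack dynamic program: given a score and a length for each shot, pick the subset with the highest total score whose total length fits a budget. The sketch below is a generic textbook version, not SummDiff's code; the function name `select_shots`, the integer shot lengths, and the example numbers are assumptions for illustration.

```python
from typing import List


def select_shots(scores: List[float], lengths: List[int], budget: int) -> List[int]:
    """Return indices of the shots chosen by a 0/1 knapsack.

    Assumptions: lengths are non-negative integers (e.g. frame counts)
    and budget is the maximum total length allowed in the summary.
    """
    n = len(scores)
    # best[i][c] = highest total score using the first i shots within capacity c
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for c in range(budget + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip shot i-1
            if length <= c and best[i - 1][c - length] + score > best[i][c]:
                best[i][c] = best[i - 1][c - length] + score  # option 2: take it

    # Walk the table backwards to recover which shots were taken.
    chosen = []
    c = budget
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= lengths[i - 1]
    return sorted(chosen)


# Three shots with lengths 4, 3, and 2 frames and a budget of 6 frames.
print(select_shots([0.9, 0.5, 0.8], [4, 3, 2], 6))  # prints [0, 2]
```

Because the selection depends on shot lengths as well as scores, two models with similar frame scores can yield different final summaries, which is one reason this step matters for evaluation.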

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.08458

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Government (0.46)

Technology: